Ollama

Run large language models locally with a simple CLI

Local & Self-Hosted | OSS
#local-ai #cli #self-hosted #inference #open-source

Getting Started

  1. Install Ollama from ollama.com: download the installer for macOS or Windows, or use the one-line install script on Linux.
  2. Run your first model with a single command: ollama run llama3.2 downloads the model and starts an interactive chat.
  3. Browse available models at ollama.com/library and pull any of them with ollama pull model-name.
  4. Use the OpenAI-compatible API served at localhost:11434 to integrate local models into your applications (see the example after this list).
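
Once the server is running you can hit the OpenAI-compatible chat endpoint directly; a minimal sketch with curl, assuming llama3.2 has already been pulled (swap in any model you have):

    # Chat with a local model through Ollama's OpenAI-compatible endpoint.
    # No API key is needed for the local server.
    curl http://localhost:11434/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}]
      }'

Existing OpenAI client libraries work the same way: point their base URL at http://localhost:11434/v1 and pass any placeholder API key.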

Key Features

  • One-command setup downloads and runs any supported model with a single ollama run command.
  • Extensive model library hosts hundreds of models including Llama, Mistral, Gemma, Phi, Qwen, and DeepSeek.
  • OpenAI-compatible API serves models locally through endpoints that work as a drop-in replacement for the OpenAI API.
  • Quantized model builds run efficiently on consumer GPUs and Apple Silicon, with GPU offloading handled automatically based on available memory.
  • Modelfile customization lets you create custom model configurations with system prompts, parameters, and templates (see the sketch after this list).
  • Cross-platform support runs natively on macOS (Apple Silicon), Linux (NVIDIA/AMD), and Windows.
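
A minimal sketch of Modelfile customization, assuming the llama3.2 base model is already pulled (the model name concise-helper and the system prompt are illustrative). Save the following as a file named Modelfile:

    # Modelfile: layer a system prompt and a sampling parameter on a base model
    FROM llama3.2
    PARAMETER temperature 0.7
    SYSTEM """You are a concise assistant that answers in plain language."""

Then build and run the custom model with ollama create concise-helper -f Modelfile followed by ollama run concise-helper; it appears in ollama list and is served through the same local API as pulled models.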
