Local Models with Ollama

Run OpenClaw completely offline with local language models. No API costs, maximum privacy.

TL;DR

Quick Setup: Install Ollama → Pull a model → Configure OpenClaw → Start chatting locally!

Best Models: llama3.2:3b (fast), llama3.1:8b (quality), codellama:7b (coding)

Requirements: 8GB+ RAM for 3B models, 16GB+ for 7B+ models
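
The whole flow at a glance. Every command here is drawn from the sections that follow, where each step is covered in detail:

# 1. Install Ollama (Linux script shown; see Installation for macOS/Windows)
curl -fsSL https://ollama.ai/install.sh | sh

# 2. Pull a starter model
ollama pull llama3.2:3b

# 3. Point OpenClaw at Ollama
openclaw configure --section llm

# 4. Start chatting locally
openclaw chat 'Hello! Are you running locally?'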

🔒 Why Local Models?

💰 Zero API Costs

No per-token charges. Run unlimited conversations without worrying about costs.

🔒 Complete Privacy

Your conversations never leave your machine. Perfect for sensitive or personal data.

🚀 Low Latency

No network round-trips or API rate limits. Response speed depends only on your hardware and the model you choose.

📡 Offline Capable

Works without an internet connection. Perfect for travel or unreliable networks.

🛠️ Installation

macOS:

# Download and install the app from https://ollama.ai/download (starts the server automatically),
# or install with Homebrew and start the server:
brew install ollama
brew services start ollama

# Pull recommended models
ollama pull llama3.2:3b
ollama pull codellama:7b
ollama pull mistral:7b

Linux:

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start the Ollama service
sudo systemctl enable ollama
sudo systemctl start ollama

# Pull models
ollama pull llama3.2:3b

Windows:

# Install via winget
winget install Ollama.Ollama

# Or download from https://ollama.ai/download/windows

# Pull models (in Command Prompt/PowerShell)
ollama pull llama3.2:3b
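
Whichever route you took, it's worth confirming the server is up and the model actually downloaded before moving on. A quick sanity check using commands that also appear later in this guide:

# The server should answer with "Ollama is running"
curl http://localhost:11434

# The pulled models should appear in the list
ollama list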
💡 Hardware Requirements

Minimum: 8GB RAM for 3B models | Recommended: 16GB+ RAM for 7B+ models

GPU acceleration is automatic on macOS (Metal) and Linux (CUDA/ROCm).
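
To confirm acceleration is actually kicking in, load a model and check how Ollama placed it. A small sketch; the exact ollama ps output format can vary by Ollama version:

# Load a model without sending a real prompt
ollama run llama3.2:3b ""

# The PROCESSOR column should report something like "100% GPU" when acceleration is active
ollama ps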

🤖 Choosing Models

Different models for different needs. Start with llama3.2:3b for speed, upgrade to llama3.1:8b for quality.

Model        | Size  | Use Case                            | Speed     | Quality
-------------|-------|-------------------------------------|-----------|----------
llama3.2:3b  | 2.0GB | General chat, lightweight tasks     | Fast      | Good
llama3.1:8b  | 4.7GB | Better reasoning, complex tasks     | Medium    | Excellent
codellama:7b | 3.8GB | Code generation and analysis        | Medium    | Excellent
mistral:7b   | 4.1GB | Multilingual, instruction following | Medium    | Very Good
gemma2:2b    | 1.6GB | Ultra-fast responses, basic tasks   | Very Fast | Fair

💡 Pro Tip

Start with llama3.2:3b to test your setup. You can always download more models later with ollama pull MODEL_NAME
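
If you're unsure which model your hardware handles comfortably, a rough side-by-side timing on the same prompt usually settles it. A sketch assuming both models are already pulled; the prompt text is just an example:

# Compare wall-clock time for the same prompt on two models
time ollama run llama3.2:3b "Explain what a local LLM is in one sentence."
time ollama run llama3.1:8b "Explain what a local LLM is in one sentence."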

⚙️ Configure OpenClaw

1. Run the configuration command

openclaw configure --section llm

2. Set the provider and model when prompted

Provider: ollama
Base URL: http://localhost:11434
Model: llama3.2:3b

Use the model name exactly as downloaded (e.g., llama3.2:3b)

3. Test the setup

openclaw chat 'Hello! Are you running locally?'

The reply should come from the local model, with no external API calls.
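
If the chat test fails, you can hit the same Base URL directly to check whether the problem is on the Ollama side rather than in OpenClaw. A minimal sketch against Ollama's /api/generate endpoint:

# Ask Ollama directly, bypassing OpenClaw entirely
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.2:3b",
  "prompt": "Hello! Are you running locally?",
  "stream": false
}'

If this returns a JSON response but openclaw chat does not, recheck the Base URL and Model values from step 2.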

🚀 Performance Tips

Hardware

  • 16GB+ RAM for smooth 7B+ models
  • SSD storage for faster model loading
  • GPU acceleration (automatic on supported systems)
  • Close other memory-intensive apps

Model Management

  • Use ollama list to see installed models
  • Remove unused models with ollama rm MODEL_NAME
  • Models are stored in ~/.ollama/models
  • Pre-load models with ollama run MODEL_NAME ""

Optimization

  • Set OLLAMA_NUM_PARALLEL=2 for concurrent requests
  • Use OLLAMA_MAX_LOADED_MODELS=1 to save RAM
  • Configure OLLAMA_FLASH_ATTENTION=1 for speed
  • Monitor with ollama ps
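
The variables above are read by the Ollama server process, not the client. A sketch of setting them when launching the server by hand; if you run Ollama as a systemd service, add them as Environment= lines via sudo systemctl edit ollama instead:

# Tune the server, then start it (values match the suggestions above)
export OLLAMA_NUM_PARALLEL=2
export OLLAMA_MAX_LOADED_MODELS=1
export OLLAMA_FLASH_ATTENTION=1
ollama serve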

Troubleshooting

  • Check Ollama is running: curl http://localhost:11434
  • Restart the Ollama service if it gets stuck
  • Check logs: journalctl -u ollama on Linux, or ~/.ollama/logs/server.log on macOS
  • Update regularly: re-run the install script on Linux, or let the desktop app update itself on macOS/Windows
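
When the server seems stuck, the checks above combine into a short sequence. This sketch assumes a Linux systemd install; on macOS the logs live under ~/.ollama/logs/ instead:

# Is the server responding? (should print "Ollama is running")
curl http://localhost:11434

# Restart the service if it is wedged
sudo systemctl restart ollama

# Inspect recent server logs
journalctl -u ollama --since "1 hour ago"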
🎥 Watch the Setup Tutorial

Complete video guide showing OpenClaw + Ollama setup from start to finish.

Watch on YouTube →

🎯 What's Next?

You're running completely locally! Now set up channels and automation.