Your Own Local ChatGPT – With Ollama + Open WebUI 🤖✨

29.05.2025 · Raspberry Pi

Let's face it: AI is awesome.
But running AI locally, on your own server, offline, without sending anything to the cloud?
That’s next-level awesome.

In this post, we're setting up our very own private LLM chat server using Ollama and Open WebUI.
Fully containerized with Docker. One YAML file. Local AI fun. 💻🧠


🤔 Wait, What's Ollama?

Ollama makes it super simple to run large language models (LLMs) locally. Think ChatGPT, but it lives on your own hardware.

Open WebUI gives you a clean browser interface to interact with your local models.

No API keys. No cloud. Just you and your AI.

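Under the hood, Ollama exposes a plain HTTP API on port 11434, and Open WebUI is simply a frontend that talks to it. Once the stack below is running, you can poke that API directly with curl, no API key needed. For example, listing the models you have pulled (run this on the server itself, or swap localhost for your server's IP):

curl http://localhost:11434/api/tags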

🧱 docker-compose.yml

Here's the full docker-compose.yml:

version: "3.8"

services:
  # Ollama: the LLM runtime, listening on port 11434
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama   # downloaded models survive container restarts
    restart: unless-stopped

  # Open WebUI: the browser frontend, reachable on port 3000
  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: openwebui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434   # points at the ollama service above
    depends_on:
      - ollama
    volumes:
      - openwebui_data:/app/backend/data
    restart: unless-stopped

volumes:
  ollama_data:
  openwebui_data:

Spin it up with:

sudo docker compose -p local-ai up -d
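
A quick way to double-check that both containers came up (using the same project name as above):

sudo docker compose -p local-ai ps
sudo docker logs -f openwebui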

🤖 Pulling Your AI Models

Once running, pull some models with:

docker exec -it ollama ollama pull deepseek-r1:latest
docker exec -it ollama ollama pull gemma3:latest
docker exec -it ollama ollama pull phi4:latest

(These can take some time and storage; models can be several GB each.)
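
To see which models are already downloaded (and how much disk they take), list them inside the container:

docker exec -it ollama ollama list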


💬 Accessing the Chat Interface

Once it's all up, open your browser and go to:

http://<your-server-ip>:3000

Choose a model and start chatting, locally and privately.
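
Prefer the terminal, or want to script things? You can also skip the UI and call Ollama's API directly. A minimal sketch, assuming you pulled gemma3 as shown above and are on the server itself:

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:latest",
  "prompt": "Why is the sky blue?",
  "stream": false
}'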

⚠️ Note on Performance (especially on Raspberry Pi)

Yes, you can run this on a Raspberry Pi (especially a Pi 5 with active cooling).
But… it's slow. 🐢
RAM is limited, there's no GPU, and response times can be frustrating.

🔧 Recommended Setup for Smooth LLM Use

If you want a faster, smoother experience, consider running this on:

  • ✅ At least 16 GB RAM
  • ✅ A modern x86 CPU (e.g., Ryzen 5 5600G or Intel i5+)
  • ✅ Optional: GPU support (NVIDIA RTX 3060 or better for even faster inference; see the compose sketch below)
  • ✅ Fast SSD storage (models load faster)
  • ✅ Linux or WSL2 on Windows

Even an old laptop with decent specs will do better than a Pi.
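
If you do go the GPU route from the list above, Ollama can use an NVIDIA card from inside Docker too. A minimal sketch of the extra keys for the ollama service in the compose file above, assuming the NVIDIA Container Toolkit is installed on the host:

services:
  ollama:
    # ...same settings as before, plus:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

Open WebUI itself does not need the GPU, so the rest of the file stays as it is.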


🧠 Why This Rocks

  • πŸ›‘οΈ Privacy-first AI
  • πŸ”Œ No cloud required
  • 🧰 Self-hosted and hackable
  • πŸ–₯️ Clean web UI for everyday use
  • 🚫 No token limits or subscriptions