Your Own Local ChatGPT – With Ollama + Open WebUI 🤖✨

29.05.2025 · Raspberry Pi

Let's face it: AI is awesome.
But running AI locally, on your own server, offline, without sending anything to the cloud?
That’s next-level awesome.

In this post, we're setting up our very own private LLM chat server using Ollama and Open WebUI.
Fully containerized with Docker. One YAML file. Local AI fun. 💻🧠


🤔 Wait, What's Ollama?

Ollama makes it super simple to run large language models (LLMs) locally. Think ChatGPT, but it lives on your own hardware.

Open WebUI gives you a clean browser interface to interact with your local models.

No API keys. No cloud. Just you and your AI.

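Under the hood, Ollama exposes a plain HTTP API on port 11434, and Open WebUI is simply a frontend that talks to it. Once the stack below is running, you can poke that API directly with curl, no API key needed. For example, listing the models you have pulled (run this on the server itself, or swap localhost for your server's IP):

curl http://localhost:11434/api/tags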

🧱 docker-compose.yml

Here's the full docker-compose.yml:

version: "3.8"

services:
  # Ollama: the LLM runtime, listening on port 11434
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama   # downloaded models survive container restarts
    restart: unless-stopped

  # Open WebUI: the browser frontend, reachable on port 3000
  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: openwebui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434   # points at the ollama service above
    depends_on:
      - ollama
    volumes:
      - openwebui_data:/app/backend/data
    restart: unless-stopped

volumes:
  ollama_data:
  openwebui_data:

Spin it up with:

sudo docker compose -p local-ai up -d
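
A quick way to double-check that both containers came up (using the same project name as above):

sudo docker compose -p local-ai ps
sudo docker logs -f openwebui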

🤖 Pulling Your AI Models

Once running, pull some models with:

docker exec -it ollama ollama pull deepseek-r1:latest
docker exec -it ollama ollama pull gemma3:latest
docker exec -it ollama ollama pull phi4:latest

(These can take some time and storage; models can be several GB each.)
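
To see which models are already downloaded (and how much disk they take), list them inside the container:

docker exec -it ollama ollama list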


💬 Accessing the Chat Interface

Once it's all up, open your browser and go to:

http://<your-server-ip>:3000

Choose a model and start chatting, locally and privately.
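
Prefer the terminal, or want to script things? You can also skip the UI and call Ollama's API directly. A minimal sketch, assuming you pulled gemma3 as shown above and are on the server itself:

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:latest",
  "prompt": "Why is the sky blue?",
  "stream": false
}'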

⚠️ Note on Performance (especially on Raspberry Pi)

Yes, you can run this on a Raspberry Pi (especially a Pi 5 with active cooling).
But… it's slow. 🐢
RAM is limited, there's no GPU, and response times can be frustrating.

🔧 Recommended Setup for Smooth LLM Use

If you want a faster, smoother experience, consider running this on:

  • ✅ At least 16 GB RAM
  • ✅ A modern x86 CPU (e.g., Ryzen 5 5600G or Intel i5+)
  • ✅ Optional: GPU support (NVIDIA RTX 3060 or better for even faster inference; see the compose sketch below)
  • ✅ Fast SSD storage (models load faster)
  • ✅ Linux or WSL2 on Windows

Even an old laptop with decent specs will do better than a Pi.
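
If you do go the GPU route from the list above, Ollama can use an NVIDIA card from inside Docker too. A minimal sketch of the extra keys for the ollama service in the compose file above, assuming the NVIDIA Container Toolkit is installed on the host:

services:
  ollama:
    # ...same settings as before, plus:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

Open WebUI itself does not need the GPU, so the rest of the file stays as it is.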


🧠 Why This Rocks

  • πŸ›‘οΈ Privacy-first AI
  • πŸ”Œ No cloud required
  • 🧰 Self-hosted and hackable
  • πŸ–₯️ Clean web UI for everyday use
  • 🚫 No token limits or subscriptions