Self-Hosted AI Stack in 2026: Ollama + Open WebUI + n8n Complete Setup Guide
The self-hosted AI movement has gone mainstream. What used to require a GPU cluster and a PhD now runs on a single VPS with Docker Compose for under $10/month. This guide walks through the exact stack that’s become the standard in 2026: Ollama for running local models, Open WebUI for the chat interface, and n8n for AI-powered workflow automation.
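Every tool in this guide is free to self-host. Ollama is open-source, while n8n is source-available under a fair-code license, so check each project's license terms before commercial use. Either way, your data and infrastructure stay under your control: no subscriptions, no data leaving your server, no vendor lock-in.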
Why Self-Host Your AI Stack in 2026?
Three reasons make 2026 the right time to self-host:
Cost. Hosted APIs from OpenAI or Anthropic bill per token. Ollama itself is free, so your only cost is the server: a flat $7/month versus $20-200/month in API fees at moderate usage.
Privacy. Nothing leaves your server. Your conversations, your files, your workflows — all stored locally. For professionals handling sensitive data, this is non-negotiable.
Control. You decide which models to run, when to update, and how to configure everything. No product decisions made by a vendor that affect your workflow.
The Stack: What Each Tool Does
| Tool | Role | Replaces |
|---|---|---|
| Ollama | Local LLM inference server | OpenAI API, Anthropic API |
| Open WebUI | Chat interface with RAG, voice, vision | ChatGPT interface |
| n8n | AI workflow automation | Zapier + AI plugins |
Step 1: Choose Your Hardware
Minimum for a functional stack: 4 GB RAM, 2 vCPU, 20 GB SSD. A $7 VPS (Hetzner CX22, Vultr Basic) handles this for personal use with small models in the 1B-3B range.
Recommended for better model performance: 8 GB RAM, 4 vCPU. This leaves room for 7B-8B models at 4-bit quantization; 13B models are a tight fit in 8 GB, and CPU-only inference gets noticeably slower as model size grows.
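To check a server against the minimums above before installing anything, a small script along these lines can help. It is a sketch that assumes a Linux host with /proc/meminfo, nproc, and df available. The function names are made up for this example, and the thresholds include some slack (3600 MB, 18 GB) because a "4 GB / 20 GB" VPS usually reports slightly less than its nominal size.

```bash
# Return 0 if the given resources meet the guide's minimum (4 GB RAM, 2 vCPU, 20 GB disk).
# Arguments: RAM in MB, CPU count, disk size in GB.
# Thresholds carry slack (assumption): nominal sizes are rarely reported exactly.
meets_minimum() {
  [ "$1" -ge 3600 ] && [ "$2" -ge 2 ] && [ "$3" -ge 18 ]
}

# Read the current host's values (Linux only).
host_ram_mb()  { awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo; }
host_cpus()    { nproc; }
# Column 2 of `df -Pk` is the total size of the root filesystem in 1K blocks.
host_disk_gb() { df -Pk / | awk 'NR == 2 {print int($2 / 1024 / 1024)}'; }
```

Run it as `meets_minimum "$(host_ram_mb)" "$(host_cpus)" "$(host_disk_gb)" && echo "OK" || echo "Below minimum"`.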
Step 2: Install Docker
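```bash
# Install Docker Engine using the official convenience script
curl -fsSL https://get.docker.com | sh
# Allow your user to run docker without sudo
sudo usermod -aG docker $USER
# Confirm that Docker and the Compose plugin are installed
docker --version
docker compose version
```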
Log out and back in for group changes to take effect.
Step 3: Deploy the Complete Stack
Create a docker-compose.yml:
```yaml
# The top-level "version" key is obsolete in Docker Compose v2, so it is omitted.
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      # Published on all host interfaces; restrict this before exposing the server publicly.
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3000:8080"
    environment:
      # Reaches Ollama by service name over the Compose network
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open_webui_data:/app/backend/data
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      - ollama

  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    restart: unless-stopped
    ports:
      - "5678:5678"
    environment:
      - NODE_ENV=production
    volumes:
      - n8n_data:/home/node/.local/share/n8n
    extra_hosts:
      - "host.docker.internal:host-gateway"

# Named volumes keep models, chats, and workflows across container restarts
volumes:
  ollama_data:
  open_webui_data:
  n8n_data:
```
Start the stack:
```bash
docker compose up -d
```
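Instead of waiting a fixed interval for the containers to come up, you can poll the Ollama API until it answers. The helper below is a minimal sketch: `wait_for_url` is a name invented for this example, and it assumes curl is installed on the host.

```bash
# Poll a URL until it answers with a successful HTTP status or the attempts run out.
# Arguments: URL, max attempts (default 30), seconds between attempts (default 2).
wait_for_url() {
  url=$1; attempts=${2:-30}; delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s silences progress output.
    if curl -fs -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `docker compose ps` shows container status, and `wait_for_url http://localhost:11434/api/tags && echo "Ollama is ready"` blocks until the API responds (or gives up after about a minute with the defaults).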
Step 4: Pull Your First Model
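Give the containers about 30 seconds to start, then pull a model. Llama 3.2 ships in 1B and 3B sizes; the 3B variant fits the minimum hardware above:
```bash
docker exec -it ollama ollama pull llama3.2:3b
```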
Other good models to start with:
- mistral:7b: Fast, good quality. The download is roughly 4 GB, so plan for the 8 GB RAM tier
- qwen2.5:7b: Strong coding abilities
- nomic-embed-text: For embeddings and RAG
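A rough way to judge whether a model fits in RAM is to estimate the size of its weights: parameter count times bits per weight, divided by 8. The sketch below applies that rule of thumb. It ignores the KV cache and runtime overhead, and real quantized files mix precisions, so treat the result as a lower bound rather than an exact figure. The function name is made up for this example.

```bash
# Estimate weight size in GB: parameters (in billions) x bits per weight / 8.
# Ignores KV cache and runtime overhead, so the real footprint is higher.
estimate_weights_gb() {
  awk -v params_b="$1" -v bits="$2" 'BEGIN { printf "%.1f\n", params_b * bits / 8 }'
}

estimate_weights_gb 7 4    # 7B model at 4-bit quantization  -> 3.5
estimate_weights_gb 3 4    # 3B model at 4-bit quantization  -> 1.5
estimate_weights_gb 7 16   # 7B model at 16-bit precision    -> 14.0
```

By this estimate a 4-bit 7B model leaves almost no headroom on a 4 GB server once the operating system and the other two containers are counted.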
Step 5: Access Your Services
| Service | URL | Purpose |
|---|---|---|
| Open WebUI | http://your-server:3000 | Chat interface |
| Ollama API | http://your-server:11434 | Direct API access |
| n8n | http://your-server:5678 | Workflow automation |
Load Open WebUI in your browser and create an account on first launch; the first account created becomes the administrator.
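To confirm the Ollama API itself responds, you can send a prompt with curl. The helpers below are a sketch: the function names and the `OLLAMA_URL` variable are introduced for this example, and the simple escaping only handles single-line prompts. The endpoint and the `model`, `prompt`, and `stream` fields are part of Ollama's generate API.

```bash
# Escape backslashes and double quotes so a single-line string is safe inside JSON.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body for Ollama's /api/generate endpoint.
# "stream": false asks for one JSON response instead of a stream of chunks.
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send a prompt to Ollama. OLLAMA_URL is a variable introduced for this sketch.
ask_ollama() {
  curl -fsS "${OLLAMA_URL:-http://localhost:11434}/api/generate" \
    -H 'Content-Type: application/json' \
    -d "$(generate_payload "$1" "$2")"
}
```

Calling `ask_ollama mistral:7b "Why is the sky blue?"` with any model you have pulled returns a JSON object whose `response` field holds the generated text.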
Step 6: Connect n8n to Ollama
In n8n:
1. Go to Credentials → New Credential → Ollama
2. Set Base URL: http://ollama:11434
3. Save and test the connection
4. In a workflow, add an Ollama Chat Model node and select your model (e.g., llama3.2:3b)
You now have a local LLM available as a node in any n8n workflow.
Useful n8n + Ollama Workflow Ideas
Daily digest automation: n8n pulls data from your APIs, sends it to Ollama for summarization, and outputs a formatted daily briefing.
Email triage: Incoming emails go to n8n, Ollama classifies by urgency and drafts replies, you review and send.
Document Q&A: Upload documents to n8n, split them into chunks, have Ollama generate an embedding for each chunk, store the embeddings in a vector database, then query with natural language.
Meeting notes processing: Audio → transcription → Ollama for summarization and action item extraction → save to Notion or your task manager.
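As an illustration of the chunking step in the Document Q&A idea, the sketch below splits text into fixed-size word chunks. Inside n8n you would normally use its built-in text splitter nodes; this shell version only shows the concept, and the function name and the chunk size of 200 words are assumptions for the example.

```bash
# Split text from stdin into chunks of at most N words, one chunk per output line.
chunk_words() {
  awk -v n="$1" '
    {
      for (i = 1; i <= NF; i++) {
        # Start a new chunk with the word itself, otherwise append with a space.
        buf = (count == 0) ? $i : buf " " $i
        count++
        if (count == n) { print buf; buf = ""; count = 0 }
      }
    }
    END { if (count > 0) print buf }   # Emit the final partial chunk
  '
}
```

For example, `chunk_words 200 < notes.txt` prints one chunk per line (notes.txt is a placeholder file name). Each line can then be sent to Ollama's embedding endpoint, `/api/embed` in recent versions, using the nomic-embed-text model.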
Securing Your Stack
This setup is fine for local network use. For production or public access:
Use a reverse proxy (Nginx or Caddy) with HTTPS. Never expose raw HTTP ports.
Enable authentication on Open WebUI (built-in user management) and n8n (setup on first login).
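Firewall rules: Only ports 80/443 (via reverse proxy) should be exposed. Everything else stays internal to Docker. Note that ports published by Docker bypass UFW rules by default, so remove or restrict the ports entries rather than relying on the firewall alone.
Lock down Ollama: The default Ollama install has no authentication. Keep it bound to 127.0.0.1 or your Docker network, not 0.0.0.0. In the Compose file from Step 3, that means changing the Ollama mapping to "127.0.0.1:11434:11434" or removing it; Open WebUI and n8n still reach it at http://ollama:11434 over the Docker network.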
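One way to audit the points above is to list which containers publish ports on all interfaces. The filter below is a sketch that parses the output of docker ps. The function name is invented for this example, and it assumes the usual Ports column format (`0.0.0.0:3000->8080/tcp`, with IPv6 shown as `:::3000` or `[::]:3000` depending on the Docker version).

```bash
# Read `docker ps --format '{{.Names}} {{.Ports}}'` output on stdin and print the
# name of every container that publishes a port on all interfaces.
publicly_bound() {
  awk '/(^|[ ,])(0\.0\.0\.0|\[::\]|::):[0-9]+->/ { print $1 }'
}
```

Run it as `docker ps --format '{{.Names}} {{.Ports}}' | publicly_bound`. With the Compose file from Step 3 unchanged, all three containers are listed; after hardening, ideally only the reverse proxy is.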
Updating the Stack
```bash
# Fetch newer images, recreate changed containers, then remove unused image layers
docker compose pull
docker compose up -d
docker image prune -f
```
Done. Your config and data persist via Docker volumes.
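Because the images track moving tags (latest and main), it can be worth archiving the volumes before an update. The helper below is a sketch of one common approach, using a throwaway Alpine container to tar a named volume; the function name is invented for this example.

```bash
# Archive a named Docker volume into the current directory using a throwaway
# Alpine container. Arguments: volume name, output file name.
backup_volume() {
  docker run --rm \
    -v "$1":/data:ro \
    -v "$(pwd)":/backup \
    alpine tar czf "/backup/$2" -C /data .
}
```

Compose prefixes volume names with the project name (by default the directory name), so check `docker volume ls` for the real names first. With a hypothetical project called mystack, the call would be `backup_volume mystack_n8n_data n8n_data.tgz`.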
The Bottom Line
This stack — Ollama + Open WebUI + n8n — gives you a private AI lab for $7/month. No API costs, no data leaving your server, full control over models and workflows. Start with one workflow that saves you 30 minutes a day. Once it’s running reliably, expand from there.