Open WebUI Setup - Project Documentation
Date: January 20, 2026
Project: Open WebUI Installation & Configuration
Goal: Local LLM environment with cloud integration
1. Accomplished Steps
1.1 Ollama Installation (Local)
- ✅ Ollama installed on the local system.
- ✅ Successfully connected Open WebUI to Ollama (http://ollama:11434).
- ✅ Local models are fully operational and accessible.
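The connection can also be checked from a terminal. The following is a minimal sketch, assuming `curl` is installed and that the shell runs somewhere the hostname `ollama` resolves (for example inside the same Docker network). `GET /api/tags` is Ollama's model-listing endpoint; the function names and the `OLLAMA_URL` variable are illustrative only.

```bash
#!/bin/sh
# Assumption: OLLAMA_URL defaults to the endpoint documented above.
OLLAMA_URL="${OLLAMA_URL:-http://ollama:11434}"

# Build the URL of Ollama's model-listing endpoint (GET /api/tags),
# tolerating a trailing slash on the base URL.
ollama_tags_url() {
    printf '%s/api/tags\n' "${1%/}"
}

# Return 0 if the Ollama API answers within 5 seconds, non-zero otherwise.
check_ollama() {
    curl -fsS --max-time 5 "$(ollama_tags_url "$OLLAMA_URL")" >/dev/null
}
```

Usage: `check_ollama && echo "Ollama reachable"`.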
1.2 Configured Local Models
| Model Name | Size | Target Use Case | Status |
|---|---|---|---|
| gemma4-4b-checkin-fast:latest | 4B | Rapid requests, fast checkin | ✅ Active |
| gemma4:e4b | 4B | General reasoning tasks | ✅ Active |
| mistral:7b | 7B | Coding and technical assistance | ✅ Active |
| qwen3:8b | 8B | General reasoning and analysis | ✅ Active |
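To confirm that the models in the table are actually present on the host, the output of `ollama list` can be compared against the expected names. This is a sketch under the assumption that the first column of `ollama list` is the model name and the first row is a header; `EXPECTED_MODELS` and `missing_models` are hypothetical names, not part of Ollama.

```bash
#!/bin/sh
# Model names taken from the table above.
EXPECTED_MODELS="gemma4-4b-checkin-fast:latest gemma4:e4b mistral:7b qwen3:8b"

# Read `ollama list` output on stdin and print every expected model
# that does not appear in the NAME column.
missing_models() {
    # Skip the header row and keep only the first column.
    installed=$(awk 'NR > 1 { print $1 }')
    for model in $EXPECTED_MODELS; do
        printf '%s\n' "$installed" | grep -Fqx -- "$model" || printf '%s\n' "$model"
    done
}
```

Usage: `ollama list | missing_models` prints nothing when all four models are installed.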
1.3 Google Gemini Integration
- ✅ Bound Google AI Studio API Key.
- ✅ Configured OpenAI-compatible API connector endpoint.
- ✅ All core Gemini models are fully accessible.
API Connector Parameters:
yaml
Base URL: https://generativelanguage.googleapis.com/v1beta/openai/
API Key: AIzaSyC... [REDACTED FOR SECURITY]
Provider Type: OpenAI compatible
Accessible Gemini Models:
- gemini-2.0-flash-exp (Recommended for fast responses)
- gemini-1.5-pro (Deep logical analysis)
- gemini-1.5-flash (Balanced performance/speed)
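Outside Open WebUI, the same connector parameters can be exercised with `curl`. The sketch below assumes the key is exported as `GEMINI_API_KEY` (an assumed variable name) and is sent as a bearer token, as the OpenAI-compatible gateway expects. `gemini_url` and `gemini_list_models` are illustrative helpers.

```bash
#!/bin/sh
# Base URL as documented above (note the trailing slash).
GEMINI_BASE_URL="https://generativelanguage.googleapis.com/v1beta/openai/"

# Join the base URL with a relative path such as "models" or "chat/completions".
gemini_url() {
    printf '%s%s\n' "$GEMINI_BASE_URL" "$1"
}

# List the models visible to the API key.
# Aborts with a message if GEMINI_API_KEY is unset.
gemini_list_models() {
    curl -fsS --max-time 10 \
        -H "Authorization: Bearer ${GEMINI_API_KEY:?set GEMINI_API_KEY first}" \
        "$(gemini_url models)"
}
```

Usage: `GEMINI_API_KEY=... gemini_list_models` should return a JSON list that includes the models named above if the key is valid.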
2. Key Learnings & Best Practices
2.1 Model Selection Strategy
Local Hosting vs. Cloud Hosting:
- Local (Ollama): High privacy (zero third-party data transmission), zero running API costs, offline operational capacity.
- Cloud (Gemini): Maximum capabilities, search grounding, usage-based API costs.
Operational Recommendations:
- Simple Requests ---> Local Models (gemma4-4b-checkin-fast:latest, mistral:7b)
- Multi-step / Hard reasoning ---> gemini-1.5-pro or qwen3:8b
- Rapid Cloud queries ---> gemini-2.0-flash-exp
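The recommendations above amount to a small lookup table. As a sketch only: the task categories (`simple`, `coding`, `reasoning`, `fast-cloud`) are hypothetical labels introduced for illustration, and where the text offers two candidates the first one is used (mistral:7b for `coding` follows the model table).

```bash
#!/bin/sh
# Map a task category to a model name following the recommendations above.
# Unknown categories produce an error on stderr and a non-zero status.
route_model() {
    case "$1" in
        simple)     echo "gemma4-4b-checkin-fast:latest" ;;
        coding)     echo "mistral:7b" ;;
        reasoning)  echo "gemini-1.5-pro" ;;
        fast-cloud) echo "gemini-2.0-flash-exp" ;;
        *)          echo "unknown task category: $1" >&2; return 1 ;;
    esac
}
```

Usage: `ollama run "$(route_model coding)"` would start the local coding model; the cloud entries only make sense against the Gemini gateway.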
2.2 System Prompts Tuning
System prompts can be set on a per-model basis: Modelling Dashboard ---> Select Model ---> Edit ---> System Prompt
System Prompt Example for gemma4-4b-checkin-fast:latest (Web Search Optimization):
text
You rely EXCLUSIVELY on provided web search results to answer the query.
Synthesize the facts concisely and cite facts exactly.
System Prompt Example for Gemini 2.0 Flash:
text
You are a highly precise and structured assistant.
Respond concisely, utilizing markdown for optimal readability.
2.3 Connections Routing
Settings ---> Connections:
- OpenAI API: For external cloud engines (Gemini, OpenAI).
- Ollama API: For local containerized models.
- Direct Connectors: For dedicated custom model API endpoints.
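This project configured the connections through the settings panel. As an alternative sketch, Open WebUI also reads the environment variables `OLLAMA_BASE_URL`, `OPENAI_API_BASE_URL` and `OPENAI_API_KEY` at container start. The helper `webui_env` is hypothetical, and values saved later in the UI may take precedence over these variables.

```bash
#!/bin/sh
# Print an env file for `docker run --env-file`, one KEY=VALUE per line.
# $1 = Ollama URL, $2 = OpenAI-compatible base URL, $3 = API key.
webui_env() {
    printf 'OLLAMA_BASE_URL=%s\n' "$1"
    printf 'OPENAI_API_BASE_URL=%s\n' "$2"
    printf 'OPENAI_API_KEY=%s\n' "$3"
}
```

Usage: `webui_env http://ollama:11434 https://generativelanguage.googleapis.com/v1beta/openai/ "$GEMINI_API_KEY" > webui.env`, keeping the resulting file out of version control because it contains the key.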
2.4 Performance Tuning
- Enable base models list caching: Speeds up model listings and GUI loading.
- Model Size Rules of Thumb:
  - 3B-7B: Fast responses, lower conceptual reasoning.
  - 8B-14B: Excellent balance of speed and logic.
  - 70B+: Extremely accurate, high computing demands.
3. Current Project State
3.1 Active Infrastructure Components
- ✅ Open WebUI: Modern web interface.
- ✅ Ollama: Local LLM compute host.
- ✅ Google Gemini API: Cloud-grounded reasoning.
3.2 Available LLMs Matrix
Local Catalog (Ollama):
- gemma4-4b-checkin-fast:latest
- gemma4:e4b
- mistral:7b
- qwen3:8b
Cloud Catalog (Google Gemini):
- gemini-2.0-flash-exp
- gemini-1.5-pro
- gemini-1.5-flash
3.3 Open Backlog
- ⏸️ Fine-tune system prompts on all models.
- ⏸️ Evaluate qwen3:8b performance balance.
- ⏸️ Integrate workflow automation with n8n.
- ⏸️ Evaluate additional API providers (Anthropic Claude, OpenAI).
4. Technical CLI & API Reference
4.1 Connection Endpoints
Ollama Local Engine:
text
http://ollama:11434
Google Gemini Gateway:
text
https://generativelanguage.googleapis.com/v1beta/openai/
4.2 Essential Commands
Ollama Model Administration:
bash
# List all models downloaded on the host
ollama list
# Pull a new model from Ollama registry
ollama pull qwen3:8b
# Remove a model
ollama rm mistral:7b
# Test a model directly inside host terminal
ollama run gemma4-4b-checkin-fast:latest
4.3 Troubleshooting Guide
Error: Ollama API is Unreachable
Remediation steps:
- Check Ollama docker service state.
- Ensure the connection URL matches: http://ollama:11434.
- Check local firewall rules (UFW ports).
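The three remediation steps can be narrowed down by looking at how a probe request fails. The sketch below maps a few well-known `curl` exit codes (6: host not resolved, 7: could not connect, 28: timeout) to the step they most likely point to. The mapping to causes is a heuristic, not a definitive diagnosis, and both function names are illustrative.

```bash
#!/bin/sh
# Translate a curl exit code into the remediation step it most likely points to.
explain_curl_exit() {
    case "$1" in
        0)  echo "reachable" ;;
        6)  echo "hostname not resolved: check the connection URL and Docker network" ;;
        7)  echo "could not connect: check that the Ollama service is running" ;;
        28) echo "timed out: check firewall rules" ;;
        *)  echo "curl exit code $1: see the curl manual" ;;
    esac
}

# Probe the Ollama API (default: the documented endpoint) and explain the result.
diagnose_ollama() {
    curl -fsS --max-time 5 "${1:-http://ollama:11434}/api/tags" >/dev/null 2>&1
    explain_curl_exit "$?"
}
```

Usage: run `diagnose_ollama` from the machine or container that hosts Open WebUI, since that is where the hostname must resolve.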
Error: Gemini Cloud API Connection Failure
Remediation steps:
- Re-verify the API key validity in Google AI Studio.
- Ensure the gateway URL ends with a trailing slash (/).
- Verify the provider type parameter is set to "OpenAI".
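The trailing-slash rule is easy to enforce before a URL is pasted into the connection settings. A minimal sketch, with `ensure_trailing_slash` as a hypothetical helper name:

```bash
#!/bin/sh
# Print the given URL with exactly one trailing slash appended if it is missing.
ensure_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}
```

Usage: `ensure_trailing_slash "https://generativelanguage.googleapis.com/v1beta/openai"` prints the gateway URL in the form documented above.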
5. Next Steps Roadmap
- Optimize system prompts across all models.
- Conduct task alignment testing (determine optimal model assignments).
- Monitor cloud API costs dynamically.
- Evaluate advanced models (Anthropic Claude, GPT-4).
- Secure backup routines for Open WebUI user database volumes.
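For the backup item, one possible starting point is archiving the data volume with a throwaway container. This is a sketch under loud assumptions: the volume is assumed to be named `open-webui`, Open WebUI is assumed to run in Docker, and `WEBUI_VOLUME`, `backup_name` and `backup_webui` are hypothetical names. Stopping the Open WebUI container first is safer if the database is being written to.

```bash
#!/bin/sh
# Assumption: the Open WebUI data lives in a Docker volume with this name.
WEBUI_VOLUME="${WEBUI_VOLUME:-open-webui}"

# Build the archive file name for a given date string (e.g. 2026-01-20).
backup_name() {
    printf 'open-webui-%s.tar.gz\n' "$1"
}

# Archive the volume read-only into the current directory.
backup_webui() {
    archive=$(backup_name "$(date +%Y-%m-%d)")
    docker run --rm \
        -v "$WEBUI_VOLUME":/data:ro \
        -v "$(pwd)":/backup \
        alpine tar czf "/backup/$archive" -C /data .
}
```

Usage: `backup_webui` writes a dated `.tar.gz` file next to where it is called.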
6. Knowledge Base Prompt
For future system reference:
text
Open WebUI Setup - State: Jan 20, 2026
Active components:
- Ollama (local): gemma4-4b-checkin-fast:latest, gemma4:e4b, mistral:7b, qwen3:8b
- Google Gemini API: gemini-2.0-flash-exp, gemini-1.5-pro, gemini-1.5-flash
Endpoints & APIs:
- Ollama Connection: http://ollama:11434
- Gemini Connection: https://generativelanguage.googleapis.com/v1beta/openai/
- API Key: Registered in Open WebUI connections panel
Model Routing Guidelines:
- Standard / Web queries: gemma4-4b-checkin-fast:latest (local)
- Programming / Debugging: mistral:7b (local)
- Hard reasoning / Deep analysis: qwen3:8b (local) or gemini-1.5-pro (cloud)
- Fast cloud tasks: gemini-2.0-flash-exp
Documentation compiled on: 20.01.2026
Version: 1.0
Author: Christian Friedrich Schacht