Open WebUI Setup - Project Documentation

Date: January 20, 2026
Project: Open WebUI Installation & Configuration
Goal: Local LLM environment with cloud integration

1. Accomplished Steps

1.1 Ollama Installation (Local)

✅ Ollama installed on the local system.
✅ Successfully established connection to Open WebUI (http://ollama:11434).
✅ Local models are fully operational and accessible.

1.2 Configured Local Models

Model Name	Size	Target Use Case	Status
gemma4-4b-checkin-fast:latest	4B	Rapid requests, fast checkin	✅ Active
gemma4:e4b	4B	General reasoning tasks	✅ Active
mistral:7b	7B	Coding and technical assistance	✅ Active
qwen3:8b	8B	General reasoning and analysis	✅ Active

1.3 Google Gemini Integration

✅ Bound Google AI Studio API Key.
✅ Configured OpenAI-compatible API connector endpoint.
✅ All core Gemini models are fully accessible.

API Connector Parameters:

yaml

Base URL: https://generativelanguage.googleapis.com/v1beta/openai/
API Key: AIzaSyC... [REDACTED FOR SECURITY]
Provider Type: OpenAI compatible

Accessible Gemini Models:

gemini-2.0-flash-exp (Recommended for fast responses)
gemini-1.5-pro (Deep logical analysis)
gemini-1.5-flash (Balanced performance/speed)

2. Key Learnings & Best Practices

2.1 Model Selection Strategy

Local Hosting vs. Cloud Hosting:

Local (Ollama): High privacy (zero third-party data transmission), zero running API costs, offline operational capacity.
Cloud (Gemini): Maximum capabilities, search grounding, standard API volume costs.

Operational Recommendations:

Simple Requests ---> Local Models (gemma4-4b-checkin-fast:latest, mistral:7b)
Multi-step / Hard reasoning ---> gemini-1.5-pro or qwen3:8b
Rapid Cloud queries ---> gemini-2.0-flash-exp

2.2 System Prompts Tuning

System prompts can be set on a per-model basis: Modelling Dashboard ---> Select Model ---> Edit ---> System Prompt

System Prompt Example for gemma4-4b-checkin-fast:latest (Web Search Optimization):

text

You rely EXCLUSIVELY on provided web search results to answer the query.
Synthesize the facts concisely and cite facts exactly.

System Prompt Example for Gemini 2.0 Flash:

text

You are a highly precise and structured assistant.
Respond concisely, utilizing markdown for optimal readability.

2.3 Connections Routing

Settings ---> Connections:

OpenAI API: For external cloud engines (Gemini, OpenAI).
Ollama API: For local containerized models.
Direct Connectors: For dedicated custom model API endpoints.

2.4 Performance Tuning

Enable base models list caching: Speeds up model listings and GUI loading.
Model Size Rules of Thumb:
- 3B-7B: Fast responses, lower conceptual reasoning.
- 8B-14B: Excellent balance of speed and logic.
- 70B+: Extremely accurate, high computing demands.

3. Current Project State

3.1 Active Infrastructure Components

✅ Open WebUI: Modern web interface.
✅ Ollama: Local LLM compute host.
✅ Google Gemini API: Cloud-grounded reasoning.

3.2 Available LLMs Matrix

Local Catalog (Ollama):

gemma4-4b-checkin-fast:latest
gemma4:e4b
mistral:7b
qwen3:8b

Cloud Catalog (Google Gemini):

gemini-2.0-flash-exp
gemini-1.5-pro
gemini-1.5-flash

3.2 Open Backlog

⏸️ Fine-tune system prompts on all models.
⏸️ Evaluate qwen3:8b performance balance.
⏸️ Integrate workflow automation with n8n.
⏸️ Evaluate additional API providers (Anthropic Claude, OpenAI).

4. Technical CLI & API Reference

4.1 Connection Endpoints

Ollama Local Engine:

text

http://ollama:11434

Google Gemini Gateway:

text

https://generativelanguage.googleapis.com/v1beta/openai/

4.2 Essential Commands

Ollama Model Administration:

bash

# List all active models downloaded on host
ollama list

# Pull a new model from Ollama registry
ollama pull qwen3:8b

# Remove a model
ollama rm mistral:7b

# Test a model directly inside host terminal
ollama run gemma4-4b-checkin-fast:latest

4.3 Troubleshooting Guide

Error: Ollama API is Unreachable
Remediation steps:

Check Ollama docker service state.
Ensure the connection URL matches: http://ollama:11434.
Check local firewall rules (UFW ports).

Error: Gemini Cloud API Connection Failure
Remediation steps:

Re-verify the API key validity in Google AI Studio.
Ensure the gateway URL ends with a trailing slash (/).
Verify the provider type parameter is set to "OpenAI".

5. Next Steps Roadmap

Optimize system prompts across all models.
Conduct task alignment testing (determine optimal model assignments).
Monitor cloud API costs dynamically.
Evaluate advanced models (Anthropic Claude, GPT-4).
Secure backup routines for Open WebUI user database volumes.

6. Knowledge Base Prompt

For future system reference:

text

Open WebUI Setup - State: Jan 20, 2026

Active components:
- Ollama (local): gemma4-4b-checkin-fast:latest, gemma4:e4b, mistral:7b, qwen3:8b
- Google Gemini API: gemini-2.0-flash-exp, gemini-1.5-pro, gemini-1.5-flash

Endpoints & APIs:
- Ollama Connection: http://ollama:11434
- Gemini Connection: https://generativelanguage.googleapis.com/v1beta/openai/
- API Key: Registered in Open WebUI connections panel

Model Routing Guidelines:
- Standard / Web queries: gemma4-4b-checkin-fast:latest (local)
- Programming / Debugging: mistral:7b (local)
- Hard reasoning / Deep analysis: qwen3:8b (local) or gemini-1.5-pro (cloud)
- Fast cloud tasks: gemini-2.0-flash-exp

Documentation compiled on: 20.01.2026
Version: 1.0
Author: Christian Friedrich Schacht

Open WebUI Setup - Project Documentation ​

1. Accomplished Steps ​

1.1 Ollama Installation (Local) ​

1.2 Configured Local Models ​

1.3 Google Gemini Integration ​

2. Key Learnings & Best Practices ​

2.1 Model Selection Strategy ​

2.2 System Prompts Tuning ​

2.3 Connections Routing ​

2.4 Performance Tuning ​

3. Current Project State ​

3.1 Active Infrastructure Components ​

3.2 Available LLMs Matrix ​

3.2 Open Backlog ​

4. Technical CLI & API Reference ​

4.1 Connection Endpoints ​

4.2 Essential Commands ​

4.3 Troubleshooting Guide ​

5. Next Steps Roadmap ​

6. Knowledge Base Prompt ​

Open WebUI Setup - Project Documentation

1. Accomplished Steps

1.1 Ollama Installation (Local)

1.2 Configured Local Models

1.3 Google Gemini Integration

2. Key Learnings & Best Practices

2.1 Model Selection Strategy

2.2 System Prompts Tuning

2.3 Connections Routing

2.4 Performance Tuning

3. Current Project State

3.1 Active Infrastructure Components

3.2 Available LLMs Matrix

3.2 Open Backlog

4. Technical CLI & API Reference

4.1 Connection Endpoints

4.2 Essential Commands

4.3 Troubleshooting Guide

5. Next Steps Roadmap

6. Knowledge Base Prompt