
Open WebUI: The Self-Hosted AI Interface That Does More Than Chat

Development · 2026-02-14 · 5 min read · Tags: open-webui, llm, ai, rag, self-hosted-ai, chatgpt-alternative
By the Selfhosted Guides Editorial Team, self-hosting practitioners covering open source software, home lab infrastructure, and data sovereignty.

If you have followed our guide on running Ollama locally, you already know Open WebUI as a chat interface for local models. But reducing it to "a ChatGPT skin for Ollama" misses most of what it does. Open WebUI has grown into a full AI platform -- a self-hosted gateway that connects to multiple backends, runs retrieval-augmented generation pipelines, supports tool calling, and provides granular user management. It is the closest thing to a self-hosted ChatGPT Teams deployment that actually works.


Beyond Ollama: Multi-Backend Support

The most underappreciated feature of Open WebUI is that it is not tied to Ollama. You can connect it to any OpenAI-compatible API endpoint, which means you can run a single Open WebUI instance that gives your team access to:

  - Local models served by Ollama
  - Hosted models from the OpenAI API
  - Self-hosted inference servers such as vLLM, or any other OpenAI-compatible endpoint

All of these show up in the same model dropdown. Your users pick the best model for their task without needing separate accounts or interfaces.

Configuring Multiple Backends

In the Admin panel under Settings > Connections, add each backend:

# Ollama (local)
URL: http://ollama:11434

# OpenAI
URL: https://api.openai.com/v1
API Key: sk-...

# Self-hosted vLLM
URL: http://vllm-server:8000/v1
API Key: token-if-needed

Each connection can have its own API key, and models from all backends appear in the unified model list.
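For illustration, here is a minimal Python sketch of how the unified dropdown comes together -- the backend URLs and model names are invented examples, and this merging happens inside Open WebUI, not in code you write yourself:

```python
# Sketch: how models from multiple OpenAI-compatible connections merge
# into one list. All data here is illustrative, not queried live.

connections = {
    "ollama": {"url": "http://ollama:11434", "models": ["llama3.1:8b", "mistral:7b"]},
    "openai": {"url": "https://api.openai.com/v1", "models": ["gpt-4o", "gpt-4o-mini"]},
    "vllm":   {"url": "http://vllm-server:8000/v1", "models": ["meta-llama/Llama-3.1-70B"]},
}

def unified_model_list(connections):
    """Flatten every backend's models into the single dropdown users see."""
    entries = []
    for backend, info in connections.items():
        for model in info["models"]:
            entries.append({"id": model, "backend": backend, "url": info["url"]})
    return entries

for entry in unified_model_list(connections):
    print(f'{entry["id"]:30s} via {entry["backend"]}')
```

When a user picks a model, Open WebUI routes the request to whichever connection owns it, so the per-connection API keys never leave the server.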

RAG: Chat With Your Documents

Open WebUI's RAG (Retrieval-Augmented Generation) pipeline lets users upload documents and ask questions about them. This is not a toy demo -- it uses proper chunking, vector embeddings, and retrieval to ground model responses in your actual data.

How It Works

  1. Upload a PDF, markdown file, or text document to a conversation or a shared knowledge collection
  2. Open WebUI chunks the document and generates embeddings using a configurable embedding model
  3. When you ask a question, it retrieves relevant chunks and includes them in the model's context
  4. The model answers based on the retrieved content, with citations
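The flow above can be sketched in a few lines of Python. This is a toy stand-in -- bag-of-words counts instead of a real embedding model, and a list instead of a vector store -- not Open WebUI's actual implementation:

```python
# Toy sketch of the RAG retrieval flow: chunk, "embed", retrieve by cosine
# similarity. A bag-of-words Counter stands in for real embeddings.
import math
from collections import Counter

def chunk(text, size=40, overlap=10):
    """Split text into overlapping character windows (cf. CHUNK_SIZE/CHUNK_OVERLAP)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

doc = "Open WebUI supports RAG. Uploaded documents are chunked. Chunks are embedded and retrieved by similarity."
index = [(c, embed(c)) for c in chunk(doc)]

query = embed("how are documents chunked")
top = max(index, key=lambda item: cosine(query, item[1]))
# The best-matching chunk would be prepended to the model's context.
print(top[0])
```

In the real pipeline, the embedding model (configured below) replaces the word counts, and a vector database replaces the linear scan, but the shape of the computation is the same.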

Docker Compose With RAG Dependencies

services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    volumes:
      - open_webui_data:/app/backend/data
    environment:
      OLLAMA_BASE_URL: http://ollama:11434
      # RAG configuration
      RAG_EMBEDDING_MODEL: "nomic-embed-text:latest"
      CHUNK_SIZE: "1000"
      CHUNK_OVERLAP: "200"
      RAG_TOP_K: "5"
      # Optional: use OpenAI embeddings instead
      # RAG_OPENAI_API_BASE_URL: https://api.openai.com/v1
      # RAG_OPENAI_API_KEY: sk-...
      # RAG_EMBEDDING_ENGINE: openai
      # RAG_EMBEDDING_MODEL: text-embedding-3-small
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama_data:
  open_webui_data:
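The CHUNK_SIZE and CHUNK_OVERLAP values interact in a way that is easy to underestimate: because consecutive chunks share 200 characters, each new chunk only advances 800 characters of fresh text. A quick back-of-envelope model (assuming a simple sliding-window chunker, which real chunkers only approximate):

```python
# With a 200-char overlap, each chunk advances only 800 characters, so a
# document needs more chunks (and more embeddings) than len(doc) / CHUNK_SIZE
# would suggest.
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 200
stride = CHUNK_SIZE - CHUNK_OVERLAP  # 800 characters of new text per chunk

def chunk_count(doc_length):
    """Estimated number of chunks for a document of doc_length characters."""
    if doc_length <= CHUNK_SIZE:
        return 1
    # First chunk covers CHUNK_SIZE chars; each later chunk adds `stride` more.
    remaining = doc_length - CHUNK_SIZE
    return 1 + -(-remaining // stride)  # ceiling division

print(chunk_count(10_000))  # 1 + ceil(9000/800) = 13
```

Larger overlap improves retrieval continuity at the cost of more embeddings to compute and store, which matters once users start uploading long documents.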

Pull the embedding model after startup:

docker exec ollama ollama pull nomic-embed-text:latest
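If the Ollama port is published on the host as in the compose file above, you can sanity-check that the embedding model responds by calling Ollama's embeddings endpoint directly:

```shell
# Should return a JSON object with an "embedding" array of floats
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text:latest", "prompt": "hello world"}'
```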

Knowledge Collections

Beyond per-conversation uploads, you can create persistent Knowledge collections in the workspace. These are shared document sets that any conversation can reference. This is useful for team knowledge bases -- upload your internal docs, runbooks, or code documentation once, and every team member can query them.

User Management and RBAC

Open WebUI has a proper multi-user system with role-based access control:

  - Admins manage users, backend connections, and which models are exposed
  - Users get their own conversation history and workspace
  - Pending accounts wait for admin approval before they can sign in

Key Admin Settings

# Environment variables for user management
ENABLE_SIGNUP: "true"          # Allow new registrations
DEFAULT_USER_ROLE: "pending"   # Require admin approval
ENABLE_LOGIN_FORM: "true"      # Show email/password login
WEBUI_AUTH: "true"             # Require authentication

You can also configure OAuth/OIDC for SSO integration with Authentik, Keycloak, or any other identity provider:

ENABLE_OAUTH_SIGNUP: "true"
OAUTH_CLIENT_ID: "open-webui"
OAUTH_CLIENT_SECRET: "your-secret"
OAUTH_PROVIDER_NAME: "Authentik"
OPENID_PROVIDER_URL: "https://auth.example.com/application/o/open-webui/.well-known/openid-configuration"

This is what makes Open WebUI viable for teams. Each user gets their own conversation history, and admins control which models are available.


Pipelines and Functions

Open WebUI's pipeline system lets you extend its behavior with custom Python functions. Pipelines sit between the user's message and the model, allowing you to:

  - Rewrite or filter user messages before they reach the model
  - Call external APIs and inject the results into the context
  - Post-process model output before it reaches the user
  - Enforce per-user rate limits or content policies

Example: Web Search Pipeline

Install the web search function from the community hub (accessible in the Admin panel), or write your own:

class Pipeline:
    def __init__(self):
        self.name = "Web Search"

    async def pipe(self, body, __user__):
        # Extract the latest user message from the request body
        messages = body.get("messages", [])
        query = messages[-1]["content"] if messages else ""
        # Call a search API with `query` and collect results (omitted here)
        results = ""
        # Inject results into the context as a system message, then return
        # the augmented request so the model answers with them available
        if results:
            messages.insert(0, {"role": "system", "content": results})
        return body

The pipeline system is Open WebUI's most powerful feature and what separates it from simple chat wrappers.

Model Customization

Beyond selecting models, Open WebUI lets you create custom model profiles called Modelfiles. These combine a base model with:

  - A custom system prompt
  - Default parameters such as temperature and context length
  - Attached knowledge collections for RAG
  - Enabled tools or functions

This lets you create purpose-built assistants -- a "Code Reviewer" that uses a coding model with strict formatting instructions, a "Research Assistant" that always searches the web, or a "Company FAQ Bot" that draws from your knowledge base.
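Conceptually, a profile bundles a base model, a system prompt, parameters, and attached knowledge into one selectable entry. The sketch below is purely illustrative -- Open WebUI configures this through the UI, and the field names, model tag, and collection name here are hypothetical:

```python
# Illustrative only: what a "Code Reviewer" profile bundles conceptually.
# Field names, the model tag, and the collection name are hypothetical.
code_reviewer = {
    "base_model": "qwen2.5-coder:14b",        # assumed local coding model
    "system_prompt": (
        "You are a strict code reviewer. Comment only on correctness, "
        "security, and readability. Use a numbered list."
    ),
    "params": {"temperature": 0.2, "num_ctx": 8192},
    "knowledge": ["engineering-style-guide"],  # knowledge collection to search
}

def to_chat_request(profile, user_message):
    """Expand a profile into an OpenAI-style chat completion payload."""
    return {
        "model": profile["base_model"],
        "messages": [
            {"role": "system", "content": profile["system_prompt"]},
            {"role": "user", "content": user_message},
        ],
        **profile["params"],
    }

req = to_chat_request(code_reviewer, "Review this function: ...")
print(req["model"], len(req["messages"]))
```

To the end user the profile simply appears as another model in the dropdown, with all of this wiring applied behind the scenes.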

Practical Deployment Tips

Reverse Proxy Configuration

Behind Nginx or Caddy, make sure WebSocket connections work. Open WebUI uses them for streaming responses:

# Caddy example
ai.example.com {
    reverse_proxy open-webui:8080
}

Caddy handles WebSocket upgrades automatically. For Nginx, add the standard WebSocket proxy headers.
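For Nginx, those standard WebSocket proxy headers look like this (server name and upstream taken from the Caddy example above):

```nginx
server {
    server_name ai.example.com;

    location / {
        proxy_pass http://open-webui:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```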

Persistent Storage

The /app/backend/data volume contains everything: user accounts, conversation history, uploaded documents, and vector embeddings. Back this up regularly. A corrupted or lost data volume means losing all conversations and RAG knowledge.
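One way to take a snapshot is to archive the named volume while the container is stopped. Note that Docker Compose usually prefixes volume names with the project name, so check `docker volume ls` for the exact name first:

```shell
# Archive the data volume to the current directory (adjust the volume
# name to match what `docker volume ls` reports for your project)
docker compose stop open-webui
docker run --rm -v open_webui_data:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/open-webui-$(date +%F).tar.gz -C /data .
docker compose start open-webui
```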

Resource Considerations

Open WebUI itself is lightweight -- the heavy lifting happens in Ollama or whatever backend you connect. The main resource consumers are:

  - The inference backend itself (GPU/CPU and RAM for the models)
  - Embedding generation when documents are ingested for RAG
  - The vector database and uploaded files accumulating in the data volume

Updates

Open WebUI ships new features frequently. Update with:

docker compose pull open-webui
docker compose up -d open-webui

Check the changelog before major updates -- database migrations sometimes require attention.

When to Use Open WebUI vs. Alternatives

Open WebUI is best when you want a unified interface for multiple AI backends with team features. It is the right choice for small teams that want a private ChatGPT-like experience.

LibreChat is a strong alternative if you need more advanced conversation branching and preset management. It is also open source and supports multiple backends.

text-generation-webui (oobabooga) is better if you need deep control over model loading, quantization, and inference parameters. It is more of a power-user tool than a team platform.

AnythingLLM focuses more on the RAG and workspace angle, with built-in document management and agent capabilities.

The Bottom Line

Open WebUI has evolved from a simple Ollama frontend into the most capable self-hosted AI interface available. The combination of multi-backend support, RAG pipelines, user management, and custom functions makes it a legitimate platform for teams that want control over their AI tools. If you are already running Ollama, upgrading to a full Open WebUI deployment with RAG and multi-backend support takes about ten minutes and dramatically expands what you can do.
