Re-enable VectorRAG init with lazy retry

Personal Docs (POST /api/personal/add_directory and friends) currently
returns HTTP 503 'RAG system is not available' for every request,
because get_rag_manager() is hardcoded to return None and app.py
hardcodes rag_manager = None. Both were disabled when chromadb 1.4.1 and
pydantic 2.12 were mutually incompatible at the client init layer.

That compat issue is fixed in the current pins (chromadb 1.5.x +
pydantic 2.13.x). Verified by calling the original lazy initializer
against a running chroma server — VectorRAG instantiates, reports
healthy=True, and indexes successfully.

This change:

1. src/rag_singleton.py — replace the hardcoded `return None` in
   get_rag_manager() with the original lazy init body. Keeps the
   30s retry-throttle so a missing chroma server doesn't busy-retry
   on every request.

2. app.py — replace the parallel `rag_manager = None` /
   `rag_available = False` hardcoding with a get_rag_manager() call.
   Logs the resolved state at startup. If chroma isn't reachable yet,
   rag_manager stays None and personal-doc routes still return 503,
   but the *next* request will hit the retry-throttle path in
   get_rag_manager() and try to init again.
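
A minimal sketch of the lazy-init-plus-throttle pattern described in the
two items above. This is not the actual body of src/rag_singleton.py:
the `LazyResource` class, its `factory` argument, and the injectable
`clock` are hypothetical names used only for illustration. Only the 30s
interval comes from this commit.

```python
import time
from typing import Any, Callable, Optional

RETRY_INTERVAL_S = 30.0  # the 30s retry-throttle mentioned in this commit


class LazyResource:
    """Build a resource on first use; after a failure, retry at most once per interval."""

    def __init__(
        self,
        factory: Callable[[], Any],          # hypothetical: e.g. a VectorRAG constructor
        retry_interval: float = RETRY_INTERVAL_S,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._factory = factory
        self._retry_interval = retry_interval
        self._clock = clock
        self._instance: Optional[Any] = None
        self._last_attempt: Optional[float] = None

    def get(self) -> Optional[Any]:
        # Fast path: already initialized.
        if self._instance is not None:
            return self._instance
        now = self._clock()
        # Throttle: a recent failed attempt means we skip init and report "unavailable".
        if self._last_attempt is not None and now - self._last_attempt < self._retry_interval:
            return None
        self._last_attempt = now
        try:
            self._instance = self._factory()
        except Exception:
            # Backend not reachable yet; callers see None and can answer 503.
            self._instance = None
        return self._instance
```

A route handler would call `get()` on each request and return 503 when it
yields None, so init is re-attempted once the interval has elapsed without
hammering an unreachable server.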

Doesn't touch requirements.txt. Docker Compose deployments get chroma
automatically; manual installs that want Personal Docs to work still
need to either pip install chromadb (the full package) and run
`chroma run`, or point at an external chroma instance via env. That can
be documented in a follow-up README / requirements-optional note.
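
For the "external chroma instance via env" case, a rough sketch of how
the host and port could be resolved. The variable names `CHROMA_HOST`
and `CHROMA_PORT`, the helper `resolve_chroma_endpoint`, and the
`localhost:8000` defaults are assumptions for illustration; this commit
does not say which env vars the project actually reads.

```python
import os
from typing import Mapping, Tuple


def resolve_chroma_endpoint(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Return (host, port) for the chroma server, falling back to local defaults."""
    # Hypothetical variable names; the real project may use different ones.
    host = env.get("CHROMA_HOST", "localhost")
    port = int(env.get("CHROMA_PORT", "8000"))
    return host, port
```

Passing the mapping explicitly keeps the helper easy to exercise without
mutating the real process environment.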

Author: LittleLlama
Date: 2026-05-31 22:32:13 -07:00
Committed by: GitHub
Parent: 93d3cc49c2
Commit: 7e7e441fec
2 changed files with 33 additions and 17 deletions

app.py (29 lines changed)

@@ -355,15 +355,26 @@ async def serve_generated_image(filename: str, request: Request):
 from services.youtube import init_youtube
 init_youtube()
-# ========= RAG (vector document RAG — DISABLED) =========
-# VectorRAG (ChromaDB-backed personal-document semantic search) is unused
-# (0 directories ever indexed) and its chromadb 1.4.1 / pydantic 2.12 client
-# can't even instantiate — it threw at init and cost ~30s of startup waiting on
-# the embedding probe. Disabled. All callers already guard on rag_available /
-# `if rag_manager`, so personal-doc routes degrade cleanly.
-rag_manager = None
-rag_available = False
-logger.info("Vector document RAG disabled (unused)")
+# ========= RAG (vector document RAG) =========
+# VectorRAG (ChromaDB-backed personal-document semantic search). Initialized
+# lazily via get_rag_manager() — returns None if ChromaDB isn't reachable
+# (no server running on the configured host:port), in which case personal-doc
+# routes return a clean 503 instead of busy-retrying every request.
+#
+# Note: this was previously hardcoded off because chromadb 1.4.1 / pydantic
+# 2.12 were mutually incompatible at the time. With the current pins
+# (chromadb 1.5.x + pydantic 2.13.x) the init works and Personal Docs
+# (POST /api/personal/add_directory etc.) is functional again.
+from src.rag_singleton import get_rag_manager
+rag_manager = get_rag_manager()
+rag_available = rag_manager is not None
+if rag_available:
+    logger.info("Vector document RAG initialized")
+else:
+    logger.info(
+        "Vector document RAG not available at startup "
+        "(ChromaDB may not be reachable yet — routes will retry lazily)"
+    )
 # ========= IMPORT CONFIG =========
 from src.config import config