Re-enable VectorRAG init with lazy retry

Personal Docs (POST /api/personal/add_directory and friends) currently
returns HTTP 503 'RAG system is not available' for every request,
because get_rag_manager() and rag_manager are both hardcoded off. They
were disabled when chromadb 1.4.1 and pydantic 2.12 were mutually
incompatible at the client init layer.

That compat issue is fixed in the current pins (chromadb 1.5.x +
pydantic 2.13.x). Verified by calling the original lazy initializer
against a running chroma server — VectorRAG instantiates, reports
healthy=True, and indexes successfully.

This change:

1. src/rag_singleton.py — replace the hardcoded `return None` in
   get_rag_manager() with the original lazy init body. Keeps the
   30s retry-throttle so a missing chroma server doesn't busy-retry
   on every request.

2. app.py — replace the parallel `rag_manager = None` /
   `rag_available = False` hardcoding with a get_rag_manager() call.
   Logs the resolved state at startup. If chroma isn't reachable yet,
   rag_manager stays None and personal-doc routes still return 503,
   but once the 30s throttle has elapsed, the next request goes through
   get_rag_manager() and tries to init again.
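
For reviewers, the throttled lazy-init pattern described in the two items above can be sketched as follows. This is a minimal illustration, not the code in src/rag_singleton.py: the ThrottledLazyInit class, the injectable factory and clock, and handle_personal_docs_request are hypothetical names introduced here so the behavior can be exercised without a chroma server. One deliberate difference: the sketch tracks "never attempted" as None instead of 0.0, so the first call always attempts init.

```python
import time
from typing import Callable, Optional, Tuple

_RETRY_INTERVAL = 30  # seconds between re-init attempts (value used in this change)


class ThrottledLazyInit:
    """Hypothetical helper: build an object lazily, throttling failed attempts."""

    def __init__(
        self,
        factory: Callable[[], object],
        retry_interval: float = _RETRY_INTERVAL,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._factory = factory              # stands in for the VectorRAG constructor
        self._retry_interval = retry_interval
        self._clock = clock                  # injectable so tests can control time
        self._instance: Optional[object] = None
        self._last_attempt: Optional[float] = None  # None means "never attempted"

    def get(self) -> Optional[object]:
        # Fast path: a previous attempt succeeded, reuse the instance.
        if self._instance is not None:
            return self._instance
        now = self._clock()
        # A recent attempt failed: return None instead of retrying immediately.
        if self._last_attempt is not None and now - self._last_attempt < self._retry_interval:
            return None
        self._last_attempt = now
        try:
            self._instance = self._factory()
        except Exception:
            # Backend not reachable or not installed: stay unavailable for now.
            self._instance = None
        return self._instance


def handle_personal_docs_request(manager: ThrottledLazyInit) -> Tuple[int, str]:
    """Framework-agnostic stand-in for a personal-doc route (assumed shape)."""
    rag = manager.get()
    if rag is None:
        return 503, "RAG system is not available"
    return 200, "ok"
```

With this shape, a request that arrives while the backend is down costs at most one failed init per retry interval; every other request in that window returns 503 without touching the backend.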

Doesn't touch requirements.txt. Deployments using docker-compose get
chroma automatically; manual installs that want Personal Docs to work
still need to either pip install chromadb (full package) and run
`chroma run`, or point at an external chroma instance via env. That can
be a follow-up README / requirements-optional note.
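
As a rough illustration of the "external chroma instance via env" option mentioned above, a manual install could resolve the endpoint like this. The environment variable names CHROMA_HOST and CHROMA_PORT and both helper functions are assumptions made for this sketch; the names this project actually reads are not shown in this change. The localhost:8000 fallback matches the usual default of `chroma run`.

```python
import os
from typing import Mapping, Tuple


def chroma_endpoint_from_env(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Resolve (host, port) for a chroma server.

    CHROMA_HOST / CHROMA_PORT are hypothetical variable names used only
    for this sketch.
    """
    host = env.get("CHROMA_HOST", "localhost")
    port = int(env.get("CHROMA_PORT", "8000"))
    return host, port


def connect_to_chroma():
    """Open a client against the resolved endpoint (requires chromadb installed)."""
    import chromadb  # imported lazily so the helper above works without the package

    host, port = chroma_endpoint_from_env()
    client = chromadb.HttpClient(host=host, port=port)
    client.heartbeat()  # raises if the server is not reachable
    return client
```

Keeping the chromadb import inside connect_to_chroma() means an install without the package can still load the module and will fail only when a connection is actually attempted.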
LittleLlama
2026-05-31 22:32:13 -07:00
committed by GitHub
parent 93d3cc49c2
commit 7e7e441fec
2 changed files with 33 additions and 17 deletions

@@ -12,16 +12,21 @@ rag_instance = None
 _last_attempt = 0.0
 _RETRY_INTERVAL = 30  # seconds between re-init attempts
 def get_rag_manager():
-    """Disabled: vector document RAG (VectorRAG/ChromaDB) is unused and its
-    client is incompatible with the installed pydantic. Return None so personal-
-    doc routes fall back to non-vector behavior instead of re-attempting (and
-    re-hanging on) a broken ChromaDB init every 30s."""
-    return None
+    """Lazy ChromaDB-backed VectorRAG initializer.
+    Returns the VectorRAG instance on first successful init, None if ChromaDB
+    isn't reachable / available. Failed init attempts are throttled to once
+    per _RETRY_INTERVAL seconds so a missing ChromaDB doesn't busy-retry on
+    every request — callers (personal-doc routes etc.) get None back and
+    return a clean 503 to the user instead.
-def _get_rag_manager_legacy():
-    """Original lazy initializer, kept for reference / easy re-enable."""
+    Historical note: this used to be hardcoded to ``return None`` with a
+    comment about chromadb 1.4.1 / pydantic 2.12 being mutually incompatible.
+    That compat issue is resolved in current pinned versions
+    (chromadb 1.5.x + pydantic 2.13.x), so the real initializer is back.
+    """
     global rag_instance, _last_attempt
     if rag_instance is not None:
@@ -29,7 +34,7 @@ def _get_rag_manager_legacy():
     now = time.monotonic()
     if now - _last_attempt < _RETRY_INTERVAL:
-        return None  # too soon to retry
+        return None  # too soon to retry — last attempt failed
     _last_attempt = now