Topics: hydrate session history before analysis

analyze_topics() iterates session_manager.sessions and reads
session_data.get("history", []) directly. But SessionManager.load_sessions
seeds sessions metadata-only with empty history — messages are loaded
lazily, only when get_session(session_id) is called. As a result,
analyze_topics saw empty history for every session that had not been
individually opened during the current process lifetime and reported
total_topics: 0, even when the database held plenty of matching messages.

Hydrate each candidate session via session_manager.get_session(session_id)
(the existing lazy-load path) before reading its history, after the
owner/archived filters so skipped sessions aren't loaded. Falls back to the
raw cached history when the manager has no get_session (test stubs).
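
A minimal sketch of the hydration step described above, not the exact code
in this commit. resolve_history is a hypothetical helper name; the sketch
assumes get_session returns an object exposing a history attribute, and it
adds a defensive check for a missing history that is an assumption of this
sketch only.

```python
from typing import Any, Dict, List


def resolve_history(session_manager: Any, session_id: str,
                    session_data: Dict[str, Any]) -> List[Any]:
    """Return a session's messages, hydrating via get_session when available."""
    get_session = getattr(session_manager, "get_session", None)
    if callable(get_session):
        hydrated = get_session(session_id)
        # Assumption: the hydrated session exposes .history; tolerate None.
        history = getattr(hydrated, "history", None)
        if history is not None:
            return list(history)
    # Fallback for stubs without get_session: use the raw cached history.
    return list(session_data.get("history", []))
```

Calling this only after the owner/archived filters keeps skipped sessions
from being loaded.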

tests/test_topic_analyzer.py: new test_topic_analyzer_hydrates_sessions
seeds a real SQLite DB with a session + message, runs the real
SessionManager (asserting cached history starts empty), then asserts
analyze_topics finds the topic. Fails before this change. The existing
keyword tests now pass an explicit owner to satisfy the owner-required
early return.
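
The failure mode the new test guards against can be reproduced in
isolation. The sketch below is an illustration under stated assumptions:
LazyManager, count_keyword, build_db, and the table layout are all
hypothetical stand-ins, not the project's SessionManager, analyze_topics,
or schema, and this get_session returns a dict rather than a session
object.

```python
import sqlite3
from typing import Any, Dict


class LazyManager:
    """Hypothetical stand-in: metadata loaded up front, messages on demand."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        # Seed sessions metadata-only, with empty history.
        self.sessions: Dict[str, Dict[str, Any]] = {
            sid: {"owner": owner, "history": []}
            for sid, owner in conn.execute("SELECT id, owner FROM sessions")
        }

    def get_session(self, session_id: str) -> Dict[str, Any]:
        # Lazy load: fetch this session's messages from the DB.
        rows = self.conn.execute(
            "SELECT content FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        ).fetchall()
        self.sessions[session_id]["history"] = [{"content": c} for (c,) in rows]
        return self.sessions[session_id]


def count_keyword(manager: LazyManager, keyword: str, hydrate: bool) -> int:
    """Count messages containing keyword, optionally hydrating first."""
    total = 0
    for sid, data in manager.sessions.items():
        history = manager.get_session(sid)["history"] if hydrate else data["history"]
        total += sum(keyword in msg["content"] for msg in history)
    return total


def build_db() -> sqlite3.Connection:
    """Create an in-memory DB holding one session with one message."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, owner TEXT)")
    conn.execute(
        "CREATE TABLE messages "
        "(id INTEGER PRIMARY KEY, session_id TEXT, content TEXT)"
    )
    conn.execute("INSERT INTO sessions VALUES ('s1', 'alice')")
    conn.execute(
        "INSERT INTO messages (session_id, content) "
        "VALUES ('s1', 'help with python decorators')"
    )
    return conn
```

On a fresh LazyManager(build_db()), count_keyword(manager, "python",
hydrate=False) reads the empty cached history and finds nothing, while
hydrate=True loads the message from the DB and counts it.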
Tatlatat
2026-06-02 18:44:27 +07:00
committed by GitHub
parent d73c0a13f4
commit 7f97ab3032
2 changed files with 80 additions and 6 deletions

@@ -49,7 +49,15 @@ def analyze_topics(session_manager, owner: str = None) -> Dict[str, Any]:
             if sess_owner != owner:
                 continue
-        for msg in session_data.get("history", []):
+        # Hydrate session to load history from DB if needed
+        if hasattr(session_manager, "get_session"):
+            hydrated_session = session_manager.get_session(session_id)
+            history = hydrated_session.history
+        else:
+            hydrated_session = session_data
+            history = session_data.get("history", [])
+        for msg in history:
             content_raw = msg.get("content") if isinstance(msg, dict) else getattr(msg, "content", None)
             if not content_raw:
                 continue