Route "read that report" to manage_research instead of the HTML render (#1375)

After a deep-research job completes, a follow-up like "check it out" / "read that report" had the agent web_fetch the /api/research/report/{id} HTML render (and then drift into unrelated searches) instead of reading the saved report (issue #1363).

The report text is already available via the manage_research tool (action read), and action list returns ids most-recent-first, so the agent can resolve "the recent report" itself.

Strengthen the manage_research instructions: read a finished report via action list -> action read; do NOT web_fetch/app_api the report URL (it renders HTML, not clean text) and do NOT start a fresh web_search just to read an existing report. Annotate the app_api endpoint list to say the same.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 02:24:09 +08:00
parent b54468291e
commit b6843c7621
2 changed files with 69 additions and 2 deletions
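
The list -> read resolution described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the project's code: `fake_manage_research`, `_FAKE_REPORTS`, and `read_latest_report` are hypothetical names, the stub only mimics the contract visible in this commit (JSON-string args in, a dict with `output` and `exit_code` out), and the row format is simplified.

```python
import asyncio
import json
import re

# Hypothetical in-memory stand-in for the saved reports, most recent first.
_FAKE_REPORTS = [
    {
        "id": "rp-newest",
        "query": "trending blender video ideas",
        "result": "## Findings\nShort-form Geometry Nodes tutorials are trending.",
        "sources": [{"title": "Example", "url": "https://example.com"}],
    },
    {
        "id": "rp-older",
        "query": "older topic",
        "result": "## Findings\nNothing new.",
        "sources": [],
    },
]


async def fake_manage_research(args_json: str) -> dict:
    """Stub that mimics the list/read behaviour described in the tool text."""
    args = json.loads(args_json)
    if args.get("action") == "list":
        rows = [
            f"[{r['query']}](#research-{r['id']}) - {len(r['sources'])} sources"
            for r in _FAKE_REPORTS
        ]
        return {"output": "\n".join(rows), "exit_code": 0}
    if args.get("action") == "read":
        for r in _FAKE_REPORTS:
            if r["id"] == args.get("id"):
                return {"output": f"# {r['query']}\n\n{r['result']}", "exit_code": 0}
        return {"error": "report not found", "exit_code": 1}
    return {"error": "unknown action", "exit_code": 1}


# The id is taken from the anchor in the first (most recent) row.
_ROW_ID = re.compile(r"\(#research-([^)]+)\)")


async def read_latest_report(tool=fake_manage_research) -> str:
    """Resolve the most recent report id via `list`, then fetch it via `read`."""
    listing = await tool(json.dumps({"action": "list"}))
    match = _ROW_ID.search(listing.get("output", ""))
    if match is None:
        raise LookupError("no saved research reports")
    report = await tool(json.dumps({"action": "read", "id": match.group(1)}))
    if report.get("exit_code") != 0:
        raise LookupError(report.get("error", "read failed"))
    return report["output"]


if __name__ == "__main__":
    print(asyncio.run(read_latest_report()))
```

The point of the two-step shape is that the agent never needs the report URL: the id comes from the listing and the text comes from `read`, so no HTML is fetched or parsed.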
--- a/src/agent_loop.py
+++ b/src/agent_loop.py
@@ -276,7 +276,7 @@ Generate an image. Line 1 = description, line 2 = model name, line 3 = WxH (e.g.
    "manage_webhooks": "- ```manage_webhooks``` — Configure outgoing webhooks (HTTP notifications on events like chat completion). Args (JSON): {\"action\": \"list|add|delete|enable|disable\", ...}",
    "manage_tokens": "- ```manage_tokens``` — Generate or revoke API access tokens for external integrations. Args (JSON): {\"action\": \"list|create|delete\", ...}",
    "manage_documents": "- ```manage_documents``` — List, read/open, delete, or tidy documents in the editor panel. Args (JSON): {\"action\": \"list|read|delete|tidy\", ...}. `list` returns rows like `[Title](#document-<id>) — lang, size, updated 5m ago` sorted MOST-RECENT FIRST; the user clicks the anchor to open. `read` (aliases: view/open/get) takes `document_id` and returns the content. When the user asks \"open/show/read my notes\" or \"what documents do I have\", use this — do NOT shell out, do NOT curl.",
-    "manage_research": "- ```manage_research``` — List, read/open, or delete saved DEEP RESEARCH results from the Library. Args (JSON): {\"action\": \"list|read|delete\", \"id\": \"<id>\", \"search\": \"...\"}. `list` returns rows like `[query](#research-<id>) — N sources` MOST-RECENT FIRST; the user clicks to open. `read` (aliases: open/view/get) takes `id` and returns the report + sources. Use when the user says \"open/read/find/delete my research\" or \"that report\". To START new research, use trigger_research instead.",
+    "manage_research": "- ```manage_research``` — List, read/open, or delete saved DEEP RESEARCH results from the Library. Args (JSON): {\"action\": \"list|read|delete\", \"id\": \"<id>\", \"search\": \"...\"}. `list` returns rows like `[query](#research-<id>) — N sources` MOST-RECENT FIRST; the user clicks to open. `read` (aliases: open/view/get) takes `id` and returns the report text + sources. Use when the user says \"open/read/find/delete my research\" or \"that report\". This IS how you read a finished report: when the user refers to a just-completed deep-research job (\"check it out\", \"read that report\", \"summarize the research\") WITHOUT giving an id, call `manage_research` with `action:list` to get the most-recent id, then `action:read` with that id, and answer from the returned text. Do NOT `web_fetch`/`app_api` the `/api/research/report/{id}` URL — that endpoint renders HTML for the browser, not clean text — and do NOT start a fresh `web_search`/`trigger_research` just to read an existing report. To START new research, use trigger_research instead.",
    "manage_settings": "- ```manage_settings``` — View/change the REAL app settings (same ones the Settings panel writes) AND turn tools on/off. Change a setting: `{\"action\":\"set\",\"key\":\"...\",\"value\":\"...\"}` — keys accept friendly aliases, e.g. voice→tts_voice, \"search engine\"→search_provider, \"default model\"→default_model, \"teacher model\"→teacher_model, \"task/background model\"→task_model, \"image quality\"→image_quality, \"reminder channel\"→reminder_channel (browser|email|ntfy), \"agent timeout\"/\"max tool calls\"/\"token budget\". Read: `{\"action\":\"get\",\"key\":\"...\"}`; see all: `{\"action\":\"list\"}`; reset one: `{\"action\":\"reset\",\"key\":\"...\"}`. Use this when the user asks to change ANY preference instead of making them open Settings. Secrets/API keys are read-only (tell them to set those in the panel). Tool toggles: `{\"action\":\"disable_tool|enable_tool\",\"tool\":\"shell\"}` (aliases: shell/search/browser/documents/memory/skills/images/tasks/notes/calendar/email), list disabled: `{\"action\":\"list_tools\"}`.",
    "manage_notes": """\
 ```manage_notes
@@ -354,7 +354,7 @@ GENERIC LOOPBACK to ANY Odysseus internal endpoint. Use this whenever the user w
 - Sessions: `/api/sessions`, `/api/session/{id}`, `/api/session/{id}/truncate`
 - Themes: `/api/prefs/themes`, `/api/prefs/custom-themes`
 - Settings: `/api/settings`, `/api/prefs/{key}`
-- Research: `/api/research/start`, `/api/research/tasks`, `/api/research/report/{id}`
+- Research: `/api/research/start`, `/api/research/tasks` (note: `/api/research/report/{id}` renders HTML — to READ a report's text use the `manage_research` tool with `action:read`, not this endpoint)
 - Compare: `/api/compare/sessions`, `/api/compare/start`
 - Email: use named email tools (`list_email_accounts`, `list_emails`, `read_email`, `send_email`, `reply_to_email`). Do NOT use `/api/email/accounts`; it is owner-filtered in tool context and may falsely return empty.
 - Endpoints (model providers): `/api/endpoints`, `/api/endpoints/{id}`
--- a/tests/test_research_report_read.py
+++ b/tests/test_research_report_read.py
@@ -0,0 +1,67 @@
 """Regression tests for issue #1363 — after a deep-research job finishes, asking
 the agent to "check it out / read that report" had it web_fetch the HTML report
 render (and drift into unrelated searches) instead of reading the saved report.
 Per the maintainer's diagnosis the fix is in the agent/tool-routing path: a
 finished report should be read via `manage_research` (action read), resolving the
 most-recent id with `action list` when none is given — not by fetching the
 `/api/research/report/{id}` HTML.
 These tests pin both halves:
 1. the read path the agent is told to use actually returns the report text for a
   saved `rp-...` id, and
 2. the agent instructions steer to `manage_research read` and away from
   web_fetching the HTML report.
 """
 import json
 from pathlib import Path
 import pytest
 from src.tool_implementations import do_manage_research
 from src.agent_loop import TOOL_SECTIONS
 _DATA_DIR = Path("data/deep_research")
 @pytest.fixture
 def saved_report():
    _DATA_DIR.mkdir(parents=True, exist_ok=True)
    rid = "rp-testreport1363"
    path = _DATA_DIR / f"{rid}.json"
    path.write_text(json.dumps({
        "query": "trending blender video ideas",
        "result": "## Findings\nShort-form Geometry Nodes tutorials are trending.",
        "sources": [{"title": "Example", "url": "https://example.com"}],
        "completed_at": 123,
    }), encoding="utf-8")
    try:
        yield rid
    finally:
        path.unlink(missing_ok=True)
 async def test_manage_research_read_returns_report_text(saved_report):
    res = await do_manage_research(json.dumps({"action": "read", "id": saved_report}))
    out = res.get("output", "")
    # The agent must get the actual report body (not HTML, not an error).
    assert "Geometry Nodes tutorials are trending" in out
    assert "trending blender video ideas" in out
    assert res.get("exit_code") == 0
 async def test_panel_launched_rp_id_is_valid_for_read(saved_report):
    # rp-* ids (panel-launched research) contain a hyphen; the read path's id
    # guard must accept them, not reject them as invalid.
    res = await do_manage_research(json.dumps({"action": "read", "id": saved_report}))
    assert "error" not in res, res
 def test_instructions_route_report_reads_to_manage_research():
    desc = TOOL_SECTIONS["manage_research"]
    # Steers to the read tool for a finished report...
    assert "that report" in desc.lower()
    assert "action:list" in desc or "action: list" in desc
    # ...and explicitly away from fetching the HTML report endpoint.
    assert "/api/research/report/" in desc
    assert "web_fetch" in desc.lower() or "app_api" in desc.lower()
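
The second test above depends on the read path accepting hyphenated `rp-...` ids. The actual guard inside `do_manage_research` is not shown in this diff, so the following is only a sketch of the kind of check that test exercises, under the assumption that the id ends up in a file name such as `data/deep_research/<id>.json` (as in the fixture). The name `is_valid_research_id`, the allowed character set, and the 64-character limit are all hypothetical.

```python
import re

# Hypothetical allow-list: letters, digits, underscore, and hyphen only.
# Hyphens must be allowed so that panel-launched "rp-..." ids pass, while
# path separators and dots are rejected to keep the id safe as a file name.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_research_id(rid: str) -> bool:
    """Return True if `rid` is non-empty and contains only allow-listed characters."""
    return bool(_SAFE_ID.fullmatch(rid))
```

With a guard of this shape, `is_valid_research_id("rp-testreport1363")` is accepted, while an id such as `"../secrets"` is rejected before it can be joined onto the data directory.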