fix(agent): extract web search sources from output key

tool_execution.py returns web search results as {"output": ..., "exit_code": 0}.
The sources-extraction block in stream_agent_loop only checked result.get("results")
and result.get("stdout"), so _src_text was always "" for every tool-call-mode web
search. Two consequences:

1. The SOURCES marker was never parsed and the web_sources SSE event was never
   emitted -- the sources panel never appeared after agent-mode searches.
2. The marker (a large JSON blob) was left in result["output"] and forwarded
   verbatim to the LLM in round 2 via format_tool_result, confusing some local
   models into producing no tokens.

Fix: prepend result.get("output") to the lookup chain, and update the cleanup
assignment so result["output"] is overwritten with the stripped text.
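
The fixed flow can be sketched as a standalone helper. This is a minimal illustration under stated assumptions, not the code in stream_agent_loop: the name split_sources is hypothetical, and the marker is assumed to sit at the tail of the text as "<!-- SOURCES:" followed by a JSON payload and a closing "-->".

```python
import json

SRC_MARKER = "<!-- SOURCES:"  # marker prefix, as it appears in the diff
SRC_END = "-->"               # assumed closing delimiter of the marker


def split_sources(result):
    """Hypothetical helper: return (sources, result) with the marker stripped.

    Mirrors the fixed lookup order: "output" first, then the legacy
    "results" and "stdout" keys.
    """
    text = result.get("output") or result.get("results") or result.get("stdout") or ""
    idx = text.find(SRC_MARKER)
    if idx == -1:
        return None, result  # no marker: leave the result untouched

    # Assumption: the marker is the last thing in the text, so the final
    # "-->" closes it. Fall back to end-of-text if the close is missing.
    end = text.rfind(SRC_END)
    if end <= idx:
        end = len(text)
    payload = text[idx + len(SRC_MARKER):end].strip()
    try:
        sources = json.loads(payload)
    except json.JSONDecodeError:
        sources = None  # malformed payload: still strip the marker below

    # Overwrite the first key that is present, "output" taking priority.
    clean = text[:idx].rstrip()
    for key in ("output", "results", "stdout"):
        if key in result:
            result[key] = clean
            break
    return sources, result
```

With a tool-call-mode result such as {"output": "...<!-- SOURCES:[...]-->", "exit_code": 0}, the helper returns the parsed sources and leaves only the text before the marker in result["output"], which is what would then be forwarded to the LLM.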

Adds six regression tests in tests/test_agent_loop.py documenting the before/after
behaviour and verifying backward compat with the legacy results/stdout paths.
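
The before/after behaviour of the lookup chain can be illustrated with a short pytest-style sketch. The function and test names below are hypothetical and are not the tests added in tests/test_agent_loop.py; they only contrast the old and new lookup expressions on the three result shapes mentioned above.

```python
def old_lookup(result):
    # Lookup chain before the fix: the "output" key is never consulted.
    return result.get("results") or result.get("stdout") or ""


def new_lookup(result):
    # Lookup chain after the fix: "output" is checked first.
    return result.get("output") or result.get("results") or result.get("stdout") or ""


def test_output_key_was_missed_before_fix():
    result = {"output": "hits <!-- SOURCES:[]-->", "exit_code": 0}
    assert old_lookup(result) == ""
    assert new_lookup(result) == "hits <!-- SOURCES:[]-->"


def test_legacy_keys_still_work():
    # Results without an "output" key behave the same before and after.
    for key in ("results", "stdout"):
        result = {key: "legacy text"}
        assert old_lookup(result) == "legacy text"
        assert new_lookup(result) == "legacy text"
```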

Co-authored-by: MohammadYusif <MohammadYusif@users.noreply.github.com>
MohammadYusif
2026-06-02 07:06:09 +03:00
committed by GitHub
parent d46c406bd8
commit 65b5d65059
2 changed files with 87 additions and 3 deletions


@@ -2001,8 +2001,11 @@ async def stream_agent_loop(
 )
 desc, result = await _tool_task
-# Extract structured web sources from web_search tool output
-_src_text = result.get("results") or result.get("stdout") or ""
+# Extract structured web sources from web_search tool output.
+# web_search returns {"output": ..., "exit_code": 0}; check "output"
+# first so the <!-- SOURCES:…--> marker is found and stripped even
+# when the result doesn't carry a "results" or "stdout" key.
+_src_text = result.get("output") or result.get("results") or result.get("stdout") or ""
 if block.tool_type == "web_search" and _src_text:
     _src_marker = "<!-- SOURCES:"
     _src_idx = _src_text.find(_src_marker)
@@ -2014,7 +2017,9 @@ async def stream_agent_loop(
 yield f'data: {json.dumps({"type": "web_sources", "data": _extracted_sources})}\n\n'
 # Strip the marker from the result so it doesn't show in chat
 _clean = _src_text[:_src_idx].rstrip()
-if "results" in result:
+if "output" in result:
+    result["output"] = _clean
+elif "results" in result:
     result["results"] = _clean
 elif "stdout" in result:
     result["stdout"] = _clean