feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999)

* feat: round-limit handling — Continue affordance at the cap + configurable cap

When the agent loop runs out of rounds (per-message step cap, default 20)
while still actively using tools, it stopped silently mid-task. Now:

1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows
   a "Continue" pill at the bottom of the chat that resumes the task from where
   it left off. Repeated cap-hits each get a fresh Continue (multiple continues
   in a row).
2. The cap is configurable in Settings → Agent ("Max steps per message"),
   validated on the client, at the save endpoint, and at the read site.

- src/agent_loop.py: track `_exhausted_rounds` (set only when a full
  tool-executing round completes on the last allowed round — i.e. the agent
  wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}` (logged).
- routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as
  `max_rounds`; forward the new event through the SSE relay.
- routes/auth_routes.py: validate numeric settings on save (int + clamp;
  agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
- src/settings.py: default `agent_max_rounds = 20`.
- static/: Settings input + client-side clamp; the Continue pill (reuses the
  existing .stopped-indicator / .continue-btn classes and theme vars
  --border/--fg/--bg/--accent); appended to the chat container so it survives
  the message re-render at stream finalize. chat.js cache version bumped.

* test: cover rounds_exhausted emission (cap-hit vs normal finish)

Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings:
a tool block every round exhausts the cap and must emit rounds_exhausted; a
plain answer hits the done-break and must not. Guards the for/else logic.
This commit is contained in:
Kenny Van de Maele
2026-06-04 22:36:05 +02:00
committed by GitHub
parent a54f41037d
commit 64d65b73c1
9 changed files with 215 additions and 14 deletions

View File

@@ -0,0 +1,70 @@
"""Regression: stream_agent_loop emits `rounds_exhausted` only when the round
cap is hit while still working, and NOT on a normal finish.
The decision is a `for/else` in the loop: the `else` runs only if no `break`
fired (break = done / budget / error). A refactor that adds a stray break or
return, or moves the done-break, could silently flip this. See PR #1999 / #1997.
"""
import asyncio
import json
import src.agent_loop as al
def _collect(gen):
async def _run():
return [c async for c in gen]
return asyncio.run(_run())
def _types(chunks):
out = []
for c in chunks:
if c.startswith("data: ") and not c.startswith("data: [DONE]"):
try:
out.append(json.loads(c[6:]))
except Exception:
pass
return out
def _patch_common(monkeypatch):
# Skip RAG/tool-index, MCP, and settings lookups; keep the real loop body,
# _resolve_tool_blocks, and parse_tool_blocks.
monkeypatch.setattr(al, "get_setting", lambda key, default=None: default, raising=False)
monkeypatch.setattr(al, "get_mcp_manager", lambda: None, raising=False)
monkeypatch.setattr(al, "estimate_tokens", lambda *a, **k: 10, raising=False)
async def _fake_exec(block, *a, **k):
return ("bash", {"output": "ok", "exit_code": 0})
monkeypatch.setattr(al, "execute_tool_block", _fake_exec, raising=False)
def _run_loop(monkeypatch, round_text, max_rounds=2):
async def _fake_stream(_candidates, messages, **kwargs):
yield f'data: {json.dumps({"delta": round_text})}\n\n'
yield "data: [DONE]\n\n"
monkeypatch.setattr(al, "stream_llm_with_fallback", _fake_stream, raising=False)
gen = al.stream_agent_loop(
"http://x/v1", "m",
[{"role": "user", "content": "do a long multi-step task"}],
max_rounds=max_rounds,
relevant_tools={"bash"},
)
return _types(_collect(gen))
def test_emits_rounds_exhausted_when_cap_hit_mid_task(monkeypatch):
_patch_common(monkeypatch)
# Every round returns a tool block -> never "done" -> loop exhausts the cap.
events = _run_loop(monkeypatch, "```bash\necho hi\n```", max_rounds=2)
assert any(e.get("type") == "rounds_exhausted" for e in events), events
def test_no_rounds_exhausted_on_normal_finish(monkeypatch):
_patch_common(monkeypatch)
# A plain answer (no tool block) -> done-break on round 1 -> no event.
events = _run_loop(monkeypatch, "All done, here is your answer.", max_rounds=2)
assert not any(e.get("type") == "rounds_exhausted" for e in events), events