feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999)

* feat: round-limit handling — Continue affordance at the cap + configurable cap

When the agent loop ran out of rounds (per-message step cap, default 20)
while still actively using tools, it stopped silently mid-task. Now:

1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows
   a "Continue" pill at the bottom of the chat that resumes the task from where
   it left off. Each repeated cap hit gets a fresh Continue, so several
   continues in a row work.
2. The cap is configurable in Settings → Agent ("Max steps per message"),
   validated on the client, at the save endpoint, and at the read site.
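
As a rough illustration of item 1, the frame emitted at the cap and the check a
consumer might run on it can be sketched as below. `format_sse` and
`parse_sse_frame` are hypothetical helpers, not names from this codebase; only
the payload shape `{"type":"rounds_exhausted","rounds":N}` comes from this
commit.

```python
import json
from typing import Optional


def format_sse(payload: dict) -> str:
    """Serialise a payload as one SSE data frame (hypothetical helper)."""
    return f"data: {json.dumps(payload)}\n\n"


def parse_sse_frame(frame: str) -> Optional[dict]:
    """Decode the JSON payload of a single-line `data:` frame, else None."""
    line = frame.strip()
    if not line.startswith("data:"):
        return None
    return json.loads(line[len("data:"):].strip())


# Emit the event as the loop would at the cap, then decide on the consumer
# side whether a Continue affordance should be shown.
frame = format_sse({"type": "rounds_exhausted", "rounds": 20})
event = parse_sse_frame(frame)
show_continue = event is not None and event.get("type") == "rounds_exhausted"
```

The real client is JavaScript in static/; the Python consumer here only shows
the frame format and the type check.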

- src/agent_loop.py: track `_exhausted_rounds` (set in the loop's `for`/`else`,
  i.e. only when every allowed round completes without an early `break`, so
  the agent wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}`
  (logged).
- routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as
  `max_rounds`; forward the new event through the SSE relay.
- routes/auth_routes.py: validate numeric settings on save (int + clamp;
  agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
- src/settings.py: default `agent_max_rounds = 20`.
- static/: Settings input + client-side clamp; the Continue pill (reuses the
  existing .stopped-indicator / .continue-btn classes and theme vars
  --border/--fg/--bg/--accent); appended to the chat container so it survives
  the message re-render at stream finalize. chat.js cache version bumped.
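
The int + clamp validation listed above could look roughly like this. It is a
minimal sketch, not the code in routes/auth_routes.py: `NUMERIC_LIMITS` and
`clamp_setting` are hypothetical names, and rejecting booleans and floats is an
assumption about what "non-int" means. Only the two ranges come from this
commit.

```python
# Hypothetical bounds table; the ranges are the ones listed in this commit.
NUMERIC_LIMITS = {
    "agent_max_rounds": (1, 200),
    "agent_max_tool_calls": (0, 1000),
}


def clamp_setting(key: str, raw: object) -> int:
    """Coerce `raw` to int and clamp it into the allowed range for `key`.

    Raises ValueError on non-integer input; a save endpoint would be
    expected to translate that into an HTTP 400.
    """
    low, high = NUMERIC_LIMITS[key]
    # Assumption: only ints and integer-looking strings are accepted.
    if isinstance(raw, bool) or not isinstance(raw, (int, str)):
        raise ValueError(f"{key} must be an integer")
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{key} must be an integer") from None
    return max(low, min(high, value))
```

Calling the same helper at the read site as well as on save would keep a
hand-edited or legacy stored value from bypassing the bounds.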

* test: cover rounds_exhausted emission (cap-hit vs normal finish)

Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings:
a tool block every round exhausts the cap and must emit rounds_exhausted; a
plain answer hits the done-break and must not. Guards the for/else logic.
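
The for/else behaviour these tests guard can be reproduced in a toy,
synchronous stand-in for the loop. `run_rounds` and `wants_tool` are
hypothetical and far simpler than stream_agent_loop; the sketch only shows
that the `else` branch runs when no `break` fired.

```python
import json
from typing import Callable, Iterator


def run_rounds(max_rounds: int, wants_tool: Callable[[int], bool]) -> Iterator[str]:
    """Toy stand-in for the agent loop (hypothetical, not the real code)."""
    exhausted = False
    for round_num in range(1, max_rounds + 1):
        if not wants_tool(round_num):
            break  # plain answer: the "done" break, so the else is skipped
        yield f"tool_round:{round_num}"
    else:
        # Reached only when every round ran without a break.
        exhausted = True
    if exhausted:
        payload = {"type": "rounds_exhausted", "rounds": max_rounds}
        yield f"data: {json.dumps(payload)}\n\n"


# A tool call every round exhausts the cap; a plain answer in round 2 does not.
always_tools = list(run_rounds(3, lambda n: True))
plain_answer = list(run_rounds(3, lambda n: n < 2))
```

Here `always_tools` ends with the `rounds_exhausted` frame, while
`plain_answer` contains only the first tool round.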
Kenny Van de Maele
2026-06-04 22:36:05 +02:00
committed by GitHub
parent a54f41037d
commit 64d65b73c1
9 changed files with 215 additions and 14 deletions

@@ -1643,6 +1643,11 @@ async def stream_agent_loop(
    _doc_opened = False  # whether doc_stream_open was sent
    _doc_last_len = 0  # last content length sent
    # Set when the loop runs out of rounds while the agent was still actively
    # using tools — i.e. it was cut off, not finished. Drives a "Continue" event
    # so the user can resume instead of the turn silently stalling.
    _exhausted_rounds = False
    for round_num in range(1, max_rounds + 1):
        round_response = ""
        round_reasoning = ""  # reasoning_content deltas (DeepSeek-thinking, vLLM --reasoning-parser)
@@ -2300,6 +2305,20 @@ async def stream_agent_loop(
        # Separator in accumulated response
        full_response += "\n\n"
    else:
        # The for-loop completed every allowed round WITHOUT an early `break`
        # (a `break` fires on "done", budget, or error). Reaching this `else`
        # means the agent kept working until it ran out of rounds — so offer
        # Continue instead of stopping silently. This catches ALL exhaustion
        # paths, including a verifier `continue` on the final round (the old
        # bottom-of-loop flag missed those).
        _exhausted_rounds = True

    # If the loop hit the round cap while still working, tell the client so it
    # can show a "Continue" affordance instead of the turn just stopping.
    if _exhausted_rounds:
        logger.info("[agent] round cap (%d) reached mid-task — emitting rounds_exhausted", max_rounds)
        yield f'data: {json.dumps({"type": "rounds_exhausted", "rounds": max_rounds})}\n\n'

    # If the response is completely empty and no tools were executed,
    # yield a fallback message so the user is not left hanging.