feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999)

* feat: round-limit handling — Continue affordance at the cap + configurable cap

  When the agent loop runs out of rounds (per-message step cap, default 20)
  while still actively using tools, it stopped silently mid-task. Now:

  1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI
     shows a "Continue" pill at the bottom of the chat that resumes the task
     from where it left off. Repeated cap-hits each get a fresh Continue
     (multiple continues in a row).
  2. The cap is configurable in Settings → Agent ("Max steps per message"),
     validated on the client, at the save endpoint, and at the read site.

  - src/agent_loop.py: track `_exhausted_rounds` (set only when a full
    tool-executing round completes on the last allowed round, i.e. the agent
    wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}`
    (logged).
  - routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as
    `max_rounds`; forward the new event through the SSE relay.
  - routes/auth_routes.py: validate numeric settings on save (int + clamp;
    agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
  - src/settings.py: default `agent_max_rounds = 20`.
  - static/: Settings input + client-side clamp; the Continue pill (reuses the
    existing .stopped-indicator / .continue-btn classes and theme vars
    --border/--fg/--bg/--accent); appended to the chat container so it
    survives the message re-render at stream finalize. chat.js cache version
    bumped.

* test: cover rounds_exhausted emission (cap-hit vs normal finish)

  Drives the real stream_agent_loop with mocked LLM stream / tool exec /
  settings: a tool block every round exhausts the cap and must emit
  rounds_exhausted; a plain answer hits the done-break and must not. Guards
  the for/else logic.
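
The for/else behaviour that the test guards can be illustrated in isolation. The following is a minimal sketch, not the real `stream_agent_loop`: `run_rounds`, its `finishes_on` parameter, and the `"done"` event type are hypothetical stand-ins. Only the `rounds_exhausted` payload shape is taken from the commit message.

```python
import json
from typing import Iterator, Optional


def run_rounds(max_rounds: int, finishes_on: Optional[int]) -> Iterator[str]:
    """Toy stand-in for the agent loop (hypothetical, for illustration only)."""
    for round_num in range(1, max_rounds + 1):
        if finishes_on is not None and round_num == finishes_on:
            # Hypothetical "done" event: the agent answered without tools.
            yield f'data: {json.dumps({"type": "done"})}\n\n'
            break  # a break skips the else clause below
        # Otherwise pretend a tool ran and go around again.
    else:
        # Reached only when no break fired, i.e. every allowed round was used.
        payload = {"type": "rounds_exhausted", "rounds": max_rounds}
        yield f'data: {json.dumps(payload)}\n\n'


capped = list(run_rounds(max_rounds=3, finishes_on=None))
finished = list(run_rounds(max_rounds=3, finishes_on=2))
```

Here `capped` holds a single `rounds_exhausted` event, while `finished` holds only the hypothetical `done` event, mirroring the cap-hit and normal-finish cases described above.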
2026-06-04 22:36:05 +02:00
parent a54f41037d
commit 64d65b73c1
9 changed files with 215 additions and 14 deletions
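
The save-time validation mentioned in the commit message ("int + clamp ... 400 on non-int") might look roughly like the sketch below. The helper name `clamp_int_setting` and the `LIMITS` table are assumptions for illustration; only the setting names and ranges come from the commit message. How the real `routes/auth_routes.py` is structured is not shown in this excerpt.

```python
# Ranges as stated in the commit message; the dict itself is hypothetical.
LIMITS = {
    "agent_max_rounds": (1, 200),
    "agent_max_tool_calls": (0, 1000),
}


def clamp_int_setting(name: str, raw: object) -> int:
    """Parse raw as an integer and clamp it into the range for `name`.

    Raises ValueError for anything that is not an int or an integer string;
    a route handler would presumably translate that into an HTTP 400.
    """
    lo, hi = LIMITS[name]
    # bool is a subclass of int, and floats would be silently truncated,
    # so this sketch rejects both explicitly.
    if isinstance(raw, (bool, float)):
        raise ValueError(f"{name} must be an integer")
    try:
        value = int(raw)  # accepts ints and strings such as "25"
    except (TypeError, ValueError):
        raise ValueError(f"{name} must be an integer") from None
    return max(lo, min(hi, value))
```

With these assumed limits, `clamp_int_setting("agent_max_rounds", "500")` is clamped to the upper bound and `clamp_int_setting("agent_max_rounds", 0)` to the lower bound, while a value such as `"abc"` raises `ValueError`.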
--- a/src/agent_loop.py
+++ b/src/agent_loop.py
@@ -1643,6 +1643,11 @@ async def stream_agent_loop(
    _doc_opened = False    # whether doc_stream_open was sent
    _doc_last_len = 0      # last content length sent

+    # Set when the loop runs out of rounds while the agent was still actively
+    # using tools — i.e. it was cut off, not finished. Drives a "Continue" event
+    # so the user can resume instead of the turn silently stalling.
+    _exhausted_rounds = False
+
    for round_num in range(1, max_rounds + 1):
        round_response = ""
        round_reasoning = ""  # reasoning_content deltas (DeepSeek-thinking, vLLM --reasoning-parser)
@@ -2300,6 +2305,20 @@ async def stream_agent_loop(

        # Separator in accumulated response
        full_response += "\n\n"
+    else:
+        # The for-loop completed every allowed round WITHOUT an early `break`
+        # (a `break` fires on "done", budget, or error). Reaching this `else`
+        # means the agent kept working until it ran out of rounds — so offer
+        # Continue instead of stopping silently. This catches ALL exhaustion
+        # paths, including a verifier `continue` on the final round (the old
+        # bottom-of-loop flag missed those).
+        _exhausted_rounds = True
+
+    # If the loop hit the round cap while still working, tell the client so it
+    # can show a "Continue" affordance instead of the turn just stopping.
+    if _exhausted_rounds:
+        logger.info("[agent] round cap (%d) reached mid-task — emitting rounds_exhausted", max_rounds)
+        yield f'data: {json.dumps({"type": "rounds_exhausted", "rounds": max_rounds})}\n\n'

    # If the response is completely empty and no tools were executed,
    # yield a fallback message so the user is not left hanging.