feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999)
* feat: round-limit handling — Continue affordance at the cap + configurable cap
When the agent loop ran out of rounds (per-message step cap, default 20)
while still actively using tools, it stopped silently mid-task. Now:
1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows
a "Continue" pill at the bottom of the chat that resumes the task from where
it left off. Repeated cap-hits each get a fresh Continue (multiple continues
in a row).
2. The cap is configurable in Settings → Agent ("Max steps per message"),
validated on the client, at the save endpoint, and at the read site.
- src/agent_loop.py: track `_exhausted_rounds` (set only when a full
tool-executing round completes on the last allowed round — i.e. the agent
wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}` (logged).
- routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as
`max_rounds`; forward the new event through the SSE relay.
- routes/auth_routes.py: validate numeric settings on save (int + clamp;
agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
- src/settings.py: default `agent_max_rounds = 20`.
- static/: Settings input + client-side clamp; the Continue pill (reuses the
existing .stopped-indicator / .continue-btn classes and theme vars
--border/--fg/--bg/--accent); appended to the chat container so it survives
the message re-render at stream finalize. chat.js cache version bumped.
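
A minimal sketch of the "int + clamp" validation described for the save endpoint. The bounds are the ones listed above; the helper name `clamp_setting`, the `SETTING_BOUNDS` table, and the choice to reject booleans are assumptions for illustration, not the code in routes/auth_routes.py.

```python
# Hypothetical helper: only the two bounds come from the commit message.
SETTING_BOUNDS = {
    "agent_max_rounds": (1, 200),
    "agent_max_tool_calls": (0, 1000),
}


def clamp_setting(name: str, raw: object) -> int:
    """Parse `raw` as an integer and clamp it into the range allowed for `name`.

    Raises ValueError on non-integer input; a save endpoint could map that
    to an HTTP 400 response.
    """
    # Assumption: booleans are rejected even though bool subclasses int.
    if isinstance(raw, bool):
        raise ValueError(f"{name} must be an integer")
    try:
        value = int(str(raw).strip())
    except ValueError:
        raise ValueError(f"{name} must be an integer") from None
    low, high = SETTING_BOUNDS[name]
    return max(low, min(high, value))
```

Under these assumptions, the read site could call the same helper on the stored value and fall back to the default of 20 when parsing fails.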
* test: cover rounds_exhausted emission (cap-hit vs normal finish)
Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings:
a tool block every round exhausts the cap and must emit rounds_exhausted; a
plain answer hits the done-break and must not. Guards the for/else logic.
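
The for/else behaviour that the test guards can be sketched in isolation. This is a simplified stand-in, not the real `stream_agent_loop`: `run_rounds` and its `wants_tools` callback are hypothetical names, and only the event payload shape is taken from the commit.

```python
import json
from typing import Callable, Iterator


def run_rounds(max_rounds: int, wants_tools: Callable[[int], bool]) -> Iterator[str]:
    """Yield a rounds_exhausted SSE frame only if no round broke out early."""
    exhausted = False
    for round_num in range(1, max_rounds + 1):
        if not wants_tools(round_num):
            break  # plain answer: the agent is done, so the `else` is skipped
    else:
        # Reached only when every allowed round ran without a `break`.
        exhausted = True

    if exhausted:
        payload = json.dumps({"type": "rounds_exhausted", "rounds": max_rounds})
        yield f"data: {payload}\n\n"
```

With a callback that always asks for tools, the generator yields exactly one frame. With one that finishes on or before the last round, it yields nothing.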
committed by GitHub
parent a54f41037d
commit 64d65b73c1
@@ -1643,6 +1643,11 @@ async def stream_agent_loop(
    _doc_opened = False # whether doc_stream_open was sent
    _doc_last_len = 0 # last content length sent

    # Set when the loop runs out of rounds while the agent was still actively
    # using tools — i.e. it was cut off, not finished. Drives a "Continue" event
    # so the user can resume instead of the turn silently stalling.
    _exhausted_rounds = False

    for round_num in range(1, max_rounds + 1):
        round_response = ""
        round_reasoning = "" # reasoning_content deltas (DeepSeek-thinking, vLLM --reasoning-parser)
@@ -2300,6 +2305,20 @@ async def stream_agent_loop(
        # Separator in accumulated response
        full_response += "\n\n"
    else:
        # The for-loop completed every allowed round WITHOUT an early `break`
        # (a `break` fires on "done", budget, or error). Reaching this `else`
        # means the agent kept working until it ran out of rounds — so offer
        # Continue instead of stopping silently. This catches ALL exhaustion
        # paths, including a verifier `continue` on the final round (the old
        # bottom-of-loop flag missed those).
        _exhausted_rounds = True

    # If the loop hit the round cap while still working, tell the client so it
    # can show a "Continue" affordance instead of the turn just stopping.
    if _exhausted_rounds:
        logger.info("[agent] round cap (%d) reached mid-task — emitting rounds_exhausted", max_rounds)
        yield f'data: {json.dumps({"type": "rounds_exhausted", "rounds": max_rounds})}\n\n'

    # If the response is completely empty and no tools were executed,
    # yield a fallback message so the user is not left hanging.
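
For illustration, a consumer of the stream could detect the new event as sketched below. It assumes each frame is a single `data: <json>` line terminated by a blank line, as in the `yield` above, and that every payload is a JSON object. The function names are hypothetical, and the real client in static/ is JavaScript, so this only mirrors the framing.

```python
import json
from typing import Iterator


def iter_sse_events(stream_text: str) -> Iterator[dict]:
    """Decode `data: <json>` frames separated by blank lines."""
    for frame in stream_text.split("\n\n"):
        frame = frame.strip()
        if not frame.startswith("data: "):
            continue  # skip empty chunks and anything that is not a data frame
        yield json.loads(frame[len("data: "):])


def hit_round_cap(stream_text: str) -> bool:
    """Return True if the stream contains a rounds_exhausted event."""
    return any(
        event.get("type") == "rounds_exhausted"
        for event in iter_sse_events(stream_text)
    )
```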