fix(agent): make context-budget hard_max configurable via agent_input_token_hard_max setting (#1273)

Implements the reviewer requirement from the PR #1190 review, which was
carried over to #1230 but not implemented there:

> "The hard max is a function-local constant. For this setting, the ceiling
>  should be configurable or at least represented as a named setting/default
>  with tests."
                                                                — review on #1190

#1230 shipped the adaptive auto-derivation but left `DEFAULT_HARD_MAX = 200_000`
as a hardcoded module constant in src/context_budget.py. Admins on premium
APIs with large context windows (kimi-k2 / minimax-m3 at 1M, etc.) can use
their full window today only by setting `agent_input_token_budget`
explicitly — which then takes them off the adaptive auto-path entirely.

## What this PR changes

- src/settings.py: register `agent_input_token_hard_max` in
  DEFAULT_SETTINGS, default 200_000 (matches `DEFAULT_HARD_MAX`). An inline
  comment documents that the setting is a no-op when
  `agent_input_token_budget` is set explicitly.

- src/agent_loop.py: read the setting at the call site and pass it as the
  `hard_max` kwarg of `compute_input_token_budget`. Parsing is defensive:
  missing, non-int, or zero values fall back to `DEFAULT_HARD_MAX`, so a
  misconfiguration cannot silently zero the budget.

- src/tool_implementations.py: three friendly aliases for `manage_settings`:
  - "hard max" -> agent_input_token_hard_max
  - "token budget cap" -> agent_input_token_hard_max
  - "input budget cap" -> agent_input_token_hard_max
  The existing "token budget" -> agent_input_token_budget mapping also
  gains a shorter alias, "input budget".

- tests/test_context_budget.py: 6 new tests on top of the existing 6:
  - hard_max raises the auto ceiling (1M ctx + raised cap -> 85% of ctx)
  - hard_max lowers the auto ceiling (128K ctx + 50K cap -> 50K)
  - hard_max has no effect on the explicit branch
  - DEFAULT_SETTINGS contains the new key
  - manage_settings aliases are registered
  - the live get_setting path returns the override value, and malformed
    values fall back per the agent_loop defensive parsing

12 passed in 0.04s. No changes to the pure helper signature or semantics;
#1230's behavior is the default when the new setting is unset.
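
The defensive parsing described for src/agent_loop.py can be sketched as follows. This is a minimal illustration, not the code in this PR: the helper name `parse_hard_max` is hypothetical, and rejecting negative and boolean values is an assumption (the description above names only missing, non-int, and zero values).

```python
DEFAULT_HARD_MAX = 200_000  # default ceiling named in this PR


def parse_hard_max(raw, default=DEFAULT_HARD_MAX):
    """Return a usable hard max, falling back to `default` on bad input.

    Hypothetical helper: the real parsing happens at the call site in
    src/agent_loop.py and may differ in detail.
    """
    # bool is a subclass of int; reject it so True does not become 1 (assumption).
    if raw is None or isinstance(raw, bool):
        return default
    try:
        value = int(raw)
    except (TypeError, ValueError, OverflowError):
        return default
    # Zero would silently zero the budget; negatives are assumed invalid too.
    return value if value > 0 else default
```

With this shape, a stored value such as `"900000"` is accepted, while `None`, `"abc"`, and `0` all resolve to `DEFAULT_HARD_MAX`.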

## How it lets users drop the explicit override

Before this PR, on a 1M-context model:
  agent_input_token_budget = 900_000  (explicit)  -> 900K  [user override]
  agent_input_token_budget = <unset>  (auto)      -> 200K  [HARD_MAX]

After this PR, same model:
  agent_input_token_budget = <unset>
  agent_input_token_hard_max = 900_000
                                      -> min(1M * 0.85, 900K) = 850K  [auto, no override needed]

The explicit-override path keeps working unchanged for users who prefer it.
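
The before/after numbers above can be reproduced with a small sketch. Assumptions: the 85% factor is taken from the `1M * 0.85` example, the function name `sketch_input_token_budget` and its parameter names other than `hard_max` are hypothetical, and the real `compute_input_token_budget` in src/context_budget.py may apply further rules (a floor, rounding) that are not shown here.

```python
DEFAULT_HARD_MAX = 200_000
AUTO_PERCENT = 85  # assumed from the "1M * 0.85" example above


def sketch_input_token_budget(context_window, explicit_budget=None,
                              hard_max=DEFAULT_HARD_MAX):
    """Illustrative stand-in for the budget helper, not its real body."""
    # Explicit branch: the user override wins and hard_max is ignored.
    if explicit_budget:
        return explicit_budget
    # Auto branch: a fraction of the context window, capped by hard_max.
    # Integer arithmetic avoids float rounding on large windows.
    return min(context_window * AUTO_PERCENT // 100, hard_max)


# Before this PR: auto path on a 1M-context model is capped at 200K.
before = sketch_input_token_budget(1_000_000)
# After this PR: raising the cap lets the auto path reach 85% of the window.
after = sketch_input_token_budget(1_000_000, hard_max=900_000)
```

Under these assumptions `before` is 200_000 and `after` is 850_000, matching the figures above; passing `explicit_budget=900_000` returns 900_000 regardless of `hard_max`.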

Author: nickorlabs
Date: 2026-06-02 11:36:57 -05:00
Committed by: GitHub
Parent: 3505a5ff27
Commit: c39d8db12a

4 changed files with 96 additions and 2 deletions

									
src/tool_implementations.py (5 lines changed)

@@ -1530,7 +1530,10 @@ async def do_manage_settings(content: str, owner: Optional[str] = None) -> Dict:
             "ntfy topic": "reminder_ntfy_topic",
             "agent tool calls": "agent_max_tool_calls", "max tool calls": "agent_max_tool_calls",
             "agent timeout": "agent_stream_timeout_seconds", "stream timeout": "agent_stream_timeout_seconds",
-            "token budget": "agent_input_token_budget",
+            "token budget": "agent_input_token_budget", "input budget": "agent_input_token_budget",
+            "hard max": "agent_input_token_hard_max",
+            "token budget cap": "agent_input_token_hard_max",
+            "input budget cap": "agent_input_token_hard_max",
         }
         def _resolve(k):
             k2 = (k or "").strip().lower()
