In-Turn Split-Logic (Split-Turn Compression)

applySplitTurn() (src/context/split-turn.ts) handles a specific failure mode: a single agent turn accumulates many tool calls (read → search → edit → run → read again …) and the growing message window overflows the context budget before the turn is finished.

The function activates only when all three conditions are true:

  1. estimateJsonTokens(messages) > maxTokens — the window is over budget
  2. There is at least one tool-call/result pair in the current turn (after the last user message)
  3. The number of pairs in the current turn exceeds SPLIT_TURN_TAIL_SIZE (default: 3)
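The three conditions above can be sketched as a single guard. This is a minimal illustration, not the code in src/context/split-turn.ts: the `Message` shape, `countCurrentTurnPairs`, `shouldSplitTurn`, and the 4-characters-per-token estimator are assumptions made for the example. Only `estimateJsonTokens`, `maxTokens`, and `SPLIT_TURN_TAIL_SIZE` are names taken from the text.

```typescript
// Hypothetical message shape; the real types may differ.
type Role = "system" | "user" | "assistant" | "tool";

interface Message {
  role: Role;
  content: string;
  toolCalls?: { name: string; args: Record<string, string> }[];
}

const SPLIT_TURN_TAIL_SIZE = 3; // default named in the text

// Stand-in estimator (assumption): roughly 4 characters per token.
function estimateJsonTokens(messages: Message[]): number {
  return Math.ceil(JSON.stringify(messages).length / 4);
}

// Counts assistant messages carrying tool calls after the last user message.
function countCurrentTurnPairs(messages: Message[]): number {
  const lastUser = messages.map((m) => m.role).lastIndexOf("user");
  return messages
    .slice(lastUser + 1)
    .filter((m) => m.role === "assistant" && (m.toolCalls?.length ?? 0) > 0)
    .length;
}

function shouldSplitTurn(messages: Message[], maxTokens: number): boolean {
  // Condition 1: the window must be over budget.
  if (estimateJsonTokens(messages) <= maxTokens) return false;
  const pairs = countCurrentTurnPairs(messages);
  // Condition 2: there must be an active tool exchange in the current turn.
  if (pairs === 0) return false;
  // Condition 3: there must be something to compress beyond the tail.
  return pairs > SPLIT_TURN_TAIL_SIZE;
}
```

Ordering the checks from cheapest exit to most specific keeps the common case (window fits in budget) to a single comparison.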
Before:
[system] [history...] [user: "task"] [asst+tools₁] [results₁] ... [asst+toolsₙ] [resultsₙ]
After:
[system] [history trimmed if still over budget...]
[user: "task"]
[user: "[Turn prefix compressed — N earlier tool call(s)]
## In-Turn Prefix
- fs.read(path=src/foo.ts) → completed
- search.rg(pattern=foo) → completed
..."]
[asst+toolsₙ₋₂] [resultsₙ₋₂] ← tail: last 3 pairs kept raw
[asst+toolsₙ₋₁] [resultsₙ₋₁]
[asst+toolsₙ] [resultsₙ]

The prefix (all but the last 3 pairs) is replaced by a compact summary listing tool name + key arg + status. The tail (last 3 pairs) is kept verbatim — these are the most recent actions and are likely still needed for the next model step.
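The prefix/tail split and the summary format can be sketched as follows. The `ToolPair` shape and the helpers `summarizePair` and `splitTurn` are hypothetical and exist only for this example; the real function works on raw messages. Argument handling follows one reading of the text (up to two arguments, each value cut to 40 characters), which is an assumption.

```typescript
// Hypothetical, simplified view of one tool call and its result.
interface ToolPair {
  tool: string; // e.g. "fs.read"
  args: Record<string, string>;
  status: "completed" | "failed";
}

const SPLIT_TURN_TAIL_SIZE = 3; // default named in the text
const MAX_ARG_CHARS = 40; // assumed to apply to each argument value

// One summary line per pair: tool name, up to two args, status.
function summarizePair(p: ToolPair): string {
  const args = Object.entries(p.args)
    .slice(0, 2)
    .map(([key, value]) => `${key}=${value.slice(0, MAX_ARG_CHARS)}`)
    .join(", ");
  return `- ${p.tool}(${args}) → ${p.status}`;
}

// Splits the current turn into a text summary of the prefix and a raw tail.
function splitTurn(pairs: ToolPair[]): { summary: string | null; tail: ToolPair[] } {
  // Nothing to compress when the turn is no longer than the tail.
  if (pairs.length <= SPLIT_TURN_TAIL_SIZE) return { summary: null, tail: pairs };

  const prefix = pairs.slice(0, pairs.length - SPLIT_TURN_TAIL_SIZE);
  const tail = pairs.slice(-SPLIT_TURN_TAIL_SIZE);
  const summary = [
    // \u2014 reproduces the dash used in the header shown in the diagram.
    `[Turn prefix compressed \u2014 ${prefix.length} earlier tool call(s)]`,
    "## In-Turn Prefix",
    ...prefix.map(summarizePair),
  ].join("\n");
  return { summary, tail };
}
```

In this sketch the caller would wrap `summary` in a user message and place it between the task message and the tail, matching the "After" layout.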

If the window is still over budget after the split, old conversation history (before the current user message) is dropped from the front, one message at a time, until the budget is met.
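The fallback trimming step might look like the loop below. It is a sketch under stated assumptions: `trimHistory`, the `Message` shape, and the 4-characters-per-token estimator are invented for illustration, and keeping a leading system message is inferred from the "After" diagram rather than stated in the text.

```typescript
// Hypothetical message shape; the real types may differ.
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Stand-in estimator (assumption): roughly 4 characters per token.
function estimateJsonTokens(messages: Message[]): number {
  return Math.ceil(JSON.stringify(messages).length / 4);
}

// Drops the oldest history messages, one at a time, until the window fits
// or no history remains before the current user message.
function trimHistory(
  messages: Message[],
  currentUserIndex: number,
  maxTokens: number,
): Message[] {
  const out = [...messages]; // never mutate the caller's array
  // Keep a leading system message if there is one (assumption).
  const firstDroppable = out[0]?.role === "system" ? 1 : 0;
  let userIdx = currentUserIndex;

  while (estimateJsonTokens(out) > maxTokens && userIdx > firstDroppable) {
    out.splice(firstDroppable, 1); // remove the oldest history message
    userIdx -= 1; // the task message shifts left by one
  }
  return out;
}
```

The loop stops at the current user message, so the task and everything after it survive even if the budget is still exceeded.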

Preserved:

  • All file:line references from the prefix are carried into the summary (tool name + first two args truncated to 40 chars + status)
  • The 3 most recent tool pairs are untouched: full JSON, including error output
  • The user’s original task message is always kept

Returned unchanged (no-op) when:

  • Messages fit in budget
  • No active tool exchange detected
  • Pairs ≤ tail size (nothing to compress)