
Request Usage Meter

Usage is tracked at two levels: a persistent log (usage.jsonl) and an in-session display.

Display modes

The display mode is set via /usage in the TUI (which opens an interactive menu) and persisted in runtime-preferences.json under usageDetailMode:

Mode	What is shown after each response
`off` (default)	Nothing
`compact`	One line: `↑1234 ↓567 ~$0.0012` (input / output tokens / cost)
`verbose`	Full block: model, route, context%, reasoning tokens, cache hit/write, cost

The legacy boolean showUsageDetail is migrated automatically on first load: true becomes compact.
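The migration rule can be sketched as a small pure function. This is only an illustration under stated assumptions: the RuntimePreferences shape and the name migrateUsagePreferences are hypothetical, and the fallback to off for any legacy value other than true is assumed from the documented default, not confirmed.

```typescript
// Hypothetical shape of runtime-preferences.json. Only usageDetailMode and
// showUsageDetail are named in the docs; everything else is passed through.
type UsageDetailMode = "off" | "compact" | "verbose";

interface RuntimePreferences {
  usageDetailMode?: UsageDetailMode;
  showUsageDetail?: boolean; // legacy flag
  [key: string]: unknown;
}

function migrateUsagePreferences(prefs: RuntimePreferences): RuntimePreferences {
  // Drop the legacy key and keep every other preference untouched.
  const { showUsageDetail, ...rest } = prefs;

  // Assumption: an explicit usageDetailMode wins over the legacy flag.
  if (rest.usageDetailMode !== undefined) {
    return rest;
  }

  // Documented: true -> "compact".
  // Assumption: any other legacy value falls back to the default "off".
  return { ...rest, usageDetailMode: showUsageDetail === true ? "compact" : "off" };
}
```

For example, migrateUsagePreferences({ showUsageDetail: true }) yields an object whose usageDetailMode is "compact" and which no longer carries the legacy key.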

What `compact` mode shows

↑ 4 218  ↓ 312  cache ✓  ~$0.0031  [claude-sonnet-4/anthropic]

↑ — input tokens
↓ — output tokens
cache ✓ — cache hit detected (cacheReadTokens > 0)
~$0.00xx — cost estimate from ModelsRegistry pricing × actual tokens
[model/provider] — route identifier
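To make the pieces of the compact line concrete, here is a minimal sketch of how such a line could be assembled. The record and pricing shapes, the per-million-token price unit, and the function names are assumptions for illustration; the actual ModelsRegistry API and its handling of cached tokens are not shown in this document.

```typescript
// Hypothetical record and pricing shapes; field names are assumptions,
// except cacheReadTokens, which the docs mention.
interface UsageRecord {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  model: string;
  provider: string;
}

interface ModelPricing {
  inputPerMTok: number;  // assumed unit: USD per million input tokens
  outputPerMTok: number; // assumed unit: USD per million output tokens
}

// Cost estimate: pricing multiplied by actual token counts.
function estimateCost(u: UsageRecord, p: ModelPricing): number {
  return (u.inputTokens * p.inputPerMTok + u.outputTokens * p.outputPerMTok) / 1e6;
}

// Groups digits in threes with a space, e.g. 4218 -> "4 218".
function groupThousands(n: number): string {
  return Math.round(n).toString().replace(/\B(?=(\d{3})+(?!\d))/g, " ");
}

function formatCompact(u: UsageRecord, p: ModelPricing): string {
  const parts = [
    `↑ ${groupThousands(u.inputTokens)}`,
    `↓ ${groupThousands(u.outputTokens)}`,
  ];
  // The cache marker appears only when a cache hit was detected.
  if (u.cacheReadTokens > 0) {
    parts.push("cache ✓");
  }
  parts.push(`~$${estimateCost(u, p).toFixed(4)}`);
  parts.push(`[${u.model}/${u.provider}]`);
  return parts.join("  ");
}
```

With made-up prices the cost figure will differ from the sample line above; only the layout is meant to match.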

What `verbose` mode adds

Context window fill percentage (contextPercent)
Reasoning tokens (for extended-thinking models)
Cache write tokens (Anthropic prompt cache creation cost)
source: actual | estimated — whether figures came from provider or local estimator

Persistent log

All records are written to ~/.umbra/usage.jsonl regardless of display mode. The log accumulates indefinitely and can be queried via UsageLogger.generateReport(), which prints a text table sorted by total cost per model and per provider.
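Because the log is plain JSONL, it can also be aggregated outside the tool. The sketch below is not UsageLogger.generateReport(); it is a stand-alone illustration that assumes each line is a JSON object with model, provider, and cost fields (hypothetical names; check them against your actual log).

```typescript
// Assumed per-line record shape; the real usage.jsonl schema may differ.
interface LoggedUsage {
  model: string;
  provider: string;
  cost: number;
}

// Sums cost per model or per provider and returns [name, total] pairs,
// sorted by total cost in descending order.
function totalCostBy(jsonl: string, key: "model" | "provider"): Array<[string, number]> {
  const totals = new Map<string, number>();
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines, e.g. a trailing newline
    const rec = JSON.parse(line) as LoggedUsage;
    totals.set(rec[key], (totals.get(rec[key]) ?? 0) + rec.cost);
  }
  return Array.from(totals.entries()).sort((a, b) => b[1] - a[1]);
}
```

The jsonl argument would be the contents of ~/.umbra/usage.jsonl read as a UTF-8 string; keeping file access out of the function makes it easy to test against a few sample lines.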