Request Usage Meter
Usage tracking happens at two levels: a persistent log (usage.jsonl) and an in-session display.
Display modes
Configured via /usage in the TUI (opens an interactive menu) and persisted in runtime-preferences.json → usageDetailMode:
| Mode | What is shown after each response |
|---|---|
| off (default) | Nothing |
| compact | One line: ↑1234 ↓567 ~$0.0012 (input tokens / output tokens / cost) |
| verbose | Full block: model, route, context%, reasoning tokens, cache hit/write, cost |
The legacy boolean setting showUsageDetail: true is migrated to compact automatically on first load.
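The migration rule can be sketched as below. This is a minimal illustration, not the actual loader: the function name `migrate_usage_preferences` is hypothetical, and mapping a legacy `false` (or a missing flag) to `off` is an assumption, since only the `true` case is documented here.

```python
from typing import Any

VALID_MODES = {"off", "compact", "verbose"}


def migrate_usage_preferences(prefs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of prefs with the legacy boolean replaced by usageDetailMode."""
    migrated = dict(prefs)
    # Drop the legacy key so the migration only happens once.
    legacy = migrated.pop("showUsageDetail", None)
    mode = migrated.get("usageDetailMode")
    # An explicit, valid mode always wins; otherwise fall back to the legacy flag.
    if mode not in VALID_MODES:
        # Assumption: anything other than a legacy True means "off" (the default).
        migrated["usageDetailMode"] = "compact" if legacy is True else "off"
    return migrated
```

Calling it on `{"showUsageDetail": True}` yields `{"usageDetailMode": "compact"}`, while preferences that already carry a valid `usageDetailMode` keep that value.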
What compact mode shows
↑ 4 218 ↓ 312 cache ✓ ~$0.0031 [claude-sonnet-4/anthropic]

- ↑: input tokens
- ↓: output tokens
- cache ✓: cache hit detected (cacheReadTokens > 0)
- ~$0.00xx: cost estimate from ModelsRegistry pricing × actual tokens
- [model/provider]: route identifier
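To make the pieces of the compact line concrete, here is a rough sketch of how such a line could be assembled. The function names, the parameter names, and the per-million-token price unit are all assumptions for illustration. The real ModelsRegistry pricing format is not described here, and this sketch ignores any separate cache pricing.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Pricing × tokens. Prices are assumed to be USD per million tokens."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


def group_digits(n: int) -> str:
    """Format 4218 as '4 218' (space-separated thousands)."""
    return f"{n:,}".replace(",", " ")


def format_compact(input_tokens: int, output_tokens: int, cache_read_tokens: int,
                   cost: float, model: str, provider: str) -> str:
    """Build the one-line summary shown in compact mode."""
    parts = [f"↑ {group_digits(input_tokens)}", f"↓ {group_digits(output_tokens)}"]
    # The cache marker appears only when at least one token was read from cache.
    if cache_read_tokens > 0:
        parts.append("cache ✓")
    parts.append(f"~${cost:.4f}")
    parts.append(f"[{model}/{provider}]")
    return " ".join(parts)
```

With `format_compact(4218, 312, 100, 0.0031, "claude-sonnet-4", "anthropic")`, the sketch reproduces the sample line above.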
What verbose mode adds
- Context window fill percentage (contextPercent)
- Reasoning tokens (for extended thinking models)
- Cache write tokens (Anthropic prompt cache creation cost)
- source (actual | estimated): whether figures came from the provider or the local estimator
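As an illustration of the context fill figure, the percentage can be thought of as tokens currently in context divided by the model's context window. The helper below is a hypothetical sketch. Which tokens count toward the numerator, and how the value is rounded, are assumptions rather than documented behavior.

```python
def context_percent(tokens_in_context: int, context_window: int) -> float:
    """Share of the context window in use, as a percentage with one decimal."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return round(100.0 * tokens_in_context / context_window, 1)
```

For example, 50,000 tokens in a 200,000-token window gives 25.0.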
Persistent log
All records are written to ~/.umbra/usage.jsonl regardless of display mode. The log accumulates indefinitely and can be queried via UsageLogger.generateReport(), which prints a text table sorted by total cost per model and per provider.
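Because the log is plain JSON Lines, it can also be summarized outside the application. The sketch below is not UsageLogger.generateReport(); it is an independent, simplified aggregation that groups by model and provider together. The record field names `model`, `provider`, and `cost` are assumptions about the log schema and may need adjusting to match real records.

```python
import json
from collections import defaultdict
from pathlib import Path


def summarize_usage(log_path: Path) -> list[tuple[str, str, float]]:
    """Sum cost per (model, provider) pair, highest total first."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            # Field names "model", "provider" and "cost" are assumed, not confirmed.
            key = (record.get("model", "unknown"), record.get("provider", "unknown"))
            totals[key] += float(record.get("cost", 0.0))
    rows = [(model, provider, cost) for (model, provider), cost in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

A typical call would be `summarize_usage(Path.home() / ".umbra" / "usage.jsonl")`. Since the log grows without bound, reading it line by line keeps memory use flat even for large files.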