Request Usage Meter

Usage is tracked at two levels: a persistent log (usage.jsonl) and an in-session display.

The in-session display is configured via /usage in the TUI (which opens an interactive menu) and persisted in runtime-preferences.json → usageDetailMode:

Mode           What is shown after each response
off (default)  Nothing
compact        One line: ↑1234 ↓567 ~$0.0012 (input tokens / output tokens / cost)
verbose        Full block: model, route, context%, reasoning tokens, cache hit/write, cost

The old boolean showUsageDetail: true is migrated to compact automatically on first load.
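
The migration described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the RuntimePreferences shape and the migrateUsagePreference function are hypothetical names, and mapping a missing or false legacy flag to off is an assumption based on off being the default.

```typescript
type UsageDetailMode = "off" | "compact" | "verbose";

// Hypothetical slice of runtime-preferences.json; only the usage fields are modelled.
interface RuntimePreferences {
  usageDetailMode?: UsageDetailMode;
  showUsageDetail?: boolean; // legacy boolean flag
}

// Returns a copy of the preferences with the legacy flag removed
// and usageDetailMode always set.
function migrateUsagePreference(prefs: RuntimePreferences): RuntimePreferences {
  const { showUsageDetail, ...rest } = prefs;

  // An explicitly stored mode always wins over the legacy flag.
  if (rest.usageDetailMode !== undefined) {
    return rest;
  }

  // Documented rule: true becomes "compact".
  // Assumption: anything else falls back to the default, "off".
  return { ...rest, usageDetailMode: showUsageDetail === true ? "compact" : "off" };
}
```

Dropping the legacy key in the returned object means the migration would run only once, which matches the "on first load" wording.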

↑ 4 218 ↓ 312 cache ✓ ~$0.0031 [claude-sonnet-4/anthropic]
  • ↑ — input tokens
  • ↓ — output tokens
  • cache ✓ — cache hit detected (cacheReadTokens > 0)
  • ~$0.00xx — cost estimate from ModelsRegistry pricing × actual tokens
  • [model/provider] — route identifier

Verbose mode additionally shows:
  • Context window fill percentage (contextPercent)
  • Reasoning tokens (for extended thinking models)
  • Cache write tokens (Anthropic prompt cache creation cost)
  • source: actual | estimated — whether figures came from provider or local estimator
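
To make the "pricing × actual tokens" estimate and the cache marker concrete, here is a small sketch of how a one-line summary might be assembled. Everything in it is an assumption for illustration: the ModelPricing and UsageSample shapes, the per-million-token price fields, and the function names are hypothetical and do not reflect the real ModelsRegistry API. It also ignores cache pricing and digit grouping.

```typescript
// Hypothetical pricing entry: USD per one million tokens.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Hypothetical per-response usage sample.
interface UsageSample {
  model: string;
  provider: string;
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
}

// Cost estimate: price per token multiplied by the token counts.
function estimateCost(u: UsageSample, p: ModelPricing): number {
  return (u.inputTokens * p.inputPerMillion + u.outputTokens * p.outputPerMillion) / 1e6;
}

// Builds a line in the style "↑in ↓out cache ✓ ~$cost [model/provider]".
function formatUsageLine(u: UsageSample, p: ModelPricing): string {
  const parts = [`↑${u.inputTokens}`, `↓${u.outputTokens}`];
  if (u.cacheReadTokens > 0) {
    parts.push("cache ✓"); // cache hit detected
  }
  parts.push(`~$${estimateCost(u, p).toFixed(4)}`);
  parts.push(`[${u.model}/${u.provider}]`);
  return parts.join(" ");
}
```

With made-up prices of 3 and 15 USD per million input and output tokens, a sample of 4218 input and 312 output tokens yields an estimate of about $0.0173; the figures in the example line above come from the tool's own pricing data and will differ.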

All records are written to ~/.umbra/usage.jsonl regardless of display mode. The log accumulates indefinitely and can be queried via UsageLogger.generateReport(), which prints a text table sorted by total cost per model and per provider.
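
As a rough idea of what a per-model or per-provider cost summary over a JSONL log involves, the sketch below sums cost by key and sorts the totals. It is not UsageLogger.generateReport() itself: the LoggedUsage field names (model, provider, costUsd), the totalCostBy function, and the descending sort order are all assumptions, and the input is passed as a string rather than read from ~/.umbra/usage.jsonl.

```typescript
// Hypothetical shape of one line in usage.jsonl.
interface LoggedUsage {
  model: string;
  provider: string;
  costUsd: number;
}

// Sums cost per model or per provider and returns [key, total] pairs,
// highest total first (assumed order).
function totalCostBy(jsonl: string, key: "model" | "provider"): Array<[string, number]> {
  const totals = new Map<string, number>();

  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines, e.g. a trailing newline
    const rec = JSON.parse(line) as LoggedUsage;
    totals.set(rec[key], (totals.get(rec[key]) ?? 0) + rec.costUsd);
  }

  return Array.from(totals.entries()).sort((a, b) => b[1] - a[1]);
}
```

Because the log accumulates indefinitely, a real implementation would probably stream the file line by line instead of loading it whole; the aggregation step stays the same.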