Provider Layer
The provider layer is the boundary between Umbra and any LLM API. It is built in three tiers: a registry of known provider types, a profiles store of your configured connections, and a gateway that routes requests and handles failure.
Provider types
Built-in types (always available):
| Type | Label | Default URL | Key required |
|---|---|---|---|
| openai | OpenAI | https://api.openai.com/v1 | Yes |
| anthropic | Anthropic | https://api.anthropic.com/v1 | Yes |
| openrouter | OpenRouter | https://openrouter.ai/api/v1 | Yes |
| mistral | Mistral | https://api.mistral.ai/v1 | Yes |
| ollama | Ollama | http://127.0.0.1:11434/v1 | No |
| lmstudio | LM Studio | http://127.0.0.1:1234/v1 | No |
| openai-codex | ChatGPT Plus/Pro | https://chatgpt.com/backend-api | Optional (OAuth) |
| openai_compatible | Custom endpoint | (set per profile) | Optional |
| opencode-zen | OpenCode Zen | https://opencode.ai/zen/v1 | Optional |
Default for new users. On first launch, if no provider profile is configured, Umbra offers to connect OpenCode Zen automatically. It provides a set of free models with no API key required, enough to try the agent right away. You can switch to any other provider at any time via umbra providers connect.
Profile store
Profiles are persisted in ~/.umbra/providers.json. Each profile stores:
- type: one of the provider type values above
- label: human-readable name
- baseUrl: the API base URL (overridable per profile)
- apiKey: stored locally, never transmitted to Umbra
- model: optional default model for this profile
- extraHeaders: arbitrary HTTP headers injected on every request
- options: provider-specific options map
One profile is designated as the default via defaultProfileId. The active profile for each task is resolved at runtime.
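The profile fields and the default lookup could be modeled roughly as follows. This is a minimal sketch, not Umbra's actual types: the `ProviderStore` wrapper, its `profiles` array, the per-profile `id` field, and the `resolveProfile` helper are assumptions made for illustration. Only the profile field names and `defaultProfileId` come from the description above.

```typescript
// Hypothetical shape of one profile entry; field names follow the list above.
interface ProviderProfile {
  id: string; // assumed identifier that defaultProfileId points at
  type: string;
  label: string;
  baseUrl: string;
  apiKey?: string;
  model?: string;
  extraHeaders?: Record<string, string>;
  options?: Record<string, unknown>;
}

// Hypothetical top-level shape of the store file.
interface ProviderStore {
  defaultProfileId?: string;
  profiles: ProviderProfile[];
}

// Pick the requested profile, falling back to the default one.
function resolveProfile(store: ProviderStore, requestedId?: string): ProviderProfile {
  const id = requestedId ?? store.defaultProfileId;
  const profile = store.profiles.find((p) => p.id === id);
  if (!profile) {
    throw new Error(`Unknown profile: ${id ?? "(none)"}`);
  }
  return profile;
}

const exampleStore: ProviderStore = {
  defaultProfileId: "local",
  profiles: [
    { id: "local", type: "ollama", label: "Ollama", baseUrl: "http://127.0.0.1:11434/v1" },
    { id: "zen", type: "opencode-zen", label: "OpenCode Zen", baseUrl: "https://opencode.ai/zen/v1" },
  ],
};
```

With this sketch, `resolveProfile(exampleStore)` returns the `local` profile, and `resolveProfile(exampleStore, "zen")` returns the OpenCode Zen one.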
Provider gateway
DefaultProviderGateway is the single routing point for all LLM calls. It supports two routing modes:
- By profile (profileId): calls a single configured profile directly.
- By chain (chainId): iterates through an ordered list of profile entries and uses the first successful response. This enables automatic fallback across providers without any changes to the task code.
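
Chain routing can be pictured as a loop that stops at the first success. The sketch below is illustrative only: `completeWithChain` and the `Complete` callback type are hypothetical names, and it assumes every failure moves on to the next entry, which the real gateway may restrict to certain error kinds.

```typescript
// Hypothetical callback that performs one completion against one profile.
type Complete<T> = (profileId: string) => Promise<T>;

// Try each profile in order and return the first successful response.
async function completeWithChain<T>(profileIds: string[], complete: Complete<T>): Promise<T> {
  const errors: unknown[] = [];
  for (const profileId of profileIds) {
    try {
      return await complete(profileId);
    } catch (err) {
      // Assumption: any error falls through to the next chain entry.
      errors.push(err);
    }
  }
  const last = errors[errors.length - 1];
  throw new Error(`All ${profileIds.length} chain entries failed; last error: ${String(last)}`);
}
```

The task code only ever awaits one call, so swapping or reordering providers in the chain needs no change on the caller's side.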
Request pipeline
GatewayRequest
  └─ #prepareRequest()              # optional compression (off / standard / aggressive)
       ├─ compressToolOutput()      # for tool result messages
       └─ condenseProse()           # for user/assistant messages
  └─ #withRetries()                 # up to 2 attempts
       └─ catalog.completeProfile() / completeProfileStream()
  └─ #logResponse()                 # usage accounting + debug events

Retry logic
The gateway retries automatically on transient failures:
- HTTP 429 (rate limited)
- HTTP 5xx (server error)
- AbortError (network timeout)
- fetch failed (connection refused / DNS)
Non-retryable errors (4xx other than 429, schema validation, unknown profile) are thrown immediately. The backoff between retries is 1 second × attempt number.
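The retry rules above can be sketched as a classifier plus a small loop. This is a hedged illustration, not the gateway's source: `HttpError`, `isRetryable`, the standalone `withRetries` function and its injectable `wait` parameter are hypothetical, and matching on the error name `AbortError` and the message `fetch failed` is an assumption about how those failures surface.

```typescript
// Hypothetical error type carrying an HTTP status code.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number) {
    super(`HTTP ${status}`);
    this.name = "HttpError";
    this.status = status;
  }
}

// Transient failures: 429, 5xx, aborted requests, and failed fetches.
function isRetryable(err: unknown): boolean {
  if (err instanceof HttpError) {
    return err.status === 429 || (err.status >= 500 && err.status <= 599);
  }
  if (err instanceof Error) {
    return err.name === "AbortError" || err.message.includes("fetch failed");
  }
  return false;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run fn up to maxAttempts times, waiting 1 second x attempt number between tries.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxAttempts = 2,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Give up on the final attempt or on a non-retryable error.
      if (attempt === maxAttempts || !isRetryable(err)) break;
      await wait(1000 * attempt);
    }
  }
  throw lastError;
}
```

Passing `wait` as a parameter is a design choice of this sketch: it lets tests replace the real delay with a no-op.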
Request schema
Requests are typed and validated with Zod. Key fields:
| Field | Type | Notes |
|---|---|---|
| model | string | Optional; profile default is used if omitted |
| messages | ProviderChatMessage[] | Roles: system, user, assistant, tool |
| tools | ProviderToolDefinition[] | Function-calling tools |
| toolChoice | auto \| required \| none | |
| responseFormat | text \| json_object \| json_schema | Structured output |
| thinkBudget | number \| low \| medium \| high \| max | Extended reasoning token budget |
| compressionLevel | off \| standard \| aggressive | Pre-request message compression |
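
For orientation, a request using these fields might look like the object below. The real schema is defined with Zod and is not shown here, so `GatewayRequestSketch` is a hand-written, simplified mirror of the table: the `content` field on messages is an assumption, `responseFormat` is flattened to a plain string, and `tools` is omitted because the shape of `ProviderToolDefinition` is not documented in this section.

```typescript
// Simplified, hypothetical mirror of the request fields listed above.
type ThinkBudget = number | "low" | "medium" | "high" | "max";

interface ChatMessageSketch {
  role: "system" | "user" | "assistant" | "tool";
  content: string; // assumed field name
}

interface GatewayRequestSketch {
  model?: string; // the profile default is used if omitted
  messages: ChatMessageSketch[];
  toolChoice?: "auto" | "required" | "none";
  responseFormat?: "text" | "json_object" | "json_schema";
  thinkBudget?: ThinkBudget;
  compressionLevel?: "off" | "standard" | "aggressive";
}

const exampleRequest: GatewayRequestSketch = {
  messages: [
    { role: "system", content: "You are a concise assistant." },
    { role: "user", content: "Summarize the provider layer in one sentence." },
  ],
  responseFormat: "text",
  thinkBudget: "low",
  compressionLevel: "standard",
};
```

Because `model` is left out, a gateway following the table's rule would fall back to the default model of whichever profile handles the request.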
Model capabilities registry
ModelsRegistry resolves model capabilities (context window, tool support, pricing, vision, reasoning, structured output) by fetching from models.dev/api.json with a 5-minute in-memory cache. If the model is not found there, it falls back to the HuggingFace model API, then to heuristic rules based on model name patterns.
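The lookup order described here (cached remote sources first, name-based heuristics last) can be sketched as follows. All names are hypothetical, the capability record is reduced to two fields, and caching per model rather than per fetched catalogue is an assumption of this sketch, not a statement about how ModelsRegistry works internally.

```typescript
// Reduced, hypothetical capability record.
interface ModelCapabilities {
  contextWindow: number;
  supportsTools: boolean;
}

// A source returns undefined when it does not know the model.
type CapabilitySource = (model: string) => Promise<ModelCapabilities | undefined>;

function createCapabilityLookup(
  sources: CapabilitySource[],
  heuristic: (model: string) => ModelCapabilities,
  ttlMs: number = 5 * 60 * 1000,
  now: () => number = Date.now,
): (model: string) => Promise<ModelCapabilities> {
  const cache = new Map<string, { value: ModelCapabilities; expiresAt: number }>();

  return async function lookup(model: string): Promise<ModelCapabilities> {
    const hit = cache.get(model);
    if (hit && hit.expiresAt > now()) return hit.value;

    // Ask each source in order; stop at the first one that knows the model.
    let found: ModelCapabilities | undefined;
    for (const source of sources) {
      found = await source(model);
      if (found) break;
    }

    // Fall back to heuristics when no source had an answer.
    const resolved = found ?? heuristic(model);
    cache.set(model, { value: resolved, expiresAt: now() + ttlMs });
    return resolved;
  };
}
```

In this picture, the first source would wrap the models.dev catalogue and the second the fallback model API, while the heuristic inspects the model name.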
Cost estimates (USD) are computed from actual token usage and the per-million pricing in the registry, and written to the usage log alongside each response.
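As a worked example of per-million pricing, the estimate reduces to one line of arithmetic. The `Usage` and `Pricing` shapes and the prices used are hypothetical; the registry's actual field names are not given in this section.

```typescript
// Hypothetical shapes for token usage and per-million-token pricing (USD).
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// cost = (input tokens x input price + output tokens x output price) / 1,000,000
function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  const weighted =
    usage.inputTokens * pricing.inputPerMillion +
    usage.outputTokens * pricing.outputPerMillion;
  return weighted / 1_000_000;
}

// 12,000 input tokens at $3/M plus 800 output tokens at $15/M.
const exampleCost = estimateCostUsd(
  { inputTokens: 12_000, outputTokens: 800 },
  { inputPerMillion: 3, outputPerMillion: 15 },
);
```

Here the input side contributes $0.036 and the output side $0.012, for an estimate of about $0.048.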