# Otter Providers
## Providers — what otter can talk to

chimera otter reuses Chimera’s standard provider stack
(chimera/providers/factory.py), so any provider that mink can drive,
otter can drive. The difference is in defaults: otter’s resolver
(chimera/otter/providers.py) prefers a hosted provider chain — Anthropic
first, then OpenRouter, then OpenAI — because the upstream open-source
agent it parallels is server-first and almost always run against a hosted
LLM.
This page is the otter-specific layer on top of the provider matrix.
For the deep, line-numbered tour of every adapter (Ollama internals,
Anthropic streaming, OpenAI delta accumulation, etc.) see
docs/mink/providers.md. The same code is in play.
## Resolution order

`build_provider(args)` (chimera/otter/providers.py:116) walks this chain
on every otter invocation; first match wins:
1. Explicit `args.model` (CLI `--model <id>`).
2. `$OTTER_MODEL` environment variable.
3. `$ANTHROPIC_API_KEY` set → defaults to `claude-sonnet-4-6`.
4. `$OPENROUTER_API_KEY` set → defaults to `anthropic/claude-sonnet-4`, routed through the OpenAI-compatible adapter against openrouter.ai.
5. `$OPENAI_API_KEY` set → defaults to `gpt-4o`.
6. Friendly error pointing at the three env vars above.
Explicit beats env beats default. So chimera otter --model gpt-4o -p "..."
works even when $ANTHROPIC_API_KEY is set in the environment.
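
The chain is short enough to sketch. Here is a hedged condensation of that logic; `resolve_model` is a hypothetical helper for illustration, not the actual body of `build_provider`:

```python
import os

def resolve_model(args) -> str | None:
    """Hypothetical condensation of otter's resolution chain; the real
    logic lives in chimera/otter/providers.py:116."""
    if args.model:                            # 1. explicit --model wins
        return args.model
    if os.environ.get("OTTER_MODEL"):         # 2. then the env override
        return os.environ["OTTER_MODEL"]
    if os.environ.get("ANTHROPIC_API_KEY"):   # 3. hosted keys, fixed order
        return "claude-sonnet-4-6"
    if os.environ.get("OPENROUTER_API_KEY"):  # 4.
        return "anthropic/claude-sonnet-4"
    if os.environ.get("OPENAI_API_KEY"):      # 5.
        return "gpt-4o"
    return None                               # 6. caller raises the friendly error
```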
## Anthropic (default)

The first-class otter target. claude-sonnet-4-6 is the default
because it’s the model the rest of the Chimera repo benchmarks against,
streams and tool-calls cleanly, and supports extended thinking + prompt
caching for long sessions.
- Setup:

  ```sh
  uv sync --extra anthropic
  export ANTHROPIC_API_KEY=sk-ant-...
  ```

- Use with otter:

  ```sh
  chimera otter -p "review this PR"
  chimera otter --model claude-opus-4 -p "long-form refactor"
  ```

- What’s wired: streaming, tool calls, async, extended thinking,
  prompt caching, vision. See `chimera/providers/anthropic.py`.
- Anthropic-compatible endpoints: the same provider also accepts
  `ANTHROPIC_BASE_URL` + `ANTHROPIC_AUTH_TOKEN`, so you can route otter
  through GLM-4.6 (api.z.ai), Moonshot, or any third-party gateway that
  speaks the Messages API:

  ```sh
  export ANTHROPIC_BASE_URL=https://api.z.ai/v1/anthropic
  export ANTHROPIC_AUTH_TOKEN=...
  chimera otter --model glm-4.6 -p "..."
  ```
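
Driving the default target from Python goes through the same factory shown for Modal below. A minimal sketch, assuming the built-in registers under the name `"anthropic"` (its self-registration is at chimera/providers/anthropic.py:436):

```python
from chimera.providers import create_provider

# Sketch: build the default provider directly. Assumes ANTHROPIC_API_KEY
# is exported; pass base_url= to target a compatible gateway instead.
provider = create_provider("anthropic", model="claude-sonnet-4-6")
```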
## OpenAI

Use when you have an OpenAI key and want gpt-4o, o3, or any
GPT-5-class model otter recognizes.
- Setup:

  ```sh
  uv sync --extra openai
  export OPENAI_API_KEY=sk-...
  ```

- Use with otter:

  ```sh
  chimera otter --model gpt-4o -p "draft a release note"
  chimera otter --model o3-mini -p "prove this invariant"
  ```

- What’s wired: native streaming, tool calls, async (`AsyncOpenAI`),
  reasoning-token tracking for o-series, prompt-cache hit accounting,
  vision (gpt-4o), JSON mode. See `chimera/providers/openai.py`.
## OpenRouter

OpenRouter is one of otter’s first-class targets because its vendor/name
model id convention (e.g. anthropic/claude-sonnet-4,
google/gemini-2.5-pro, meta-llama/llama-3.3-70b) lets otter swap
upstream brands behind a single API key.
Routing rule: when $OPENROUTER_API_KEY is set and the resolved
model id contains a /, otter hands it to the OpenAI-compatible adapter
pointed at https://openrouter.ai/api/v1
(chimera/otter/providers.py:96-113). A bare claude-sonnet-4-6 with
both $OPENROUTER_API_KEY and $ANTHROPIC_API_KEY set still goes
direct to Anthropic; the / separator is the explicit signal.
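
Condensed, the rule is a single boolean. A hedged sketch with a hypothetical helper name, not the actual code at chimera/otter/providers.py:96-113:

```python
import os

def routes_to_openrouter(model_id: str) -> bool:
    # The vendor/name separator plus a key is the whole signal.
    return bool(os.environ.get("OPENROUTER_API_KEY")) and "/" in model_id

# "anthropic/claude-sonnet-4" -> True  (with $OPENROUTER_API_KEY set)
# "claude-sonnet-4-6"         -> False (goes direct to Anthropic)
```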
- Setup:

  ```sh
  export OPENROUTER_API_KEY=sk-or-...
  ```

- Use with otter:

  ```sh
  chimera otter --model anthropic/claude-sonnet-4 -p "..."
  chimera otter --model google/gemini-2.5-pro -p "..."
  chimera otter --model meta-llama/llama-3.3-70b -p "..."
  ```

- What’s wired: non-streaming `complete()` plus base-class shim for
  streaming (one chunk at a time). Tool calls forwarded as standard
  OpenAI deltas. No provider-side caching.
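
That streaming shim is worth picturing: a base class can satisfy a streaming interface by running the blocking call and yielding the finished text as one chunk. A sketch of the pattern only; the real method names live in chimera/providers/base.py:

```python
from typing import Iterator

class ShimSketch:
    """Pattern sketch: fake streaming on top of a blocking complete()."""

    def complete(self, messages: list[dict]) -> str:
        raise NotImplementedError  # real adapters do the HTTP round trip

    def stream(self, messages: list[dict]) -> Iterator[str]:
        # One "chunk": the whole completion, yielded once it finishes.
        yield self.complete(messages)
```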
## Ollama (local + cloud tags)

Identical to mink: same `OllamaProvider`, same `/api/chat` endpoint,
same `keep_alive: 60m`, same `:cloud` tag handling.
- Setup:

  ```sh
  brew install ollama       # or curl -fsSL https://ollama.com/install.sh | sh
  ollama serve &            # daemon on :11434
  ollama signin             # only needed for :cloud tags
  ollama pull qwen3:32b     # local fallback
  ```

- Use with otter:

  ```sh
  chimera otter --model qwen3:32b -p "explain this repo"
  chimera otter --model glm-5.1:cloud -p "summarize"
  chimera otter --model kimi-k2.6:cloud -p "long-context refactor"
  ```

- What’s wired: native streaming over NDJSON, tool calls (`tool_calls`
  array on `done:false` chunks), `think:true` for `kimi*` tags,
  per-request `num_ctx`, configurable `OLLAMA_HOST` for remote daemons.
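
For a feel of the wire format the adapter parses, here is a hedged sketch that reads the /api/chat NDJSON stream directly. The endpoint and payload shape follow Ollama's public API; the daemon address and model are assumptions, and error handling is omitted:

```python
import json
import httpx

payload = {
    "model": "qwen3:32b",
    "messages": [{"role": "user", "content": "explain this repo"}],
    "stream": True,
}
# Assumes a local daemon on :11434 (otter itself honors OLLAMA_HOST).
with httpx.stream("POST", "http://localhost:11434/api/chat",
                  json=payload, timeout=None) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)          # one JSON object per NDJSON line
        if chunk.get("done"):
            break                         # final chunk closes the stream
        message = chunk.get("message", {})
        print(message.get("content", ""), end="", flush=True)
        for call in message.get("tool_calls", []):
            print(call)                   # tool calls ride done:false chunks
```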
## Modal

Modal-hosted vLLM container exposing OpenAI-shape /v1/chat/completions.
Otter inherits the same adapter as mink — useful when you’ve stood up an
open-weight model on Modal and want the otter loop to drive it.
- Setup:

  ```sh
  pip install modal httpx
  modal token new
  ```

- Use with otter: `--model` only auto-routes Anthropic / OpenAI /
  OpenRouter / Ollama. To call Modal, build the provider in Python:

  ```python
  from chimera.providers import create_provider

  provider = create_provider(
      "modal",
      model="meta-llama/Llama-3.3-70B",
      base_url="https://your-org--llm-app-serve.modal.run/v1",
  )
  ```

- What’s wired: non-streaming `complete()`, tool calls. No streaming,
  no async, no thinking, no caching.
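
Once built, the Modal provider is driven like any other. A usage sketch, assuming the OpenAI-style role/content message shape used elsewhere on this page (`complete()`'s exact return type is defined by chimera/providers/base.py):

```python
# Continues from the create_provider("modal", ...) snippet above.
messages = [{"role": "user", "content": "summarize the README"}]
result = provider.complete(messages)  # blocking; this adapter has no streaming
print(result)
```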
## Custom providers

Implement Provider (chimera/providers/base.py) and call
register_provider("my-name", factory)
(chimera/providers/registry.py). Factory signature:
factory(model=..., api_key=..., base_url=..., **kw). After
registration, create_provider("my-name", model=...) works identically
to the built-ins.
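
A minimal end-to-end sketch, assuming `complete()` is the only method you must override; check chimera/providers/base.py for the full abstract interface before relying on this:

```python
from chimera.providers.base import Provider
from chimera.providers.registry import register_provider

class MyCoProvider(Provider):
    """Toy provider: echoes the last user message back."""

    def __init__(self, model=None, api_key=None, base_url=None, **kw):
        self.model = model
        self.api_key = api_key
        self.base_url = base_url

    def complete(self, messages, **kw):
        return f"[{self.model}] {messages[-1]['content']}"

# Factory matching the documented signature, registered under "my-name".
def factory(model=None, api_key=None, base_url=None, **kw):
    return MyCoProvider(model=model, api_key=api_key, base_url=base_url, **kw)

register_provider("my-name", factory)
```

After this runs, `create_provider("my-name", model=...)` resolves to the toy class, the same path every built-in takes.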
To make otter pick your provider automatically, either:
- Hand it an explicit model with a recognizable prefix (e.g. add
  `myco-*` to `_infer_provider`), or
- Construct the provider yourself and pass it to the otter agent
  builder (the lower-level entry point used by `chimera otter serve`):

  ```python
  from chimera.otter.providers import build_provider

  # … build args with --model and the env keys you need …
  provider = build_provider(args)
  ```
The custom factory call can live in your package’s `__init__.py` so
import triggers self-registration, mirroring the built-ins (e.g.
chimera/providers/anthropic.py:436).
## Choosing a provider

| Concern | Pick |
|---|---|
| Default, “just works” | `anthropic` (claude-sonnet-4-6) |
| One key, many vendors | `openrouter` (anthropic/..., google/..., meta-llama/...) |
| Privacy / local | `ollama` with qwen3:32b or any tool-capable local tag |
| Cheap + fast | `compatible` against Groq / DeepSeek / Together |
| Vision-heavy | `anthropic` (claude) or `openai` (gpt-4o) |
| Long context (>200k) | `google` (Gemini 1M), `anthropic` (200k), Kimi (262k) |
| Reasoning-tuned | `openai` (o3, o3-mini) or `anthropic` (claude-opus + thinking) |
## Mixing providers in one session

Otter holds one provider per process. To swap mid-session, exit and
re-launch with a different --model or different env. The HTTP server
(chimera otter serve) likewise binds one provider per process; to fan
out across providers, run multiple servers on different ports.
The REPL /model slash command cycles through the model list passed
via --models <a>,<b>,<c> — it rebuilds the provider on each switch.
## Auth storage

Otter shares Chimera’s credential storage with mink and the rest of the CLI:

| Path | Source | Mode |
|---|---|---|
| ~/.chimera/credentials.json | OAuth-issued tokens, refresh tokens | 0o600 |
| ~/.chimera/auth.json | AuthManager.set_token() | default |
| ~/.opencode/config.json | Optional read-only config ingest | (untouched) |
CredentialStore._write chmods to 0o600 after each save
(chimera/auth/store.py:62).
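
The tighten-after-write pattern is small enough to show generically; this is an illustrative sketch of the idea, not the actual `CredentialStore._write` body:

```python
import json
import os
from pathlib import Path

def write_credentials(path: Path, creds: dict) -> None:
    # Persist, then immediately restrict the file to its owner,
    # mirroring the chmod at chimera/auth/store.py:62.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(creds, indent=2))
    os.chmod(path, 0o600)  # rw for owner only
```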
## Proxy mode for teams

For an org-wide gateway sitting in front of Anthropic-shaped providers:

```sh
export ANTHROPIC_BASE_URL=http://proxy.internal:8000
export ANTHROPIC_AUTH_TOKEN=team-issued-jwt
chimera otter --model claude-sonnet-4-6 -p "..."
```

`AnthropicProvider` honors both env vars; the proxy can be a thin
pass-through, a key-rotating relay, or a budget-enforcing gatekeeper.
The dedicated proxy provider (registered by
chimera/providers/proxy.py) is the alternative when your gateway
speaks its own JSON shape rather than the Anthropic wire protocol.
## Troubleshooting

| Symptom | Likely cause / fix |
|---|---|
| `otter: no provider configured` | Set one of $ANTHROPIC_API_KEY, $OPENROUTER_API_KEY, $OPENAI_API_KEY, or pass --model <id> / $OTTER_MODEL. |
| 401 / 403 | Wrong key or wrong env var. `printenv \| grep -E '(ANTHROPIC\|OPENAI\|OPENROUTER)_'` to verify. |
| OpenRouter not used despite key | Model id needs the vendor/name `/` separator. Bare claude-sonnet-4-6 routes to Anthropic. |
| `ImportError: pip install chimera-run[anthropic]` | uv sync --extra anthropic (or openai, google) to pull the SDK. |
| `Cannot infer provider from model name '...'` | Pass --model <id> with a known prefix (claude-*, gpt-*, gemini-*, glm-*, kimi-*, qwen*, …) or hardcode provider_type= via build_provider. |
| Streaming hangs on first call | Cloud cold start. Anthropic / OpenAI typically warm in <2s; Ollama Cloud needs keep_alive: 60m (otter sets this). |
| `tool_calls` always empty on Ollama | You hit /v1/chat/completions instead of /api/chat. Set OLLAMA_HOST to the daemon root, not .../v1. |
| `temperature must be 1` with thinking | Anthropic extended thinking forces temperature=1. Drop your custom temperature kwarg. |
## See also

- `models.md` — concrete model id catalogue + `OTTER_MODEL`.
- `docs/mink/providers.md` — line-numbered deep dive into every provider adapter; the same code drives otter.
- `server.md` — provider holds across the HTTP server’s lifetime; one server, one provider.