# Use DeepSeek-V4
DeepSeek-V4 is wired into Chimera’s model catalog with three transports:
| Model id | Transport | Endpoint |
|---|---|---|
| `deepseek-v4` | OpenAI-compatible | `https://api.deepseek.com/v1` |
| `deepseek-v4-pro` | OpenAI-compatible | `https://api.deepseek.com/v1` |
| `deepseek-v4-pro:cloud` | Ollama | `$OLLAMA_HOST` (cloud passthrough) |
The bare ids hit DeepSeek’s hosted OpenAI-compatible endpoint. The `:cloud`-tagged id goes through Ollama’s cloud passthrough (the same model you would reach with `ollama run deepseek-v4-pro:cloud`), which is useful when you already have an Ollama sign-in and prefer the unified endpoint. You can also pull the model into a local Ollama daemon for fully offline use.
Context window: 128k tokens for every variant.
## Prerequisites

Pick one of the three paths below.
### Direct API

- A DeepSeek API key — api-docs.deepseek.com

```sh
export DEEPSEEK_API_KEY=sk-...
```
### Ollama cloud

- Ollama 0.6+ — ollama.com/download
- An Ollama account:

```sh
ollama signin
export OLLAMA_HOST=https://ollama.com  # the default
```
### Local Ollama

- Ollama 0.6+
- Sufficient RAM and storage to host the model locally — run `ollama pull deepseek-v4-pro` first
All three paths require Chimera installed:
```sh
pip install chimera-run
```

## Quickstart: direct API

```python
import os
from chimera.providers.factory import create_provider

os.environ["DEEPSEEK_API_KEY"] = "sk-..."

provider = create_provider(model="deepseek-v4-pro")
response = provider.complete("Write a Python one-liner that reverses a list.")
print(response.content)
```

The factory resolves `deepseek-v4` / `deepseek-v4-pro` / `deepseek-chat` / `deepseek-reasoner` to the OpenAI-compatible provider pointed at `https://api.deepseek.com/v1` with the `DEEPSEEK_API_KEY` env var.
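Because all four ids land on the same provider type, swapping models is a one-line change. A quick usage sketch, reusing only `create_provider` from the example above (the printed class name depends on Chimera internals):

```python
# Each of these ids resolves to the OpenAI-compatible provider; only the
# model string changes. Purely illustrative.
for model_id in ("deepseek-v4", "deepseek-chat", "deepseek-reasoner"):
    provider = create_provider(model=model_id)
    print(model_id, "->", type(provider).__name__)
```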
## Quickstart: Ollama cloud passthrough

```sh
ollama signin
export OLLAMA_HOST=https://ollama.com
```

```python
from chimera.providers.factory import create_provider

provider = create_provider(model="deepseek-v4-pro:cloud")
response = provider.complete("Refactor this function for clarity.")
print(response.content)
```

The `:cloud` suffix forces the Ollama transport regardless of any prefix matching elsewhere in the catalog. The factory inspects the trailing `:cloud` (and friends like `glm-5.1:cloud`, `kimi-k2.6:cloud`) to short-circuit to the Ollama provider.
## Quickstart: local Ollama

```sh
ollama pull deepseek-v4-pro
ollama serve  # if not already running on :11434
```

```python
import os
from chimera.providers.factory import create_provider

os.environ["OLLAMA_HOST"] = "http://localhost:11434"

provider = create_provider(model="deepseek-v4-pro")
# When OLLAMA_HOST points at a local daemon, the factory will route through
# the local Ollama transport instead of the public DeepSeek API.
response = provider.complete("Implement a binary search.")
print(response.content)
```

For fully offline operation, drop `DEEPSEEK_API_KEY` from your environment so the factory cannot accidentally fall back to the hosted API.
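One defensive pattern for offline runs, using only the calls shown above:

```python
import os
from chimera.providers.factory import create_provider

# Make sure the hosted key can't leak into the process before the factory runs.
os.environ.pop("DEEPSEEK_API_KEY", None)
os.environ["OLLAMA_HOST"] = "http://localhost:11434"

# Per the routing behavior described above, this should now stay local.
provider = create_provider(model="deepseek-v4-pro")
```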
## Driving a coding agent

```python
import asyncio
from chimera.assembly.coding_agent import CodingAgent

agent = CodingAgent(model="deepseek-v4-pro")

async def main():
    result = await agent.arun(
        "Add a CLI flag --json to scripts/format_report.py and write a test."
    )
    print(result.output)

asyncio.run(main())
```

`CodingAgent` wires up the default tools, a ReAct loop, and the right environment for the workspace. The `model=` kwarg flows through `create_provider`.
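The agent object can be reused across tasks. A sketch building on the example above (only `arun` is confirmed by this page; the task strings are illustrative):

```python
async def run_batch():
    # Run several tasks sequentially against the same agent instance.
    for task in (
        "Add type hints to scripts/format_report.py.",
        "Write a regression test for the --json flag.",
    ):
        result = await agent.arun(task)
        print(result.output)

asyncio.run(run_batch())
```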
## CLI examples

```sh
# Synthesize against the direct API
chimera synthesize "Build a calculator REST API" --tests tests/ --model deepseek-v4-pro

# Eval HumanEval against the cloud passthrough
chimera eval --benchmark humaneval --model deepseek-v4-pro:cloud --limit 10

# Otter REPL against the direct API
otter chat --model deepseek-v4

# Code REPL with model cycling: try DeepSeek-V4 first, then fall back to GLM-5
chimera code --models deepseek-v4-pro,glm-5
```

## Pricing notes
The cost catalog ships placeholder pricing copied from `deepseek-reasoner` ($0.55 input / $2.19 output per Mtok) for every V4 variant. DeepSeek had not published a V4 rate card at the time of writing — refresh `chimera/providers/cost.py` once the official pricing lands.
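At those placeholder rates, a rough per-request cost check is plain arithmetic (no Chimera APIs involved):

```python
# Back-of-the-envelope cost at the placeholder rates quoted above:
# $0.55 per 1M input tokens, $2.19 per 1M output tokens.
IN_RATE, OUT_RATE = 0.55, 2.19  # USD per Mtok

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * IN_RATE + output_tokens / 1e6 * OUT_RATE

print(f"${estimate_cost(120_000, 8_000):.4f}")  # -> $0.0835
```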
## Troubleshooting

- `No API key found for deepseek-v4-pro` — set `DEEPSEEK_API_KEY` (or use the `:cloud` variant for Ollama).
- `OLLAMA_HOST is unreachable` — start `ollama serve`, or set `OLLAMA_HOST=https://ollama.com` for cloud.
- `Model 'deepseek-v4-foo' not in catalog` — only `deepseek-v4`, `deepseek-v4-pro`, and `deepseek-v4-pro:cloud` are wired. Custom variants need a `ModelConfig` registered against `chimera.providers.catalog`.
- Slow context-window saturation — every V4 variant ships with 128k context. If you’re hitting it on long sessions, enable `chimera.compaction.thresholds.ThresholdCompaction` in your `LoopConfig`, as sketched below.
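A hedged sketch of what that wiring might look like. The `ThresholdCompaction` import path comes from the item above; the `LoopConfig` import path and both constructor signatures are assumptions, so check your Chimera version:

```python
# Sketch only: the LoopConfig import path and all constructor arguments are
# assumptions, not documented API.
from chimera.compaction.thresholds import ThresholdCompaction
from chimera.loop import LoopConfig  # hypothetical location

config = LoopConfig(
    compaction=ThresholdCompaction(),  # compact history as the 128k window fills
)
```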
## See also

- Use with Ollama — broader Ollama setup, including local Qwen and other cloud models.
- Use with third-party providers — the general OpenAI-compatible recipe DeepSeek follows.