ByteDance's agent harness wants to do your multi-hour homework
DeerFlow 2.0 orchestrates sub-agents, sandboxes, and memory to run research and coding tasks that take minutes to hours.

What it does
DeerFlow is a “super agent harness” — a Python/Node.js framework that chains together sub-agents, long-term memory, sandboxed execution, and extensible skills to tackle long-horizon tasks. Think deep research, code generation, or report writing that needs to run unsupervised for hours, not seconds. It grew out of a deep-research tool and was rewritten from scratch for 2.0.
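To make the "harness" idea concrete, here is a toy sketch of the general pattern: a driver passes one task through a chain of sub-agents that share a memory store. Every name in it (`run_harness`, `researcher`, `writer`) is hypothetical and chosen for illustration; this is not DeerFlow's API, and a real harness would call models and tools where these functions return strings.

```python
from typing import Callable, Optional

# A sub-agent takes the task and the shared memory and returns its output.
# Hypothetical type for illustration only; not taken from DeerFlow.
SubAgent = Callable[[str, dict], str]


def researcher(task: str, memory: dict) -> str:
    """Stand-in for a research sub-agent: records a note in shared memory."""
    note = f"notes on {task}"
    memory.setdefault("notes", []).append(note)
    return note


def writer(task: str, memory: dict) -> str:
    """Stand-in for a writing sub-agent: reads what earlier agents stored."""
    notes = memory.get("notes", [])
    return f"report on {task} using {len(notes)} note(s)"


def run_harness(task: str, agents: list, memory: Optional[dict] = None):
    """Run each sub-agent in order; return the last output and the memory."""
    memory = {} if memory is None else memory
    result = ""
    for agent in agents:
        result = agent(task, memory)
    return result, memory


if __name__ == "__main__":
    report, state = run_harness("agent frameworks", [researcher, writer])
    print(report)
```

The point of the sketch is the shared `memory` dict: later steps depend on state left by earlier ones, which is what separates a long-horizon harness from a single LLM call.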
The interesting bit
The project treats setup as an agentic problem too: you can hand a one-line prompt to Claude Code or Cursor and have that agent bootstrap DeerFlow for you. The README also publishes concrete deployment sizing tables (4 vCPU/8 GB for local evaluation, 16 vCPU/32 GB for serious shared use), which is rarer than it should be in this space.
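The sizing figures quoted above are easy to turn into a pre-flight check. The sketch below is a minimal illustration: the thresholds come from the README figures cited in this piece, while the tier names and the `classify_host` helper are made up here and are not part of DeerFlow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    min_vcpus: int
    min_ram_gb: int


# Thresholds as quoted from the README; tier names are hypothetical labels.
# Ordered from most to least demanding so the first match is the best fit.
TIERS = [
    Tier("shared", 16, 32),
    Tier("local-eval", 4, 8),
]


def classify_host(vcpus: int, ram_gb: float) -> str:
    """Return the most demanding tier the host satisfies, else 'below-minimum'."""
    for tier in TIERS:
        if vcpus >= tier.min_vcpus and ram_gb >= tier.min_ram_gb:
            return tier.name
    return "below-minimum"


if __name__ == "__main__":
    for spec in [(2, 4), (4, 8), (16, 32)]:
        print(spec, classify_host(*spec))
```

Both dimensions must clear the bar: a 16 vCPU machine with 8 GB of RAM still only counts as a local-evaluation host.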
Key highlights
- Extensive model support: OpenAI, Anthropic, OpenRouter, vLLM, plus CLI-backed providers like Codex CLI and Claude Code OAuth, all configured via YAML.
- Sandboxed execution: Docker-based sandboxes, with an optional provisioner mode, for isolated tool use and file system access.
- Memory and context engineering: Built-in long-term memory and context management across multi-step agent chains.
- Observability hooks: Ready for LangSmith and Langfuse tracing out of the box.
- ByteDance integration: Bundles InfoQuest search/crawling and promotes Doubao-Seed-2.0-Code, DeepSeek v3.2, and Kimi 2.5 as recommended models.
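For readers unfamiliar with what Docker-based sandboxing buys you, the sketch below builds a locked-down `docker run` command for an untrusted script. It shows the general technique only and is not DeerFlow's sandbox implementation. The helper name, image, mount point, and resource limits are assumptions; the Docker flags themselves are standard.

```python
import os
import subprocess


def docker_sandbox_cmd(workdir: str, script: str,
                       image: str = "python:3.12-slim") -> list:
    """Build a docker command that runs `script` from `workdir` in isolation."""
    host_dir = os.path.abspath(workdir)  # docker -v needs an absolute host path
    return [
        "docker", "run", "--rm",        # remove the container when it exits
        "--network", "none",            # no network access from inside
        "--memory", "512m",             # cap RAM (illustrative limit)
        "--cpus", "1",                  # cap CPU (illustrative limit)
        "--read-only",                  # read-only root filesystem
        "-v", f"{host_dir}:/workspace", # only this directory is writable
        "-w", "/workspace",
        image, "python", script,
    ]


def run_sandboxed(workdir: str, script: str, timeout_s: int = 60):
    """Execute the command; requires a local Docker daemon."""
    return subprocess.run(docker_sandbox_cmd(workdir, script),
                          capture_output=True, text=True, timeout=timeout_s)
```

Even with these flags, anything the script writes under the mounted directory lands on the host, which is why deployment choices still matter for a tool that runs generated code.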
Caveats
- Security notice is prominent: The README warns that “improper deployment may introduce security risks.” Sandboxed or not, this executes arbitrary code and writes files.
- Resource hungry: Even the minimum local setup wants 4 vCPU and 8 GB RAM; 2 vCPU/4 GB “is usually not enough.”
- v2 is a clean break: No code shared with 1.x; if you were using the earlier deep-research branch, you’re effectively migrating.
Verdict
Worth a look if you’re building autonomous research or coding pipelines that need to run for hours and manage their own state. Skip it if you just need a quick LLM wrapper — this is infrastructure, not a one-liner API call.