bytedance/deer-flow

ByteDance's agent harness wants to do your multi-hour homework

DeerFlow 2.0 orchestrates sub-agents, sandboxes, and memory to run research and coding tasks that take minutes to hours.

70.7k stars · Python · Agents
Velocity (7d): +178 ★/day · Trend: steady

What it does

DeerFlow is a “super agent harness” — a Python/Node.js framework that chains together sub-agents, long-term memory, sandboxed execution, and extensible skills to tackle long-horizon tasks. Think deep research, code generation, or report writing that needs to run unsupervised for hours, not seconds. It grew out of a deep-research tool and was rewritten from scratch for 2.0.

The interesting bit

The project treats setup as an agentic problem too: you can literally hand a one-line prompt to Claude Code or Cursor and have that agent bootstrap DeerFlow for you. The README also publishes honest deployment sizing tables — 4 vCPU/8 GB for local eval, 16 vCPU/32 GB for serious shared use — which is rarer than it should be in this space.

Key highlights

  • Extensive model support: OpenAI, Anthropic, OpenRouter, vLLM, plus CLI-backed providers like Codex CLI and Claude Code OAuth, all configured via YAML.
  • Sandboxed execution: Docker-based sandboxes with optional provisioner mode for isolated tool use and file system access.
  • Memory and context engineering: Built-in long-term memory and context management across multi-step agent chains.
  • Observability hooks: Ready for LangSmith and Langfuse tracing out of the box.
  • ByteDance integration: Bundles InfoQuest search/crawling and promotes Doubao-Seed-2.0-Code, DeepSeek v3.2, and Kimi 2.5 as recommended models.

Caveats

  • Security notice is prominent: The README warns that “improper deployment may introduce security risks”; sandboxed or not, this executes arbitrary code and writes files.
  • Resource hungry: Even the minimum local setup wants 4 vCPU and 8 GB RAM; 2 vCPU/4 GB “is usually not enough.”
  • v2 is a clean break: No code shared with 1.x; if you were using the earlier deep-research branch, you’re effectively migrating.
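
To make the sizing figures quoted above concrete, here is a minimal sketch that classifies a host against them. It is not part of DeerFlow: the function name `sizing_tier` and the tier labels are hypothetical, and the thresholds are simply the numbers cited in this write-up (4 vCPU/8 GB for local eval, 16 vCPU/32 GB for shared use).

```python
# Hypothetical helper, not a DeerFlow API. Thresholds are the figures
# quoted in this write-up, checked largest tier first.
TIERS = [
    (16, 32, "shared use"),  # 16 vCPU / 32 GB
    (4, 8, "local eval"),    # 4 vCPU / 8 GB
]


def sizing_tier(vcpu: int, ram_gb: float) -> str:
    """Return the largest quoted tier the host satisfies on both CPU and RAM."""
    for min_vcpu, min_ram_gb, label in TIERS:
        if vcpu >= min_vcpu and ram_gb >= min_ram_gb:
            return label
    # Anything below 4 vCPU / 8 GB, e.g. the 2 vCPU / 4 GB case.
    return "too small"


if __name__ == "__main__":
    for host in [(2, 4), (4, 8), (16, 32)]:
        print(host, "->", sizing_tier(*host))
```

Both dimensions must meet a tier's minimum, so a 16 vCPU machine with only 8 GB of RAM still lands in the local-eval tier.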

Verdict

Worth a look if you’re building autonomous research or coding pipelines that need to run for hours and manage their own state. Skip it if you just need a quick LLM wrapper — this is infrastructure, not a one-liner API call.
