Firecrawl open-sources its web-research stack
The hosted agent you can pay for, now forkable and model-swappable.

What it does
This repo is the open-source foundation behind Firecrawl’s hosted research agent. It gives you a layered toolkit for building autonomous web-research bots: Next.js and Express templates on top, an orchestration core in the middle, and Firecrawl’s search/scrape/interact tools at the bottom. You scaffold via CLI, swap in your own models, and deploy where you like.
The interesting bit
The architecture is deliberately stacked like a diner menu — start with a full Next.js app or peel down to raw primitives. The orchestration rides on LangChain’s Deep Agents for the plan-act loop and parallel sub-agent spawning, which saves reinventing the agent harness. Skills are just markdown files auto-discovered from disk, loaded on demand by middleware. It’s a pragmatic glue job rather than a from-scratch framework.
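To make the "skills are just markdown files" idea concrete, here is a minimal sketch of disk-based discovery with on-demand loading. It is not the repo's actual middleware. The one-directory-per-skill layout, the `SkillEntry` type, and the `discoverSkills` and `loadSkill` helpers are all assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical shape for a discovered skill; not the repo's actual type.
interface SkillEntry {
  name: string; // directory name, used as the skill id
  file: string; // path to that skill's SKILL.md
}

// Index every <root>/<skill>/SKILL.md without reading the file bodies.
// The one-directory-per-skill layout is an assumption.
function discoverSkills(root: string): Map<string, SkillEntry> {
  const skills = new Map<string, SkillEntry>();
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const file = path.join(root, entry.name, "SKILL.md");
    if (fs.existsSync(file)) {
      skills.set(entry.name, { name: entry.name, file });
    }
  }
  return skills;
}

// Read the markdown body only when a skill is actually requested.
function loadSkill(skills: Map<string, SkillEntry>, name: string): string {
  const entry = skills.get(name);
  if (!entry) throw new Error(`Unknown skill: ${name}`);
  return fs.readFileSync(entry.file, "utf8");
}

// Demo against a throwaway directory so the sketch runs anywhere.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "skills-demo-"));
fs.mkdirSync(path.join(demoRoot, "price-tracker"));
fs.writeFileSync(
  path.join(demoRoot, "price-tracker", "SKILL.md"),
  "# Price tracker\nCompare listings across retailers.\n",
);
const demoSkills = discoverSkills(demoRoot);
console.log([...demoSkills.keys()]);
console.log(loadSkill(demoSkills, "price-tracker"));
```

Splitting discovery from loading keeps startup cheap: the index holds only names and paths, and a skill's text is read when the agent asks for it.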
Key highlights
- Layered stack: Next.js template → Express template → Agent Core library → AI SDK → base SDK → REST API
- Parallel subagents: Independent workers with isolated browser sessions, spawned via Deep Agents’ task tool
- Skills as markdown: Reusable SKILL.md playbooks in agent-core/src/skills/definitions/, auto-discovered and loaded on demand
- Structured output: JSON formatting plus bashExec data processing via Vercel’s just-bash
- Streaming support: Built into the Next.js template and core examples
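The parallel-subagent pattern in the list above can be sketched in a few lines. This is a rough illustration under stated assumptions, not Deep Agents’ or Firecrawl’s API: the `Session`, `SubTask`, and `SubResult` types and the `runSubAgent` and `fanOut` functions are hypothetical, and the worker is a stub that records a URL instead of browsing.

```typescript
// Hypothetical stand-in for an isolated browser session; not a Firecrawl type.
interface Session {
  id: number;
  visited: string[];
}

interface SubTask {
  topic: string;
  url: string;
}

interface SubResult {
  topic: string;
  sessionId: number;
  pages: string[];
}

let nextSessionId = 1;

// Each worker creates its own session, so no state is shared between workers.
async function runSubAgent(task: SubTask): Promise<SubResult> {
  const session: Session = { id: nextSessionId++, visited: [] };
  // A real worker would call search/scrape tools here; this stub only waits
  // briefly and records the URL it was given.
  await new Promise((resolve) => setTimeout(resolve, 10));
  session.visited.push(task.url);
  return { topic: task.topic, sessionId: session.id, pages: session.visited };
}

// Fan out independent tasks concurrently and collect results in input order.
async function fanOut(tasks: SubTask[]): Promise<SubResult[]> {
  return Promise.all(tasks.map(runSubAgent));
}

fanOut([
  { topic: "pricing", url: "https://example.com/pricing" },
  { topic: "docs", url: "https://example.com/docs" },
]).then((results) => console.log(results));
```

`Promise.all` fails fast if any worker rejects. A harness that should tolerate partial failure could use `Promise.allSettled` instead.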
Caveats
- The hosted “Spark 1” models are proprietary; you bring your own LLM to the open-source version
- Documentation is thin — the README points outward to docs.firecrawl.dev and npm packages for most layers
- The repo’s 1,112 stars suggest early traction, but real-world durability at scale is unproven in the sources
Verdict
Worth a look if you’re building research agents and want a head start on orchestration, web tools, and deployment patterns without marrying a closed platform. Skip it if you need a fully documented, batteries-included framework with training wheels attached.