run-llama/create-llama

LlamaIndex's create-next-app moment

A CLI that scaffolds full-stack RAG apps in either Next.js or Python FastAPI, because wiring embeddings to a chat UI is tedious.

1.5k stars · Python · App Builders, Agents, RAG · Search
Velocity (7d): +1.8 ★ / day · Trend: steady

What it does

create-llama is a CLI tool that generates a working LlamaIndex application from a few prompts. Pick a use case (Agentic RAG, Data Analysis, Report Generation), choose Next.js or Python FastAPI, and it spits out a pre-wired project with a shadcn/ui chat frontend, LlamaIndex Server backend, and a data/ folder ready for your files.

The interesting bit

The dual-stack approach is unusual: the same CLI can scaffold either a TypeScript/Vercel stack or a Python/FastAPI stack, and the Python backend can ingest video and audio files that the TS version can’t yet handle. It’s essentially create-next-app translated to the RAG era: opinionated glue that saves you from wiring OpenAI embeddings to a chat interface by hand.

Key highlights

  • Ships with pre-built use cases: Agentic RAG, Data Analysis, Report Generation
  • Frontend uses shadcn/ui components in a chat-interface layout
  • Next.js option deploys to Vercel in “a few clicks”; Python option uses FastAPI
  • Data ingestion: drop files in data/, run npm run generate (TS) or uv run generate (Python)
  • Defaults to OpenAI’s gpt-4.1 and text-embedding-3-large, but supports “dozens of other LLMs” via manual config edits
  • Non-interactive mode available with CLI flags for scripting
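
The ingestion step in the list above is easy to wrap in a script. The sketch below relies only on the two commands named above (npm run generate for the TypeScript stack, uv run generate for the Python stack). The function names, the "ts" and "python" labels, and the project path in the usage note are hypothetical and not part of create-llama.

```python
import subprocess
from pathlib import Path

# Ingestion commands as named in the highlights above. The dictionary keys
# ("ts", "python") are labels invented for this sketch, not create-llama options.
GENERATE_COMMANDS = {
    "ts": ["npm", "run", "generate"],
    "python": ["uv", "run", "generate"],
}


def generate_command(stack: str) -> list[str]:
    """Return the ingestion command for a scaffolded project of the given stack."""
    if stack not in GENERATE_COMMANDS:
        expected = sorted(GENERATE_COMMANDS)
        raise ValueError(f"unknown stack {stack!r}; expected one of {expected}")
    # Return a copy so callers cannot mutate the shared table.
    return list(GENERATE_COMMANDS[stack])


def run_generate(project_dir: str, stack: str) -> None:
    """Re-index the files in <project_dir>/data by running the generate command."""
    project = Path(project_dir)
    if not (project / "data").is_dir():
        raise FileNotFoundError(f"no data/ folder in {project}")
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(generate_command(stack), cwd=project, check=True)
```

Calling run_generate("my-app", "python") would then re-index a Python project that lives in a (hypothetical) my-app directory. It assumes uv is on the PATH.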

Caveats

  • Switching from OpenAI defaults requires manually editing settings.ts or settings.py; the --ask-models flag exists but the README doesn’t fully clarify how it interacts with manual edits
  • Adding new data files requires re-running the generate command and restarting — no live reloading mentioned
  • LlamaCloud integration is offered, and the API key prompt suggests it’s optional, but it’s unclear what breaks if you skip it
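
Since new files in data/ are only picked up after re-running the generate command, one workaround is to detect stale data yourself. The sketch below compares file modification times against a stamp file. The stamp file and every function name here are assumptions of this example, not something create-llama provides or writes.

```python
from pathlib import Path


def newest_mtime(data_dir: Path) -> float:
    """Return the latest modification time of any file under data_dir (0.0 if none)."""
    return max(
        (p.stat().st_mtime for p in data_dir.rglob("*") if p.is_file()),
        default=0.0,
    )


def needs_regenerate(data_dir: Path, stamp: Path) -> bool:
    """True if data_dir holds a file modified after the stamp file was last touched."""
    if not stamp.exists():
        # Ingestion has never been recorded, so assume the index is stale.
        return True
    return newest_mtime(data_dir) > stamp.stat().st_mtime


def mark_generated(stamp: Path) -> None:
    """Record that ingestion has just run by creating or touching the stamp file."""
    stamp.touch()
```

Keep the stamp file outside data/ (for example next to it, in the project root) so it is not counted as a data file. After a successful generate run, call mark_generated; before starting the server, call needs_regenerate to decide whether to re-run ingestion. Deleted files are not detected by this check.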

Verdict

Worth a spin if you want to prototype a LlamaIndex app without drowning in boilerplate. Skip it if you already have strong opinions about your stack: this tool is deliberately opinionated, and you’ll spend time unwinding its choices if they don’t match yours.
