← all repositories
Azure/GPT-RAG

Azure's RAG starter kit with a security obsession

Microsoft's reference architecture for deploying enterprise RAG without leaving your VNet.

1.2k stars Python RAG · SearchLLMOps · Eval
GPT-RAG
Velocity · 7d
+1.1
★ / day
Trend
steady
star history

What it does GPT-RAG is a solution accelerator—Bicep templates, deployment scripts, and wiring—for running Retrieval-Augmented Generation on Azure. It stitches together Azure AI Search, Azure OpenAI, Container Apps, Cosmos DB, and friends into a network-isolated stack. You get a ChatGPT-style interface that grounds answers in your own documents.

The interesting bit The deployment rigor is the real product. A preflight script (Invoke-PreflightChecks.ps1) validates region capacity, model quota, SKU availability, and subscription drift before ARM ever runs. Network-isolated mode forces a two-hop deploy: workstation for azd provision, jumpbox for azd deploy. The README even warns which Azure API errors it cannot catch—unusual honesty in infrastructure docs.

Key highlights

  • Zero-Trust by default: VNet isolation, least-privilege service communication, private endpoints
  • AI Agent extensibility: NL2SQL and other context-aware workflows beyond simple Q&A
  • Remote ACR builds in isolated environments—no Docker on the jumpbox required
  • Preflight checks cover ~10 resource types plus OpenAI model quota per deployment entry
  • Built on a landing-zone submodule (bicep-ptn-aiml-landing-zone) for shared AI infrastructure patterns

Caveats

  • The “AI Agent” claims are aspirational; the README mentions NL2SQL but offers no implementation detail
  • Deployment complexity is nontrivial: multiple bypass flags, jumpbox requirements, and a deprecated AZURE_ZERO_TRUST variable to avoid
  • No performance or cost benchmarks provided

Verdict Worth a look if you’re an Azure shop needing a vetted, compliance-friendly RAG baseline and have the platform team to absorb the Bicep sprawl. Skip it if you want a quick local prototype or aren’t already bought into Azure’s AI service constellation.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.