Azure's RAG starter kit with a security obsession
Microsoft's reference architecture for deploying enterprise RAG without leaving your VNet.

What it does GPT-RAG is a solution accelerator—Bicep templates, deployment scripts, and wiring—for running Retrieval-Augmented Generation on Azure. It stitches together Azure AI Search, Azure OpenAI, Container Apps, Cosmos DB, and friends into a network-isolated stack. You get a ChatGPT-style interface that grounds answers in your own documents.
The interesting bit
The deployment rigor is the real product. A preflight script (Invoke-PreflightChecks.ps1) validates region capacity, model quota, SKU availability, and subscription drift before ARM ever runs. Network-isolated mode forces a two-hop deploy: workstation for azd provision, jumpbox for azd deploy. The README even warns which Azure API errors it cannot catch—unusual honesty in infrastructure docs.
Key highlights
- Zero-Trust by default: VNet isolation, least-privilege service communication, private endpoints
- AI Agent extensibility: NL2SQL and other context-aware workflows beyond simple Q&A
- Remote ACR builds in isolated environments—no Docker on the jumpbox required
- Preflight checks cover ~10 resource types plus OpenAI model quota per deployment entry
- Built on a landing-zone submodule (
bicep-ptn-aiml-landing-zone) for shared AI infrastructure patterns
Caveats
- The “AI Agent” claims are aspirational; the README mentions NL2SQL but offers no implementation detail
- Deployment complexity is nontrivial: multiple bypass flags, jumpbox requirements, and a deprecated
AZURE_ZERO_TRUSTvariable to avoid - No performance or cost benchmarks provided
Verdict Worth a look if you’re an Azure shop needing a vetted, compliance-friendly RAG baseline and have the platform team to absorb the Bicep sprawl. Skip it if you want a quick local prototype or aren’t already bought into Azure’s AI service constellation.