← all repositories
microsoft/data-formulator

Microsoft's AI data viz tool skips the code wall

An open-source canvas where analysts ask questions in plain English and get editable, branchable charts instead of unreadable agent output.

15.8k stars TypeScript Other AI
data-formulator
Velocity · 7d
+22
★ / day
Trend
steady
star history

What it does Data Formulator is a browser-based data exploration tool that connects to databases, warehouses, files, and even screenshots, then lets you query and visualize through natural language. It runs locally via Python (uvx data_formulator or pip) and opens at localhost:5567. The latest v0.7 adds persistent workspaces, a semantic chart engine with 30+ chart types, and report export to image or PDF.

The interesting bit The “Data Thread” keeps your exploration navigable: you can revisit earlier steps, branch into alternative analyses, and compare side by side. That’s the hard part of AI-assisted analysis — not generating one chart, but not losing your place when you iterate. The style-refinement agent also lets you polish rough charts through natural language rather than fiddling with Vega-Lite by hand.

Key highlights

  • Connectors for Superset, Kusto, Cosmos DB, MySQL, PostgreSQL, MSSQL, BigQuery, S3, Azure Blob, and a plugin system for custom sources
  • Data extraction from images, text, websites, and Excel via a loading agent
  • Sandboxed code execution with thread memory for context-aware exploration
  • Multilingual UI (English and Chinese, more via contributions)
  • Model-agnostic: OpenAI, Azure, Ollama, Anthropic through LiteLLM

Caveats

  • Requires bringing your own LLM API keys; no built-in model
  • v0.7 is fresh (May 2026), so the connector and workspace features are still settling
  • Microsoft CLA required for contributions

Verdict Worth a spin for data teams tired of pasting between Jupyter, BI tools, and ChatGPT. Not for you if you need fully governed enterprise permissions or prefer writing D3 by hand.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.