salesforce/WikiSQL

The dataset that taught machines to talk to databases

A crowd-sourced benchmark for turning natural language into SQL, with a leaderboard that tracks how close we've come to replacing database admins with chatbots.

1.8k stars · HTML · Data Tooling · Language Models
Velocity (7d): +0.6 ★/day · Trend: steady

What it does

WikiSQL is a large annotated dataset for training and evaluating natural-language-to-SQL systems. It pairs English questions with SQL queries against structured tables, plus evaluation scripts and a maintained leaderboard. The repo contains the data in JSONL and SQLite formats, along with the original Seq2SQL baseline from Salesforce’s 2017 paper.
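As a rough illustration of what "line-delimited JSON" means in practice, the sketch below parses two made-up records and renders each as a readable query string. The field names (`table_id`, `question`, `sql`, `sel`, `agg`, `conds`), the operator lookup lists, and the sample table are assumptions made for this example, not something this page confirms; check the repository before relying on them.

```python
import json

# Assumed lookup tables mapping integer codes to SQL keywords (illustrative only).
AGG_OPS = ["", "MAX", "MIN", "COUNT", "SUM", "AVG"]
COND_OPS = ["=", ">", "<"]

# Hypothetical two-record sample standing in for one of the split files.
SAMPLE_JSONL = """\
{"table_id": "t1", "question": "How many players are from Duke?", "sql": {"sel": 0, "agg": 3, "conds": [[1, 0, "Duke"]]}}
{"table_id": "t1", "question": "Which player wears number 23?", "sql": {"sel": 0, "agg": 0, "conds": [[2, 0, 23]]}}
"""

# Hypothetical column headers, keyed by table id.
HEADERS = {"t1": ["Player", "School", "No."]}


def render(example, headers):
    """Turn the index-based query into a human-readable string (not executable SQL)."""
    cols = headers[example["table_id"]]
    query = example["sql"]
    column = cols[query["sel"]]
    agg = AGG_OPS[query["agg"]]
    select = f"{agg}({column})" if agg else column
    conds = [f"{cols[c]} {COND_OPS[op]} {val!r}" for c, op, val in query["conds"]]
    where = " WHERE " + " AND ".join(conds) if conds else ""
    return f"SELECT {select} FROM table{where}"


# One JSON object per non-empty line.
examples = [json.loads(line) for line in SAMPLE_JSONL.splitlines() if line.strip()]
rendered = [render(ex, HEADERS) for ex in examples]

for ex, text in zip(examples, rendered):
    print(ex["question"], "->", text)
```

Storing column and operator indices instead of raw SQL text keeps every example tied to a specific table, which is why the headers have to be looked up before a query can be rendered.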

The interesting bit

The leaderboard splits cleanly between two regimes: models trained with gold logical forms, and “weakly supervised” models that learn from question-answer pairs alone. The gap between them has narrowed over time (TAPEX hits 89.5% test execution accuracy without logical forms), but execution-guided decoding remains the dominant trick for squeezing out points.
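To make "execution accuracy" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It counts a prediction as correct when it returns the same rows as the gold query, even if the SQL text differs. The `players` table, the query pairs, and the helper names (`run`, `execution_accuracy`, `pick_executable`) are all hypothetical; this is not the repository's evaluation script. `pick_executable` is a heavily simplified, after-the-fact stand-in for execution-guided decoding, which in published systems prunes candidates during decoding rather than afterwards.

```python
import sqlite3


def run(conn, sql):
    """Return the result rows in a stable order, or None if the query fails."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match when executed."""
    hits = 0
    for predicted, gold in pairs:
        got = run(conn, predicted)
        if got is not None and got == run(conn, gold):
            hits += 1
    return hits / len(pairs)


def pick_executable(conn, candidates):
    """Keep the first candidate that executes and returns at least one row."""
    for sql in candidates:
        if run(conn, sql):
            return sql
    return candidates[0] if candidates else None


# Hypothetical toy table standing in for one benchmark table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, school TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Duke", 23), ("Bo", "Duke", 7), ("Cy", "UCLA", 23)],
)

pairs = [
    # Different SQL text, same result: correct under execution accuracy.
    ("SELECT COUNT(*) FROM players WHERE school = 'Duke'",
     "SELECT COUNT(name) FROM players WHERE school = 'Duke'"),
    # Executes, but returns the wrong rows.
    ("SELECT name FROM players WHERE number = 7",
     "SELECT name FROM players WHERE number = 23"),
    # Misspelled column: fails to execute.
    ("SELECT nme FROM players", "SELECT name FROM players"),
]
accuracy = execution_accuracy(conn, pairs)

best = pick_executable(conn, [
    "SELECT nme FROM players",                     # execution error
    "SELECT name FROM players WHERE number = 99",  # runs, but no rows
    "SELECT name FROM players WHERE number = 7",   # runs and returns a row
])
print(accuracy, best)
```

The first pair shows why execution accuracy is more forgiving than exact logical-form matching: two differently written queries can still produce the same answer.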

Key highlights

  • 80,654 hand-annotated examples across train/dev/test splits
  • Evaluation strictly enforces no table-content peeking at inference time
  • Maintained leaderboard with results from Salesforce, Microsoft, Alibaba, Ant Group, and others
  • Original tokenizer frozen in amber: deprecated Stanza dependency, Docker image provided for reproducibility
  • Data ships as both line-delimited JSON and SQLite databases

Caveats

  • Python 3 only; Python 2 support explicitly punted to “welcome a pull request”
  • Tokenizer depends on the deprecated version of Stanza (a Python wrapper for CoreNLP); the authors won’t migrate to current Stanza, to preserve reproducibility
  • README truncates before fully describing the data schema

Verdict

Essential if you’re building or benchmarking text-to-SQL models. Skip it if you need a production natural language interface: this is a research dataset, not a drop-in query engine.
