Talk to your database like it's a person, get SQL back
A Python tool that parses natural language questions and turns them into executable SQL using only a database dump—no live connection required.

What it does
ln2sql takes a SQL dump file and an English (or French, or anything you configure) question like “Count how many city there are with the name blob?” and outputs a valid SQL statement plus a structured JSON representation of the query. It learns the schema by parsing the dump, so it never needs to touch a running database.
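To make the "learns the schema from the dump" idea concrete, here is a minimal sketch of pulling table and column names out of `CREATE TABLE` statements with regular expressions. It is not ln2sql's actual parser: the sample dump, the `extract_schema` function, and the regexes are all assumptions for illustration, and they only handle backtick-quoted MySQL-style column definitions.

```python
import re

# Hypothetical dump contents, for illustration only.
DUMP = """
CREATE TABLE `city` (
  `id` int(11) NOT NULL,
  `cityName` varchar(50) NOT NULL,
  PRIMARY KEY (`id`)
);

CREATE TABLE `student` (
  `id` int(11) NOT NULL,
  `name` varchar(50) NOT NULL,
  `city_id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
);
"""

# Captures the table name and everything between the outer parentheses.
CREATE_TABLE = re.compile(
    r"CREATE TABLE\s+`?(\w+)`?\s*\((.*?)\)\s*;",
    re.IGNORECASE | re.DOTALL,
)
# A column definition is a line starting with a backtick-quoted name and a type.
COLUMN = re.compile(r"^\s*`(\w+)`\s+(\w+)", re.MULTILINE)


def extract_schema(dump_text):
    """Return {table_name: [column_name, ...]} for each CREATE TABLE found."""
    schema = {}
    for table, body in CREATE_TABLE.findall(dump_text):
        schema[table] = [name for name, _type in COLUMN.findall(body)]
    return schema


print(extract_schema(DUMP))
# {'city': ['id', 'cityName'], 'student': ['id', 'name', 'city_id']}
```

Constraint lines such as `PRIMARY KEY` are skipped because they do not start with a backtick-quoted name. A real dump parser would also need to handle foreign keys, comments, and other quoting styles.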
The interesting bit
The original research version used TreeTagger for POS tagging and linguistic smarts; this open rewrite trades that for genericity. You supply CSV language files and a thesaurus for synonyms, which means you can theoretically support any language, but you also manually shoulder the burden of mapping “students” back to “student.” It’s a deliberate speed-versus-accuracy tradeoff the authors own up to.
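The manual "students" to "student" mapping can be pictured as a plain lookup table. The sketch below assumes an in-memory dictionary and a hypothetical `normalize` helper; ln2sql's real thesaurus file format is not shown here, so treat the structure as illustrative only.

```python
import re

# Hypothetical thesaurus: each surface form maps to the schema word it stands for.
THESAURUS = {
    "students": "student",
    "pupil": "student",
    "pupils": "student",
    "cities": "city",
    "town": "city",
}


def normalize(question, thesaurus=THESAURUS):
    """Lowercase the question, split it into words, and replace known variants."""
    tokens = re.findall(r"\w+", question.lower())
    return [thesaurus.get(token, token) for token in tokens]


print(normalize("How many pupils live in each town?"))
# ['how', 'many', 'student', 'live', 'in', 'each', 'city']
```

Anything missing from the table passes through unchanged, which is exactly the cost of skipping automatic lemmatization: an unlisted plural simply will not match a table or column name.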
Key highlights
- Supports SELECT, JOIN, WHERE, ORDER BY, GROUP BY, and aggregate functions (COUNT, SUM, AVG, MIN, MAX)
- Outputs both raw SQL and a JSON query structure for further processing
- Includes a basic GUI (ln2sql_gui.py) alongside the CLI wrapper
- Multi-threaded implementation with a grammar-based parser
- Language-agnostic via configurable keyword CSV files
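As a rough picture of how keyword CSV files can make the parser language-agnostic, the sketch below loads one keyword table per language and uses it to spot an aggregate function in a question. The CSV layout (operation first, trigger phrases after), the sample rows, and the `load_keywords` and `detect_aggregate` helpers are assumptions, not ln2sql's actual file format or API.

```python
import csv
import io

# Hypothetical keyword files; the column layout is an assumption.
ENGLISH_CSV = """count,how many,number of
avg,average,mean
max,maximum,highest
"""
FRENCH_CSV = """count,combien,nombre de
avg,moyenne
max,maximum,le plus grand
"""


def load_keywords(csv_text):
    """Map each operation name to the list of phrases that trigger it."""
    keywords = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        operation, *phrases = [cell.strip().lower() for cell in row]
        keywords[operation] = phrases
    return keywords


def detect_aggregate(question, keywords):
    """Return the first aggregate whose trigger phrase appears in the question."""
    lowered = question.lower()
    for operation, phrases in keywords.items():
        if any(phrase in lowered for phrase in phrases):
            return operation.upper()
    return None


print(detect_aggregate("How many students are there?", load_keywords(ENGLISH_CSV)))
# COUNT
print(detect_aggregate("Combien d'élèves y a-t-il ?", load_keywords(FRENCH_CSV)))
# COUNT
```

Swapping languages means swapping the CSV, with no code changes. The substring matching here is deliberately naive (it would match "mean" inside "meaning"), so a grammar-based parser would need token-level matching instead.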
Caveats
- Value detection and the BETWEEN operator are explicitly marked “not 100% efficient”
- No automatic lemmatization: plural/synonym mismatches require manual thesaurus entries
- The README notes this is “not the state-of-the-art tool for copyright reasons” and calls itself a “quick & dirty Python wrapper”
Verdict
Worth a look if you need a lightweight, offline NL-to-SQL prototype or want to experiment with grammar-based parsing without database dependencies. Skip it if you need production-grade semantic understanding or automatic handling of word variations.