FerreroJeremy/ln2sql

Talk to your database like it's a person, get SQL back

A Python tool that parses natural language questions and turns them into executable SQL using only a database dump—no live connection required.

521 stars · Python · Language Models · Data Tooling
Velocity (7d): +0.1 ★/day · Trend: steady

What it does

ln2sql takes a SQL dump file and a question in English (or French, or any language you configure), such as “Count how many city there are with the name blob?”, and outputs a valid SQL statement plus a structured JSON representation of the query. It learns the schema by parsing the dump, so it never needs to touch a running database.
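To make "learning the schema from a dump" concrete, here is a minimal sketch of the idea. It is not ln2sql's actual parser: the dump text, the regular expressions, and the `read_schema` helper are assumptions for illustration, and they only cover simple backtick-quoted `CREATE TABLE` statements.

```python
import re

# Hypothetical dump text; real dumps carry more statements and options.
DUMP = """
CREATE TABLE `city` (
  `id` int(11) NOT NULL,
  `cityName` varchar(50) NOT NULL,
  PRIMARY KEY (`id`)
);
CREATE TABLE `emp` (
  `id` int(11) NOT NULL,
  `name` varchar(50) NOT NULL,
  `cityId` int(11) NOT NULL,
  PRIMARY KEY (`id`)
);
"""

# Table name plus everything between the outer parentheses of the statement.
CREATE_RE = re.compile(r"CREATE TABLE\s+`?(\w+)`?\s*\((.*?)\)\s*;", re.S | re.I)
# A column definition line starts with a backtick-quoted name followed by a type.
COLUMN_RE = re.compile(r"^\s*`(\w+)`\s+\w+", re.M)


def read_schema(dump_text):
    """Return {table_name: [column_name, ...]} from CREATE TABLE statements."""
    schema = {}
    for table, body in CREATE_RE.findall(dump_text):
        schema[table] = COLUMN_RE.findall(body)
    return schema


schema = read_schema(DUMP)
print(schema)
```

With table and column names in hand, a question parser can match words like "city" against the schema without ever opening a database connection.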

The interesting bit

The original research version used TreeTagger for part-of-speech tagging and linguistic analysis; this open rewrite trades that for genericity. You supply CSV language files and a thesaurus for synonyms, so you can in theory support any language, but you also take on the manual work of mapping “students” back to “student.” It’s a deliberate speed-versus-accuracy tradeoff that the authors own up to.
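The manual mapping burden is easier to see in code. The sketch below assumes a hand-written thesaurus held in a plain dictionary; the entries, the `normalize` helper, and the tokenization are hypothetical and do not reflect ln2sql's own thesaurus format.

```python
import re

# Hypothetical hand-written thesaurus: surface form -> schema name.
THESAURUS = {
    "students": "student",
    "pupil": "student",
    "pupils": "student",
    "town": "city",
    "towns": "city",
}


def normalize(question, thesaurus):
    """Lowercase the question, split it into word tokens, and map known variants."""
    tokens = re.findall(r"\w+", question.lower())
    return [thesaurus.get(tok, tok) for tok in tokens]


print(normalize("How many students live in each town?", THESAURUS))
# "cities" has no entry, so it passes through unchanged and would not match a "city" table.
print(normalize("List all cities", THESAURUS))
```

Every plural or synonym that lacks an entry slips through untouched, which is the cost of dropping automatic lemmatization.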

Key highlights

  • Supports SELECT, JOIN, WHERE, ORDER BY, GROUP BY, and aggregate functions (COUNT, SUM, AVG, MIN, MAX)
  • Outputs both raw SQL and a JSON query structure for further processing
  • Includes a basic GUI (ln2sql_gui.py) alongside the CLI wrapper
  • Multi-threaded implementation with a grammar-based parser
  • Language-agnostic via configurable keyword CSV files
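As a rough illustration of how a keyword file, a structured query, and the final SQL can fit together, consider the sketch below. The `role,phrase` CSV layout, the dictionary shape, and the helpers `load_keywords`, `build_query`, and `to_sql` are all assumptions made for this example; ln2sql's real language files and JSON output may look different.

```python
import csv
import io
import json

# Hypothetical keyword file layout: one "role,phrase" pair per row.
KEYWORDS_CSV = """role,phrase
count,how many
avg,average
max,highest
"""


def load_keywords(csv_text):
    """Return {phrase: role} from the keyword CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["phrase"]: row["role"] for row in reader}


def build_query(question, table, keywords, column="*"):
    """Pick the first keyword phrase found in the question and build a query dict."""
    text = question.lower()
    aggregate = next(
        (role for phrase, role in keywords.items() if phrase in text), None
    )
    return {"select": {"aggregate": aggregate, "column": column}, "from": table}


def to_sql(query):
    """Render the query dict as a SQL string."""
    sel = query["select"]
    if sel["aggregate"]:
        target = f"{sel['aggregate'].upper()}({sel['column']})"
    else:
        target = sel["column"]
    return f"SELECT {target} FROM {query['from']};"


keywords = load_keywords(KEYWORDS_CSV)
query = build_query("How many city are there?", "city", keywords)
print(to_sql(query))
print(json.dumps(query, indent=2))
```

Swapping in a French keyword file (for example, a `count,combien` row) would change which phrases trigger `COUNT` without touching the code, which is the sense in which a CSV-driven design is language-agnostic.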

Caveats

  • Value detection and the BETWEEN operator are explicitly marked “not 100% efficient”
  • No automatic lemmatization: plural/synonym mismatches require manual thesaurus entries
  • The README notes this is “not the state-of-the-art tool for copyright reasons” and calls itself a “quick & dirty Python wrapper”

Verdict

Worth a look if you need a lightweight, offline NL-to-SQL prototype or want to experiment with grammar-based parsing without database dependencies. Skip it if you need production-grade semantic understanding or automatic handling of word variations.
