Is sketch open source?

Yes — approximatelabs/sketch is open source, released under the MIT license.

What language is sketch written in?

approximatelabs/sketch is primarily written in Python.

How popular is sketch?

approximatelabs/sketch has 2.3k stars on GitHub.

Where can I find sketch?

approximatelabs/sketch is on GitHub at https://github.com/approximatelabs/sketch.

approximatelabs/sketch

Pandas copilot that actually reads your data first

Sketch feeds column summaries into LLMs so its code suggestions know what they're working with.

★ 2.3k stars · Python · Coding Assistants · LLMOps · Eval

What it does

Sketch is a Python library that bolts a .sketch accessor onto any pandas DataFrame. You can ask natural-language questions about your data (df.sketch.ask), request generated code snippets (df.sketch.howto), or even run LLM-powered transformations row-by-row (df.sketch.apply). No IDE plugin required—just import sketch and go.
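
A minimal sketch of the three entry points in use, assuming `pip install sketch` and the default hosted endpoint. The `explore` helper, the prompts, and the `company` column are made up for illustration, and the `{{ column }}` placeholder syntax passed to `apply` is recalled from the project README rather than taken from this page, so check it against the current docs.

```python
def explore(df):
    """Call the three Sketch entry points on a pandas DataFrame.

    Hypothetical helper: assumes the sketch package is installed and that
    the default hosted endpoint is reachable from this machine.
    """
    import sketch  # noqa: F401  (importing registers the .sketch accessor)

    # Natural-language question about the data.
    df.sketch.ask("Which columns look like identifiers?")

    # Request a code snippet for a task.
    df.sketch.howto("Count rows per region")

    # Row-by-row LLM call; the template syntax and the `company` column
    # are assumptions, and this mode needs an OpenAI API key.
    return df.sketch.apply("Normalize this company name: {{ company }}")
```

Wrapping the calls in a function keeps the import and the network traffic out of module load, which is handy if you want to try this in a notebook cell you can rerun.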

The interesting bit

The hook is in the name: “sketch” refers to data sketches, the approximation algorithms that summarize your columns cheaply. Rather than dumping the whole DataFrame into the prompt (expensive, slow, and a privacy nightmare), Sketch compresses the schema and column statistics into context the LLM can actually use. It’s a pragmatic compression layer between your data and a language model that would otherwise work blind.
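
To make the idea concrete, here is a toy version of "summarize the columns, send the summary" in plain Python. This is not Sketch's actual summary format or code: `summarize_column`, `build_context`, and the sample table are invented for illustration, and the counts are exact where a real data sketch would use approximate structures to stay cheap on large tables.

```python
from collections import Counter


def summarize_column(name, values, top_k=3):
    """Describe one column with a few statistics instead of its rows."""
    present = [v for v in values if v is not None]
    summary = {
        "name": name,
        "type": type(present[0]).__name__ if present else "unknown",
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),  # exact here; a real sketch would approximate
    }
    if present and all(isinstance(v, (int, float)) for v in present):
        summary["min"], summary["max"] = min(present), max(present)
    else:
        summary["top"] = [v for v, _ in Counter(present).most_common(top_k)]
    return summary


def build_context(table):
    """Render one line per column; this text, not the raw rows, goes in the prompt."""
    lines = []
    for name, values in table.items():
        s = summarize_column(name, values)
        details = ", ".join(f"{k}={v}" for k, v in s.items() if k != "name")
        lines.append(f"{s['name']}: {details}")
    return "\n".join(lines)


# Hypothetical table: column name -> list of values.
table = {
    "region": ["north", "south", "north", None],
    "revenue": [120.5, 98.0, 143.25, 110.0],
}
print(build_context(table))
```

The prompt ends up with two short lines however many rows the table has, and most individual values never leave your machine (only the min, max, and most frequent entries do).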

Key highlights

Three modes: ask for exploration, howto for code generation, apply for data generation/transforms
Runs against a hosted endpoint by default (prompts.approx.dev) for zero-config startup
Can switch to local Hugging Face models (MPT-7B, StarCoder) or your own OpenAI key via environment variables
Built on the team’s own lambdaprompt library for templated LLM calls
Explicitly targets the “glue work” of data cleaning, feature extraction, and compliance masking
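
For the backend switch mentioned above, a rough sketch of what the environment setup might look like. The variable names (`SKETCH_USE_REMOTE_LAMBDAPROMPT`, `OPENAI_API_KEY`, `LAMBDAPROMPT_BACKEND`, `HF_ACCESS_TOKEN`) are recalled from the project README, not stated on this page, so treat them as assumptions and verify them before relying on this. The two helper functions are hypothetical.

```python
import os


def configure_openai(api_key):
    """Point Sketch at your own OpenAI key instead of the hosted endpoint.

    Variable names are assumptions recalled from the README.
    """
    os.environ["SKETCH_USE_REMOTE_LAMBDAPROMPT"] = "False"
    os.environ["OPENAI_API_KEY"] = api_key


def configure_local(hf_token, backend="StarCoder"):
    """Select a local Hugging Face model as the backend.

    Variable names are assumptions recalled from the README.
    """
    os.environ["LAMBDAPROMPT_BACKEND"] = backend
    os.environ["SKETCH_USE_REMOTE_LAMBDAPROMPT"] = "False"
    os.environ["HF_ACCESS_TOKEN"] = hf_token


# Usage (placeholder token): set the environment first, then import sketch,
# on the assumption that the settings are read at import time.
# configure_local("your_hugging_face_token")
# import sketch
```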

Caveats

The apply mode requires an OpenAI API key; the free hosted endpoint won’t cover everything
Local model setup involves three environment variables and downloading weights—“usable in seconds” really means “usable in seconds if you use their cloud endpoint”
The README’s future hope of “custom made data + language foundation models” is just that: future hope

Verdict

Worth a spin if you live in pandas and want quick, context-aware code stubs without leaving your notebook. Skip it if you need deterministic, auditable data pipelines—this is exploratory acceleration, not production infrastructure.
