Is DeepAnalyze open source?

Yes — ruc-datalab/DeepAnalyze is open source, released under the MIT license.

What language is DeepAnalyze written in?

ruc-datalab/DeepAnalyze is primarily written in Python.

How popular is DeepAnalyze?

ruc-datalab/DeepAnalyze has 4.4k stars on GitHub.

Where can I find DeepAnalyze?

ruc-datalab/DeepAnalyze is on GitHub at https://github.com/ruc-datalab/DeepAnalyze.

← all repositories

ruc-datalab/DeepAnalyze

An 8B model that thinks it's a data science team

DeepAnalyze tries to automate the full data pipeline—cleaning, analysis, visualization, and report generation—without human hand-holding.

★4.4k stars Python Agents Domain Apps LLMOps · Eval

View on GitHub ↗ Homepage ↗

Not currently ranked — collecting fresh signals.

star history

What it does DeepAnalyze is an open-source 8B parameter model fine-tuned to act as an autonomous data-science agent. Feed it CSVs, Excel files, databases, JSON, or even plain text, and it attempts to run the full workflow: data prep, analysis, modeling, visualization, and final report generation. The project ships with a WebUI, a Jupyter integration, and a CLI, plus vLLM deployment scripts and quantized variants for GPUs with as little as 16GB of VRAM.

The interesting bit The authors trained on a 500K-sample dataset (also released) and explicitly target “open-ended data research” rather than single-task notebooks. The JupyterUI demo is particularly clever: the model outputs <Analyze>, <Code>, and <Execute> tags that get mapped directly to Markdown and executable cells, turning the LLM into a literal notebook author.

Key highlights

Fully open weights, training data, and inference code on Hugging Face
Quantized to 4-bit and 8-bit with FP8 KV cache; runs on consumer GPUs (16GB) up to datacenter A100s (80GB)
Multiple interfaces: browser-based WebUI (two versions), Jupyter Lab extension, and a Rich-based CLI in English or Chinese
Docker sandbox for code execution in the v2 WebUI
OpenAI-style API endpoint support added by community contributors

Caveats

The README notes the demo UIs are “initial versions” and invites further development
API keys for a hosted version require filling out a Google or Feishu form—no self-serve signup
Actual accuracy or benchmark comparisons against other data-science agents aren’t shown in the provided sources

Verdict Worth a look if you want a local, hackable alternative to closed AI data analysts. Skip it if you need proven enterprise reliability or don’t have the GPU budget to self-host.

Frequently asked

What is ruc-datalab/DeepAnalyze?: DeepAnalyze tries to automate the full data pipeline—cleaning, analysis, visualization, and report generation—without human hand-holding.
Is DeepAnalyze open source?: Yes — ruc-datalab/DeepAnalyze is open source, released under the MIT license.
What language is DeepAnalyze written in?: ruc-datalab/DeepAnalyze is primarily written in Python.
How popular is DeepAnalyze?: ruc-datalab/DeepAnalyze has 4.4k stars on GitHub.
Where can I find DeepAnalyze?: ruc-datalab/DeepAnalyze is on GitHub at https://github.com/ruc-datalab/DeepAnalyze.