Is CrimeKgAssitant open source?

Yes — liuhuanyong/CrimeKgAssitant is an open-source project tracked on heatdrop.

What language is CrimeKgAssitant written in?

liuhuanyong/CrimeKgAssitant is primarily written in Python.

How popular is CrimeKgAssitant?

liuhuanyong/CrimeKgAssitant has 1.6k stars on GitHub.

Where can I find CrimeKgAssitant?

liuhuanyong/CrimeKgAssitant is on GitHub at https://github.com/liuhuanyong/CrimeKgAssitant.

liuhuanyong/CrimeKgAssitant

A Chinese legal AI that predicts crimes from case descriptions

Trained on 2.88 million court records, it classifies charges, sorts legal questions, and answers them—sometimes with alarming confidence.

★ 1.6k stars | Python | Domain Apps | Language Models | RAG · Search

What it does

CrimeKgAssitant is a Chinese legal NLP toolkit with three main jobs: predict which of 202 possible crimes matches a case description, classify legal questions into 13 categories (marriage, labor, traffic, etc.), and answer those questions by retrieving similar past responses. It also includes an 856-concept knowledge graph of criminal charges.

The interesting bit

The project reports roughly 92% accuracy on charge prediction using nothing fancier than document embeddings plus an SVM: no neural nets, just 2.88 million training examples and 12 hours of training. The QA system, meanwhile, is endearingly blunt: ask about selling contraband and it replies “没什么” (“nothing”); ask about finding a girlfriend and it routes you to the police.
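The embeddings-plus-SVM recipe can be sketched in a few lines. This is a minimal pure-Python illustration, not the repository's code. The word vectors, the two charge labels, and the training documents are invented toy data, with English tokens for readability. The classifier is a binary linear SVM trained by subgradient descent on the hinge loss, standing in for whatever SVM implementation and 202-class setup the project actually uses.

```python
# Toy sketch: document embedding (mean of word vectors) + linear SVM.
# All vectors, labels, and documents below are invented for illustration.

WORD_VECS = {
    "stole": [1.0, 0.1], "wallet": [0.9, 0.0], "shoplifted": [0.8, 0.2],
    "punched": [0.1, 1.0], "injured": [0.0, 0.9], "beat": [0.2, 0.8],
}
DIM = 2


def doc_embedding(tokens):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [WORD_VECS[t] for t in tokens if t in WORD_VECS]
    if not known:
        return [0.0] * DIM
    return [sum(v[i] for v in known) / len(known) for i in range(DIM)]


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit w, b by subgradient descent on the L2-regularised hinge loss.

    Labels in y must be +1 or -1.
    """
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]  # regularisation shrink
            if margin < 1:  # point violates the margin: step toward fixing it
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b


TRAIN = [
    (["stole", "wallet"], +1),      # +1 stands for the toy label "theft"
    (["shoplifted"], +1),
    (["punched", "injured"], -1),   # -1 stands for the toy label "assault"
    (["beat"], -1),
]
X = [doc_embedding(tokens) for tokens, _ in TRAIN]
y = [label for _, label in TRAIN]
W, B = train_linear_svm(X, y)


def predict_charge(tokens):
    """Return the toy charge label for a tokenised case description."""
    x = doc_embedding(tokens)
    score = sum(wi * xi for wi, xi in zip(W, x)) + B
    return "theft" if score >= 0 else "assault"
```

Calling `predict_charge(["stole", "shoplifted"])` on this toy data yields the "theft" label. Extending the idea to many charges would typically mean one such classifier per class (one-vs-rest), which is an assumption about the general approach rather than a description of the project.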

Key highlights

2.88M case records for 202-class charge prediction (SVM, 92% accuracy)
200K legal QA pairs for 13-category question classification (CNN hits 95.9% test accuracy; LSTM lags at 71.7%)
856-concept crime knowledge graph intended for structured queries
Retrieval-based QA that returns actual past answers, not generated text
All training data and dictionaries included in the repo
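Retrieval-based QA of the kind listed above can be illustrated with a short sketch: find the stored question most similar to the new one and return its stored answer verbatim. The similarity measure here (cosine over character bigrams) is an assumption chosen because it needs no tokenizer. The stored pairs are hypothetical placeholders, and the function names are invented for this example rather than taken from the repository.

```python
import math
from collections import Counter

# Hypothetical stored QA pairs; the answers are placeholders, not legal advice.
QA_PAIRS = [
    ("How do I file for divorce?", "<stored answer about divorce>"),
    ("My employer has not paid my wages", "<stored answer about unpaid wages>"),
    ("Who pays compensation after a traffic accident?",
     "<stored answer about traffic accidents>"),
]


def char_bigrams(text):
    """Count character bigrams, ignoring case and whitespace."""
    chars = "".join(text.lower().split())
    return Counter(chars[i:i + 2] for i in range(len(chars) - 1))


def cosine(a, b):
    """Cosine similarity between two bigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_answer(question, qa_pairs=QA_PAIRS):
    """Return (stored_answer, similarity) for the most similar stored question."""
    query = char_bigrams(question)
    scored = [(cosine(query, char_bigrams(q)), a) for q, a in qa_pairs]
    score, answer = max(scored, key=lambda pair: pair[0])
    return answer, score
```

Because the nearest stored question always wins, a query with no close match still gets an answer. A minimum-similarity cutoff on the returned score is one possible guard against off-topic replies.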

Caveats

The README is entirely in Chinese; code comments and variable names follow suit
QA quality varies wildly—some answers are detailed legal procedures, others are comically terse or off-topic
No model weights or pre-trained embeddings are provided; you train from scratch
The “knowledge graph” appears to be a concept list rather than a queryable graph structure in the released code

Verdict

Worth a look if you’re building Chinese legal NLP or need a baseline for charge classification. Skip it if you need production-ready legal advice or English-language support; the “assistant” part is aspirational.
