Language Models

underdogs breaking out

PrismML-Eng/Bonsai-demo

+57%/wk · +168 ★/day · ↗ accelerating

A demo repo for running extreme-quantized language models locally without needing a research cluster.

★ 2k Shell Inference · Serving

openJiuwen-ai/jiuwenswarm

+32%/wk · +80 ★/day · ↗ accelerating

It splits complex tasks across a team of specialized LLM agents that refine their own skills as they work.

★ 1.8k Python Agents

verl-project/verl-omni

+14%/wk · +13 ★/day · ↗ accelerating

It split off from `verl` to give diffusion, video, and omni-modality models an RL post-training framework that doesn't treat them like chatbots.

★ 651 Python ML Frameworks

google-research/tabfm

+12%/wk · +37 ★/day · ↘ cooling

TabFM exists so you can run classification and regression on messy, mixed-type tables without retraining a model on your data.

★ 2.1k Python Language Models

NVIDIA-NeMo/Automodel

+12%/wk · +13 ★/day · ↗ accelerating

NeMo AutoModel automates the busywork of wiring HuggingFace LLMs and VLMs into PyTorch-native distributed training so you can fine-tune or pretrain without hand-rolling parallelism code.

★ 770 Python ML Frameworks

ximeiorg/Xime

+12%/wk · +12 ★/day · ↗ accelerating

Xime is a deliberately minimal, Rime-based Android input method that serves as its author's personal testbed for on-device AI experiments in predictive text and speech recognition.

★ 699 Kotlin Language Models

techjarves/Uncensored-Local-Studio

+10%/wk · +10 ★/day · → steady

It unifies Stable Diffusion, GGUF chat, Whisper, and Kokoro TTS into a single offline desktop GUI so you can skip cloud APIs, subscriptions, and censorship filters.

★ 714 JavaScript Inference · Serving

OpenDCAI/DataFlex

+8.9%/wk · +22 ★/day · ↗ accelerating

DataFlex stops LLM training loops from wasting compute on static data mixes by dynamically selecting, mixing, and reweighting samples inside LLaMA-Factory.

★ 1.7k Python Language Models

AtomicBot-ai/Atomic-Chat

+8.5%/wk · +14 ★/day · ↗ accelerating

It turns your local machine into an OpenAI-compatible inference endpoint so agents and IDEs can run on offline models without reconfiguration.

★ 1.2k TypeScript Inference · Serving

MarkPDFdown/markpdfdown

+8.3%/wk · +23 ★/day · ↗ accelerating

Uses multimodal LLMs to transcribe PDFs into Markdown, preserving complex layouts that traditional extractors mangle.

★ 1.9k Python Data Tooling

xLLM-AI/xllm

+8.1%/wk · +17 ★/day · ↗ accelerating

xLLM is a C++ inference framework specifically optimized for Chinese AI accelerators, and it already powers JD.com’s core retail production workloads.

★ 1.5k C++ Inference · Serving

ForceInjection/AI-fundamentals

+7.9%/wk · +22 ★/day · ↗ accelerating

Curated technical deep-dives covering everything from NVLink signal integrity to Kubernetes GPU scheduling and Huawei NPU porting.

★ 2k HTML Learning

voocel/ainovel-cli

+7.7%/wk · +17 ★/day · ↗ accelerating

This Go CLI turns a single sentence into a full novel by making Architect, Writer, and Editor LLM agents plan, draft, and review inside a long-loop state machine—no human hand-holding required.

★ 1.5k Go Agents

chenyme/grok2api

+7.1%/wk · +68 ★/day · ↘ cooling

Turns Grok's web interface into a standard API so your existing tools just work.

★ 6.8k Go Inference · Serving

WenyuChiou/awesome-agentic-ai-zh

+6.8%/wk · +47 ★/day · ↗ accelerating

It turns the scattered firehose of agentic AI tools into an 8-stage curriculum with homework and realistic time budgets.

★ 4.8k Python Learning

openinfer-project/openinfer

+6.5%/wk · +5.4 ★/day · ↗ accelerating

Built to prove that hand-written Rust kernels, with no framework runtime, can serve frontier models without the bloat.

★ 586 Rust Inference · Serving

handy-computer/transcribe.cpp

+6.4%/wk · +14 ★/day · → steady

It exists to run a small army of speech-to-text models through a single GGUF-based ggml runtime that actually checks its math.

★ 1.6k C++ Inference · Serving

Project-N-E-K-O/N.E.K.O

+6.1%/wk · +19 ★/day · ↗ accelerating

An AI companion platform that remembers, feels, and stares at your screen—now with a Steam release and a 1000-year SSL certificate.

★ 2.2k Python Agents

youssofal/MTPLX

+5.6%/wk · +8.7 ★/day · ↗ accelerating

MTPLX squeezes extra tokens per second out of Apple Silicon by using the multi-token prediction heads that ship with modern models like Qwen 3.6, rather than leaving them idle as most runtimes do.

★ 1.1k Python Inference · Serving

pingcap/autoflow

+5.5%/wk · +23 ★/day · ↗ accelerating

AutoFlow turns documentation into a conversational knowledge graph, then lets you embed the chat window anywhere.

★ 3k TypeScript RAG · Search
