Language Models

big names · picking up speed

+275 ★/day↗accelerating

Kronos recasts noisy, multi-dimensional candlestick data as hierarchical discrete tokens so an autoregressive Transformer can forecast financial markets like a language model.

★ 34.2k Python Language Models · explained

jingyaogong/minimind

+76 ★/day↗accelerating

MiniMind is an educational training ground that rebuilds every stage of a modern language model—from tokenizer to RLHF—in raw PyTorch so you can see the gears turning instead of just calling high-level APIs.

★ 53.9k Python Language Models · explained

anthropics/claude-cookbooks

+152 ★/day↗accelerating

Official Jupyter notebooks demonstrating how to wire Claude into production tasks like RAG, SQL queries, and multimodal pipelines.

★ 50.2k Jupyter Notebook Learning · explained

microsoft/graphrag

+52 ★/day↗accelerating

GraphRAG exists to give LLMs a structured memory layer for reasoning over messy, private narrative text.

★ 34.9k Python RAG · Search · explained

chatanywhere/GPT_API_free

+34 ★/day↗accelerating

A hosted proxy that offers free, rate-limited API access to GPT, DeepSeek, and others for Chinese users who'd rather not tunnel through a VPN.

★ 39.1k Inference · Serving · explained

rasbt/LLMs-from-scratch

+72 ★/day↗accelerating

It teaches how LLMs work by implementing tokenization, attention, pretraining, and finetuning in pure PyTorch, one notebook at a time.

★ 99.9k Jupyter Notebook Language Models · explained

stanford-oval/storm

+27 ★/day↗accelerating

STORM simulates expert research conversations so LLMs can write long, cited articles from scratch.

★ 30.3k Python Agents · explained

huggingface/transformers

+38 ★/day↗accelerating

It centralizes model definitions so the same architecture works across PyTorch, JAX, vLLM, and llama.cpp without rewrites.

★ 163k Python Language Models · explained

openai/openai-cookbook

+23 ★/day↗accelerating

Official Python notebooks and guides for common OpenAI API tasks.

★ 74.9k Jupyter Notebook Learning · explained

mozilla-ai/llamafile

+9.3 ★/day↗accelerating

Mozilla wraps llama.cpp and a full model into a single cross-platform executable using an obscure libc trick.

★ 25.5k C++ Inference · Serving · explained

BerriAI/litellm

+104 ★/day↗accelerating

Because swapping from GPT-4o to Claude shouldn't require rewriting your request plumbing.

★ 54.8k Python LLMOps · Eval · explained

microsoft/BitNet

+8.1 ★/day↗accelerating

Microsoft built an inference engine that lets a single CPU run a 100B-parameter model at human reading speed by using 1.58-bit weights.

★ 39.8k C++ Inference · Serving · explained

mudler/LocalAI

+32 ★/day↗accelerating

LocalAI wraps 36+ inference engines behind one OpenAI-compatible API and pulls them on demand, so you can run LLMs, vision, voice, and video on anything from a CPU to a Jetson.

★ 47.9k Go Inference · Serving · explained

deepseek-ai/awesome-deepseek-integration

+16 ★/day↗accelerating

A curated directory of software, plugins, and frameworks that integrate with the DeepSeek API, maintained by DeepSeek itself.

★ 38.4k Learning · explained

fighting41love/funNLP

+25 ★/day↗accelerating

A maintainer cataloged every Chinese NLP repo they touched into a single, obsessively categorized list so others wouldn’t have to hunt.

★ 82.1k Python Learning · explained

datawhalechina/happy-llm

+27 ★/day↗accelerating

A systematic Chinese tutorial for developers who want to stop treating LLMs as black boxes and hand-build a 215-million-parameter model from the ground up.

★ 32.4k Jupyter Notebook Learning · explained

agentscope-ai/agentscope

+39 ★/day↗accelerating

A Python framework for building production multi-agent systems that leans on LLM reasoning instead of rigid prompt choreography.

★ 28.3k Python Agents · explained

karpathy/nanoGPT

+36 ★/day↗accelerating

A rewrite of minGPT that prioritizes working, hackable training code over educational scaffolding.

★ 61.6k Python Language Models · explained

deepseek-ai/DeepSeek-V3

+10 ★/day↗accelerating

DeepSeek-V3 exists to prove that a 671-billion-parameter model can train end-to-end without a single rollback, activate only 37B parameters per token, and still match leading closed-source systems.

★ 104k Python Language Models · explained

karpathy/llm.c

+9.1 ★/day↗accelerating

Because training a transformer shouldn't require 245MB of PyTorch just to multiply matrices.

★ 30.6k Cuda Language Models · explained

loading more…