Inference · Serving

underdogs · picking up speed

lidge-jun/opencodex

+88% /wk +630 ★/day↗accelerating

It breaks the vendor lock on Codex and Claude Code by translating their API calls to any LLM backend you choose.

★ 5k TypeScript Coding Assistants · explained

PrismML-Eng/Bonsai-demo

+58% /wk +169 ★/day↗accelerating

A demo repo for running extreme-quantized language models locally without needing a research cluster.

★ 2.1k Shell Inference · Serving · explained

EvanZhouDev/openai-oauth

+53% /wk +74 ★/day↗accelerating

An unofficial proxy that borrows your ChatGPT/Codex OAuth tokens to serve a local OpenAI-compatible API, bypassing API credit billing.

★ 969 TypeScript Inference · Serving · explained

seakee/CPA-Manager-Plus

+37% /wk +119 ★/day↗accelerating

It exists to stop your AI gateway from quietly burning through quotas, cash, and expired OAuth tokens without leaving a paper trail.

★ 2.2k TypeScript LLMOps · Eval · explained

AtomicBot-ai/atomic-agent

+30% /wk +45 ★/day↗accelerating

It keeps the entire agent loop—prompts, tool calls, browser state, and memory—on your laptop so you don't have to rent a control plane in the cloud.

★ 1.1k TypeScript Agents · explained

routatic/proxy

+21% /wk +27 ★/day↗accelerating

A Go proxy that tricks Claude Code into using $5/month open models through OpenCode instead of Anthropic's API.

★ 892 Go Coding Assistants · explained

espressif/esp-claw

+18% /wk +48 ★/day↗accelerating

Espressif's C framework turns cheap microcontrollers into edge AI agents you program through IM chat.

★ 1.9k C Agents · explained

hero8152/Infinite-Canvas

+14% /wk +48 ★/day↗accelerating

One desktop UI that wires together ComfyUI, OpenAI, Gemini, ModelScope, and a dozen other generative APIs—plus some very opinionated legal terms.

★ 2.4k Python App Builders · explained

RyanCodrai/turbovec

+13% /wk +259 ★/day↗accelerating

turbovec exists so you can index embeddings immediately—no training, no tuning, no rebuilds—and search them faster than FAISS in a fraction of the RAM.

★ 14.4k Python RAG · Search · explained Feature

YGYOOO/WorldX

+12% /wk +21 ★/day↗accelerating

WorldX turns one sentence into a self-running simulation of AI agents who gossip, scheme, and remember grudges without a script.

★ 1.2k TypeScript Agents · explained

inference-labs-inc/subnet-2

+10% /wk +34 ★/day↗accelerating

A Bittensor subnet that uses zero-knowledge proofs to verify miners actually ran the AI models they claim to.

★ 2.4k Rust Inference · Serving · explained

Tencent/AngelSlim

+9.3% /wk +20 ★/day↗accelerating

AngelSlim integrates quantization, speculative decoding, and distillation so you can shrink and serve massive models from a single toolkit.

★ 1.5k Python Inference · Serving · explained

AtomicBot-ai/Atomic-Chat

+9.2% /wk +15 ★/day↗accelerating

It turns your local machine into an OpenAI-compatible inference endpoint so agents and IDEs can run on offline models without reconfiguration.

★ 1.2k TypeScript Inference · Serving · explained

kvcache-ai/ktransformers

+8.8% /wk +239 ★/day↗accelerating

KTransformers makes CPU-GPU heterogeneous inference and fine-tuning for massive MoE models almost practical on consumer hardware.

★ 19k Python Inference · Serving · explained

tuya/TuyaOpen

+7.9% /wk +20 ★/day↗accelerating

A C/C++ SDK that bundles speech recognition, multimodal AI, and cloud LLM plumbing so Wi-Fi modules and MCUs can behave like smart agents.

★ 1.8k C Agents · explained

lightningpixel/modly

+7.0% /wk +46 ★/day↗accelerating

It wraps open-source image-to-3D models in a desktop app so your snapshots never leave your GPU.

★ 4.6k TypeScript Image · Video · Audio · explained

lidge-jun/ima2-gen

+11% /wk +9.9 ★/day↗accelerating

It exists because cloud image generators deserve a local memory layer, a branching canvas, and a UI outside the chat thread.

★ 614 TypeScript Image · Video · Audio · explained

verl-project/verl-omni

+10% /wk +9.6 ★/day↗accelerating

It split off from `verl` to give diffusion, video, and omni-modality models an RL post-training framework that doesn't treat them like chatbots.

★ 661 Python ML Frameworks · explained

Bytez-com/docs

+5.6% /wk +18 ★/day↗accelerating

Bytez wraps 175,000+ AI models behind a single endpoint so you don't have to host them yourself.

★ 2.3k TypeScript Inference · Serving · explained

google-ai-edge/LiteRT

+6.4% /wk +30 ★/day↗accelerating

Google's edge ML runtime grows up, adds async NPU support, and finally admits PyTorch exists.

★ 3.2k C++ Inference · Serving · explained
