mlc-ai/web-llm

A high-performance in-browser language model inference engine accelerated by WebGPU.

18.1k stars · TypeScript · Inference · Serving · Language Models
Velocity (7d): +16 ★/day · Trend: steady

WebLLM runs open-source LLMs directly in web browsers with hardware acceleration via WebGPU, requiring no server support. It provides OpenAI API compatibility for local inference, supporting features like streaming and JSON mode. The project enables privacy-preserving AI assistants while leveraging browser-based GPU acceleration, and serves as a companion to MLC LLM for universal model deployment.
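To make the "OpenAI API compatibility" and streaming points concrete, here is a minimal sketch of consuming an OpenAI-style streaming chat completion. The interfaces and the names `StreamingChatEngine`, `appendDelta`, `streamReply`, and `createMockEngine` are assumptions made for illustration, not WebLLM's exported types. The mock engine stands in for a real one so the sketch runs without WebGPU or a model download. In the browser, the engine would presumably come from the library itself, for example via `CreateMLCEngine` in `@mlc-ai/web-llm`.

```typescript
// Minimal structural types mirroring the OpenAI chat-completions streaming shape.
// These are illustrative assumptions, not WebLLM's own type definitions.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatChunk {
  choices: Array<{ delta: { content?: string | null } }>;
}

interface StreamingChatEngine {
  chat: {
    completions: {
      create(request: {
        messages: ChatMessage[];
        stream: true;
      }): Promise<AsyncIterable<ChatChunk>>;
    };
  };
}

// Pure helper: fold one streamed chunk into the text accumulated so far.
// Chunks with no content (null or missing) leave the text unchanged.
function appendDelta(soFar: string, chunk: ChatChunk): string {
  return soFar + (chunk.choices[0]?.delta.content ?? "");
}

// Request a streamed reply and rebuild the full text, calling the optional
// onText callback after every chunk so a UI could render partial output.
async function streamReply(
  engine: StreamingChatEngine,
  messages: ChatMessage[],
  onText?: (partial: string) => void,
): Promise<string> {
  const chunks = await engine.chat.completions.create({ messages, stream: true });
  let reply = "";
  for await (const chunk of chunks) {
    reply = appendDelta(reply, chunk);
    onText?.(reply);
  }
  return reply;
}

// Hypothetical stand-in engine: it ignores the request and streams the given
// pieces one chunk at a time, imitating token-by-token generation.
function createMockEngine(pieces: string[]): StreamingChatEngine {
  return {
    chat: {
      completions: {
        async create() {
          async function* generate(): AsyncGenerator<ChatChunk> {
            for (const piece of pieces) {
              yield { choices: [{ delta: { content: piece } }] };
            }
          }
          return generate();
        },
      },
    },
  };
}
```

Calling `streamReply(createMockEngine(["Hello", ", ", "world"]), [{ role: "user", content: "Hi" }])` resolves to the concatenated pieces. Because the helper depends only on the request and chunk shape, the same function could be pointed at any engine or client that follows the OpenAI streaming convention.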