0hq/WebGPT

A vanilla JavaScript implementation of GPT transformer inference running in web browsers via WebGPU.

3.8k stars · JavaScript · Inference · Serving · Language Models
Velocity (7d): +3.3 ★/day · Trend: steady

WebGPT runs GPT language models directly in the browser, using WebGPU compute shaders for near-native GPU performance. It implements the full transformer architecture, including embeddings, multi-head attention, and feedforward layers, entirely in vanilla JavaScript. The project reports benchmark timings (ms/token) on Apple M1 hardware for models ranging from 5M to 1.5B parameters, the largest size tested. It ships with pre-converted GPT-2 117M and toy Shakespeare models, plus scripts for importing custom models.
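
To make the multi-head attention step mentioned above more concrete, here is a minimal CPU sketch of single-head causal scaled dot-product attention in plain JavaScript. It is not taken from WebGPT, which performs this work in WebGPU compute shaders; the function names `softmax` and `causalAttention` and the nested-array layout are assumptions chosen for illustration only.

```javascript
// Illustrative CPU reference only; not WebGPT's code.
// Single-head causal scaled dot-product attention on plain arrays.

// Numerically stable softmax over a 1-D array of scores.
function softmax(xs) {
  const max = Math.max(...xs);
  const exps = xs.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// q, k, v: arrays of shape [seqLen][headDim] (hypothetical layout).
// Returns an array of shape [seqLen][headDim].
function causalAttention(q, k, v) {
  const headDim = q[0].length;
  const scale = 1 / Math.sqrt(headDim);
  return q.map((qi, i) => {
    // Causal mask: token i only attends to positions 0..i.
    const scores = [];
    for (let j = 0; j <= i; j++) {
      let dot = 0;
      for (let d = 0; d < headDim; d++) dot += qi[d] * k[j][d];
      scores.push(dot * scale);
    }
    const weights = softmax(scores);
    // Weighted sum of the value vectors.
    const out = new Array(v[0].length).fill(0);
    weights.forEach((w, j) => {
      for (let d = 0; d < out.length; d++) out[d] += w * v[j][d];
    });
    return out;
  });
}
```

A full multi-head layer would run this once per head on separate learned projections of the input and concatenate the results; a GPU implementation would typically express the same arithmetic as batched matrix multiplications.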
