
PrismML-Eng/Bonsai-demo

Demo repository for running Bonsai 1-bit and Ternary-Bonsai 1.58-bit language models locally across CPU, Metal, CUDA, Vulkan, and ROCm backends.

Velocity (7d): +11 ★ / day · Trend: steady

This repository provides instructions and scripts for running Bonsai quantized language models locally on macOS (Metal), Linux or Windows (CUDA, Vulkan, ROCm), or CPU only. It includes pre-built llama.cpp binaries and MLX (Apple Silicon) forks with support for the Q1_0 and Q1_58 quantization formats. Model weights are distributed via Hugging Face collections, with companion web demos and Google Colab notebooks for easy experimentation.
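
As background on the "1-bit" and "1.58-bit" labels: a weight restricted to two values carries log2(2) = 1 bit of information, and a ternary weight (three values) carries log2(3) ≈ 1.585 bits. The sketch below only illustrates that arithmetic. The function names and the parameter count are hypothetical, and the result is an idealized lower bound, not the real size of any Bonsai file: the actual Q1_0 and Q1_58 layouts are not described here and may store extra data such as per-block scales.

```python
import math


def bits_per_weight(levels: int) -> float:
    """Information content, in bits, of a weight that takes one of `levels` values."""
    return math.log2(levels)


def ideal_size_gb(num_params: float, levels: int) -> float:
    """Idealized weight storage in GB (10^9 bytes), ignoring any format overhead."""
    return num_params * bits_per_weight(levels) / 8 / 1e9


binary_bits = bits_per_weight(2)   # two-valued ("1-bit") weights
ternary_bits = bits_per_weight(3)  # three-valued ("1.58-bit") weights

# Hypothetical parameter count, chosen only to make the arithmetic concrete.
HYPOTHETICAL_PARAMS = 8e9

if __name__ == "__main__":
    print(f"binary:  {binary_bits:.3f} bits/weight")
    print(f"ternary: {ternary_bits:.3f} bits/weight")
    print(f"ideal binary size:  {ideal_size_gb(HYPOTHETICAL_PARAMS, 2):.2f} GB")
    print(f"ideal ternary size: {ideal_size_gb(HYPOTHETICAL_PARAMS, 3):.2f} GB")
```

For comparison, the same hypothetical parameter count stored at 16 bits per weight would need 16 GB, which is why formats this small make running on a laptop or on CPU alone plausible.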
