
withcatai/node-llama-cpp

Node.js bindings enabling local LLM inference via llama.cpp with Metal, CUDA, and Vulkan GPU support.

2.1k stars · TypeScript · Inference · Serving · Language Models
Velocity (7d): +2.0 ★ / day · Trend: steady

This library wraps llama.cpp to provide a complete Node.js interface for running large language models locally. It supports GPU acceleration across multiple backends (Metal, CUDA, and Vulkan), ships pre-built binaries for easy installation, and can enforce structured output formats such as JSON schemas during generation. It also includes embedding generation, function calling, and a CLI for chatting with models without writing code.
