
pytorch/torchchat

A PyTorch-native toolkit for running LLMs locally on servers, desktops, and mobile devices with quantization and multiple deployment options.

Star velocity (7d): +4.5 ★/day. Trend: steady.

torchchat is a codebase that demonstrates how to run large language models (LLMs) locally with PyTorch. Models can be run from Python in eager or compiled mode, through AOT Inductor for optimized server and desktop execution, and through ExecuTorch for mobile deployment on iOS and Android. Supported models include Llama 3, Llama 2, Mistral, and DeepSeek R1, along with multimodal models such as Llama 3.2 11B. Several quantization schemes are available for memory-efficient inference.
