
jankais3r/LLaMA_MPS

Enables LLaMA and Stanford-Alpaca model inference on Apple Silicon via the MPS (Metal Performance Shaders) backend.

Star velocity (7 days): +0.5 ★ / day. Trend: steady.

This repository provides a Python-based setup for running LLaMA and Stanford-Alpaca language model inference on Apple Silicon GPUs. It uses PyTorch’s MPS backend, which runs tensor operations on the GPU through Metal. The project includes scripts for downloading model weights, optionally resharding the larger models (13B/30B/65B), and conversion for the Alpaca fine-tuned variant. Users can run inference in auto-complete mode or in an instruction-response (ChatGPT-like) mode.
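As a rough illustration of what running on the MPS backend involves, the sketch below shows one way a script might pick a PyTorch device on Apple Silicon. It is not taken from the repository: the helper names `choose_device_name` and `detect_device_name` are hypothetical, and the fallback order (MPS, then CUDA, then CPU) is an assumption. The only PyTorch calls used are `torch.backends.mps.is_available()` and `torch.cuda.is_available()`.

```python
import importlib.util


def choose_device_name(mps_available: bool, cuda_available: bool = False) -> str:
    """Hypothetical helper: prefer MPS, then CUDA, then CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device_name() -> str:
    """Probe the local PyTorch install, falling back to CPU if PyTorch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # assumption: no PyTorch means nothing to accelerate
    import torch

    # torch.backends.mps is missing in older PyTorch releases, so guard the lookup.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return choose_device_name(mps_ok, torch.cuda.is_available())


if __name__ == "__main__":
    name = detect_device_name()
    print(f"Selected device: {name}")
    # With PyTorch installed, tensors and modules would then be moved with
    # torch.device(name), e.g. model.to(torch.device(name)).
```

On an Apple Silicon machine with a recent PyTorch build this should select `mps`; elsewhere it falls back to `cuda` or `cpu`. Separating the decision from the probing keeps the choice testable without a GPU.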
