guinmoon/LLMFarm
iOS and macOS app for running various LLMs locally and offline.

Velocity (7d): +1.9 ★/day · Trend: steady
LLMFarm is a native iOS and macOS application that runs large language models locally, including LLaMA, Gemma, GPT-NeoX, RWKV, and StarCoder. It is built on the GGML library and llama.cpp for efficient inference and uses Metal for GPU acceleration on Apple silicon. The app supports various sampling methods, model parameter configuration, and saving and restoring context state, and it includes retrieval-augmented generation (RAG) capabilities.