wangzhaode/mnn-llm
A C++ LLM deployment framework, built on Alibaba's MNN inference engine, that supports multiple open-source models.

Velocity (7d): +1.4 ★/day · Trend: steady
This project provides an inference runtime for large language models built on MNN, a deep learning framework optimized for cross-platform deployment. It supports deploying models including ChatGLM-6B, Baichuan2-7B, Qwen-7B, and CodeGeeX2 on mobile devices (Android/iOS) and desktop platforms (Windows/macOS/Linux), with GPU acceleration via CUDA and OpenCL. The repository includes CLI demos, web demos, Android Studio projects, and Python bindings.