apple/ml-ane-transformers
Reference PyTorch implementation and Hugging Face integrations for optimized transformer deployment on the Apple Neural Engine.

Velocity (7d): +1.9 ★/day · Trend: steady
Provides a reference PyTorch implementation and Hugging Face-compatible model classes optimized for running transformers on the Apple Neural Engine (ANE), which ships in devices with A14, M1, or newer chips. Reports up to 10x faster inference and 14x lower peak memory consumption than baseline implementations, achieved by structuring computations to match ANE hardware constraints. Includes a tutorial for deploying DistilBERT models on iOS devices.
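One example of structuring computations to match ANE hardware constraints, as described in Apple's accompanying write-up, is keeping activations in a channels-first (B, C, 1, S) layout and expressing linear layers as 1x1 convolutions. The sketch below is illustrative only: it uses plain Python lists in place of PyTorch tensors, omits the batch dimension, and every function name and value in it is hypothetical rather than taken from the repository. It shows that the two formulations produce the same numbers, so the change affects layout, not the math.

```python
# Illustrative only: plain-Python stand-ins for tensor ops, not the repository's code.

def linear_seq_first(x, weight, bias):
    """Apply y = x @ W^T + b to a sequence-first input of shape (S, C_in)."""
    return [
        [sum(w * v for w, v in zip(row, token)) + b for row, b in zip(weight, bias)]
        for token in x
    ]

def conv1x1_channels_first(x, weight, bias):
    """Apply a 1x1 convolution to a channels-first input of shape (C_in, 1, S)."""
    seq_len = len(x[0][0])
    out = []
    for row, b in zip(weight, bias):
        # Each output channel mixes all input channels at every sequence position.
        mixed = [sum(w * x[c][0][s] for c, w in enumerate(row)) + b for s in range(seq_len)]
        out.append([mixed])
    return out

def to_channels_first(x):
    """Transpose a (S, C) nested list into (C, 1, S)."""
    return [[list(channel)] for channel in zip(*x)]

# Hypothetical toy values: 3 tokens, 2 input channels, 2 output channels.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weight = [[1.0, 0.0], [0.5, -1.0]]  # shape (C_out, C_in)
bias = [0.0, 1.0]

y_linear = linear_seq_first(tokens, weight, bias)
y_conv = conv1x1_channels_first(to_channels_first(tokens), weight, bias)

# The same weights give the same outputs; only the memory layout differs.
same = to_channels_first(y_linear) == y_conv
print(same)
```

In PyTorch terms this corresponds roughly to swapping an `nn.Linear` for an `nn.Conv2d` with `kernel_size=1` on a (B, C, 1, S) tensor, reusing the linear weights reshaped to (C_out, C_in, 1, 1).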