Is gemma-tuner-multimodal open source?

Yes — mattmireles/gemma-tuner-multimodal is open source, released under the MIT license.

What language is gemma-tuner-multimodal written in?

mattmireles/gemma-tuner-multimodal is primarily written in Python.

How popular is gemma-tuner-multimodal?

mattmireles/gemma-tuner-multimodal has 1.5k stars on GitHub.

Where can I find gemma-tuner-multimodal?

mattmireles/gemma-tuner-multimodal is on GitHub at https://github.com/mattmireles/gemma-tuner-multimodal.

mattmireles/gemma-tuner-multimodal

Fine-tune Gemma on your MacBook, no H100 required

A multimodal LoRA trainer that runs on Apple Silicon and streams training data from the cloud so your SSD doesn't drown.

★ 1.5k stars · Python · Language Models · ML Frameworks

What it does

This is a PyTorch-based fine-tuning pipeline for Google’s Gemma 4 and Gemma 3n models, built specifically for Apple Silicon. It handles text, image, and audio modalities through LoRA adapters, using Metal Performance Shaders instead of CUDA. The dataloader can stream shards from GCS or BigQuery, which means you can train on terabyte-scale datasets without copying everything to a laptop SSD.
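
To make the streaming idea concrete, here is a minimal sketch of a shard-by-shard CSV reader. It is not the project's dataloader: the `stream_rows` and `open_fake_shard` names, the `audio_path`/`text` columns, and the in-memory "bucket" are all assumptions made for illustration. The point is only that rows are yielded lazily, so a full dataset never has to sit on local disk.

```python
import csv
import io
from typing import Callable, Iterable, Iterator


def stream_rows(
    shard_names: Iterable[str],
    open_shard: Callable[[str], io.TextIOBase],
) -> Iterator[dict]:
    """Yield CSV rows one at a time, opening each shard only when it is reached."""
    for name in shard_names:
        with open_shard(name) as handle:
            # DictReader reads lines lazily, so only the current row is held in memory.
            yield from csv.DictReader(handle)


# Hypothetical stand-in for remote storage: two tiny in-memory CSV shards.
FAKE_BUCKET = {
    "shard-000.csv": "audio_path,text\na.wav,hello\nb.wav,world\n",
    "shard-001.csv": "audio_path,text\nc.wav,again\n",
}


def open_fake_shard(name: str) -> io.StringIO:
    """Pretend to fetch one shard; a real opener would wrap a cloud storage client."""
    return io.StringIO(FAKE_BUCKET[name])


rows = list(stream_rows(sorted(FAKE_BUCKET), open_fake_shard))
```

Swapping `open_fake_shard` for a function that returns a text stream from cloud storage would leave `stream_rows` unchanged, which is the main design benefit of passing the opener in as a parameter.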

The interesting bit

The built-in training visualizer runs in your browser and shows loss curves, attention heatmaps, gradient signal strength, memory pressure, and token predictions, all updating live, with no TensorBoard or notebook required. The README claims setup takes 30 seconds. The wizard CLI walks through model selection, dataset pairing, and hyperparameters, then spawns training with a single command.

Key highlights

Supports text-only, image+text, and audio+text fine-tuning via CSV datasets (local for image/text; streaming available for all modalities)
Targets Gemma 3n E2B/E4B and Gemma 4 E2B/E4B checkpoints through Hugging Face + PEFT LoRA
MPS-native with fallback to CUDA or CPU; explicitly designed for Macs without NVIDIA GPUs
Hierarchical INI configuration with profiles, plus a system-check command to surface environment issues before training fails
Ships a 16-row sample dataset for sub-minute end-to-end pipeline verification
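
Two of the highlights above, the MPS-first device fallback and profile-based INI configuration, can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: `pick_device`, `load_profile`, the `profile:` section naming, and every key in the example config are hypothetical. Only the standard `configparser` behavior (sections inherit from `[DEFAULT]`) and PyTorch's `torch.backends.mps.is_available()` / `torch.cuda.is_available()` checks are real.

```python
import configparser


def pick_device() -> str:
    """Prefer Apple's MPS backend, then CUDA, then CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # keeps the sketch runnable when PyTorch is not installed
    mps = getattr(torch.backends, "mps", None)  # absent on very old PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


# Hypothetical config: [DEFAULT] holds base values, each profile overrides a few.
EXAMPLE_INI = """
[DEFAULT]
learning_rate = 2e-4
lora_rank = 16
device = auto

[profile:smoke-test]
max_steps = 10

[profile:full]
lora_rank = 32
max_steps = 2000
"""


def load_profile(ini_text: str, profile: str) -> dict:
    """Return one profile's settings merged over the [DEFAULT] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Section lookups fall back to [DEFAULT], which provides the layering.
    settings = dict(parser[f"profile:{profile}"])
    if settings["device"] == "auto":
        settings["device"] = pick_device()
    return settings


smoke = load_profile(EXAMPLE_INI, "smoke-test")
full = load_profile(EXAMPLE_INI, "full")
```

Note that `configparser` returns every value as a string, so a real loader would still need to convert numeric settings such as `lora_rank` before use.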

Caveats

Gemma 4’s larger weights (26B/31B class) use a different Transformers architecture and are not yet supported for training
Some non-training commands (gemma_generate, ASR eval, multimodal probing) still reject Gemma 4 IDs pending code updates
Image fine-tuning is local CSV only in v1; audio+text is the standout modality versus competitors

Verdict

Worth a look if you’re on Apple Silicon and need to fine-tune Gemma on proprietary audio or vision data without renting cloud GPUs. Skip it if you’re on Linux/NVIDIA (Unsloth or axolotl will be faster) or if you need the larger Gemma 4 variants that aren’t wired up yet.
