
kyegomez/Gemini

Open-source PyTorch implementation of Google's multi-modal foundation model Gemini supporting text, image, audio, and video inputs.

Gemini
Velocity (7d): +0.5 ★ / day
Trend: steady

The repository is an open-source implementation of Google’s Gemini model. Text, audio, images, and video all enter the model as input tokens and are processed directly by a transformer; conditional decoders then generate text or images from its output. Key features include Multi Grouped Query Attention, Flash Attention, RoPE, ALiBi, and KV cache support.
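Of the listed features, ALiBi is simple enough to illustrate on its own. The sketch below follows the standard formulation (each attention head adds a linear penalty proportional to the distance between query and key positions) and is not taken from the repository's code. The function names `alibi_slopes` and `alibi_bias` are hypothetical, and the sketch assumes a power-of-two head count and causal attention.

```python
from typing import List


def alibi_slopes(num_heads: int) -> List[float]:
    """Per-head slopes for ALiBi, assuming a power-of-two head count."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    # Geometric sequence: head h (0-based) gets slope 2^(-8 * (h + 1) / num_heads).
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(num_heads: int, seq_len: int) -> List[List[List[float]]]:
    """bias[h][i][j] is added to the attention logit of query i and key j.

    Keys in the future (j > i) are masked with -inf, so the bias also
    enforces causality in this sketch.
    """
    return [
        [
            [-m * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in alibi_slopes(num_heads)
    ]


if __name__ == "__main__":
    print(alibi_slopes(8))
    for row in alibi_bias(8, 4)[0]:
        print(row)
```

Because the bias depends only on relative distance, it can be computed once per sequence length and added to the attention logits before the softmax, with no learned position embeddings involved.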
