
lucidrains/flamingo-pytorch

PyTorch implementation of DeepMind's Flamingo visual-language model with PerceiverResampler and GatedCrossAttentionBlock components.

1.3k stars · Python · Language Models · ML Frameworks
Velocity · 7d: +0.8 ★ / day · Trend: steady

This repository provides a PyTorch implementation of Flamingo, DeepMind's visual language model for few-shot tasks such as visual question answering. It includes the perceiver resampler, which compresses media sequences into a smaller set of tokens; masked cross-attention blocks that let a language model attend to visual inputs; and tanh gating at the ends of the cross-attention and feedforward blocks. Together, these components let you build multimodal language models that process interleaved text and images for few-shot learning.
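Two of the ideas above, tanh gating and masked cross-attention, can be illustrated with a minimal plain-Python sketch. This is not the repository's code or API: the function names `gated_residual` and `media_attention_mask`, the list-based inputs, and the token layout are all hypothetical, and the masking rule (each text position attends only to the most recent preceding media item) is an assumption based on how the Flamingo paper describes it.

```python
import math
from itertools import accumulate


def gated_residual(x, branch_out, gate):
    """Return x + tanh(gate) * branch_out, element-wise over equal-length lists.

    Hypothetical helper: `gate` stands in for a learned scalar parameter.
    """
    scale = math.tanh(gate)
    return [xi + scale * bi for xi, bi in zip(x, branch_out)]


def media_attention_mask(is_media_marker, num_media):
    """Build mask[i][j]: True when text position i may attend to media item j.

    Assumption: a position attends only to the latest media item seen so far.
    Positions before the first media marker get an all-False row.
    """
    # Running count of media markers = 1-based index of the latest media item.
    latest = list(accumulate(int(m) for m in is_media_marker))
    return [[seen == j + 1 for j in range(num_media)] for seen in latest]


# With the gate at 0, tanh(0) = 0, so the new branch contributes nothing
# and the block passes its input through unchanged.
x = [1.0, -2.0, 0.5]
branch = [10.0, 10.0, 10.0]
at_init = gated_residual(x, branch, gate=0.0)

# Hypothetical token layout: <image> a b <image> c
markers = [True, False, False, True, False]
mask = media_attention_mask(markers, num_media=2)
```

Starting the gate at zero means a pretrained language model behaves exactly as before when the new cross-attention layers are first inserted, and the visual pathway is blended in gradually as the gate moves away from zero during training.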
