google-research/big_vision
A Google Research codebase for training large-scale vision models on Cloud TPU VMs or GPU machines.

Velocity (7d): +2.3 ★/day · Trend: steady
big_vision provides infrastructure for training vision models, including Vision Transformers, SigLIP, MLP-Mixer, and LiT. The codebase is built on JAX/Flax with TensorFlow data pipelines, and supports distributed training on up to 2048 TPU cores. It serves both as the place where Google research projects publish their code and as a starting point for large-scale vision experiments.
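
To give a feel for what distributed training in JAX looks like, here is a minimal data-parallel sketch. It is not taken from big_vision: the toy linear model, `loss_fn`, `train_step`, the learning rate, and the random data are all hypothetical, and the repository's real training loop may be organized quite differently. The sketch only shows the general pattern of running one step per device and averaging gradients across devices.

```python
import functools

import jax
import jax.numpy as jnp


def loss_fn(params, x, y):
    # Toy linear regression loss (a stand-in for a real vision model).
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)


@functools.partial(jax.pmap, axis_name="batch")
def train_step(params, x, y):
    # Each device computes loss and gradients on its own shard of the batch.
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    # Average across devices so every replica applies the same update.
    grads = jax.lax.pmean(grads, axis_name="batch")
    loss = jax.lax.pmean(loss, axis_name="batch")
    # Plain SGD with a hypothetical learning rate of 0.1.
    params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)
    return params, loss


n_dev = jax.local_device_count()  # 1 on a plain CPU machine

# Replicate the parameters: one copy per device along a leading axis.
params = {"w": jnp.zeros((4, 1)), "b": jnp.zeros((1,))}
replicated = jax.tree_util.tree_map(lambda p: jnp.stack([p] * n_dev), params)

# Synthetic data shaped (devices, per-device batch, features).
x = jax.random.normal(jax.random.PRNGKey(0), (n_dev, 8, 4))
y = jnp.sum(x, axis=-1, keepdims=True)

losses = []
for _ in range(20):
    replicated, loss = train_step(replicated, x, y)
    losses.append(float(loss[0]))
```

The same script runs unchanged on a single CPU device or on a multi-device host; only the leading device axis of the arrays grows with the number of local devices.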