zai-org/CogView
A 4-billion-parameter transformer model that generates images from text prompts, developed at Tsinghua University.

Velocity (7d): +1.0 ★/day · Trend: steady
CogView is a pretrained transformer for general-domain text-to-image generation. It takes Chinese text as input; prompts in other languages are supported through translation. The project introduced two techniques, PB-relax and Sandwich-LN, for stabilizing the training of large, deep transformers. The repository provides pretrained models, inference code, and a demo interface for generating images from textual descriptions.
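To make the Sandwich-LN idea more concrete, here is a minimal pure-Python sketch. It assumes the usual description of the technique: a second LayerNorm is applied at the end of each residual branch, so the block computes x + LN(f(LN(x))) instead of the Pre-LN form x + f(LN(x)). The function names, the toy sublayer, and the omission of learnable gain and bias are illustrative assumptions, not code from the CogView repository.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learnable gain/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def pre_ln_residual(x, sublayer):
    """Pre-LN block: x + f(LN(x)). The branch output is not normalized."""
    branch = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, branch)]


def sandwich_ln_residual(x, sublayer):
    """Sandwich-LN block (assumed form): x + LN(f(LN(x)))."""
    branch = layer_norm(sublayer(layer_norm(x)))
    return [a + b for a, b in zip(x, branch)]


def toy_sublayer(h):
    """Hypothetical stand-in for an attention or feed-forward layer with large outputs."""
    return [1000.0 * v for v in h]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    print("Pre-LN:     ", pre_ln_residual(x, toy_sublayer))
    print("Sandwich-LN:", sandwich_ln_residual(x, toy_sublayer))
```

In this toy setup the Pre-LN output grows with the sublayer's scale, while the Sandwich-LN branch is renormalized before it is added back to the residual stream, so its contribution stays bounded. That bounded contribution is the kind of stabilizing effect the technique is meant to provide in deep models.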