← all repositories

ByteVisionLab/TokenFlow

Unified image tokenizer with dual-codebook architecture for multimodal understanding and text-to-image generation.

TokenFlow
Velocity · 7d
+0.8
★ / day
Trend
steady
star history

TokenFlow bridges multimodal understanding and generation using an innovative dual-codebook architecture that decouples semantic and pixel-level feature learning while maintaining alignment through a shared mapping mechanism. It surpasses flagship models like LLaVA-1.5 and EMU3 on visual comprehension tasks and achieves performance comparable to SDXL on text-to-image generation at 256×256 resolution.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.