ShareGPT4Omni/ShareGPT4Video
A large multimodal model and dataset for improving video understanding and generation through better captioning, published at NeurIPS 2024.

Velocity (7d): +1.5 ★/day · Trend: steady
ShareGPT4Video provides a dataset and an 8B-parameter model that improve video understanding and text-to-video generation by using higher-quality captions generated with GPT-4V. The project includes the ShareGPT4Video dataset on HuggingFace and the trained ShareGPT4Video-8B model. It targets the caption-quality problems in existing video datasets, with the goal of improving both video comprehension and video generation.