njustkmg/OMML
A multi-modal learning toolkit supporting joint and cross-modal learning algorithms for processing image-text data.

Velocity · 7d
+0.3
★ / day
Trend
→steady
star history
OMML is a multi-modal learning library based on PyTorch with PaddlePaddle compatibility, offering algorithm model implementations for multi-modal fusion, cross-modal retrieval, and image captioning tasks. The toolkit supports user-defined data and training pipelines, and has been applied to industrial use cases such as sneaker authenticity verification and rumor detection.