
karpathy/neuraltalk

Python+numpy implementation of multimodal recurrent neural networks that generate textual descriptions of images.

5.5k stars · Python · Computer Vision · Language Models
Velocity (7d): +1.3 ★ / day · Trend: steady

This project implements neural network architectures for image captioning, combining CNNs for visual feature extraction with RNNs or LSTMs for sequential text generation. The models learn to predict sentence descriptions conditioned on input images and previous word context. It supports standard benchmark datasets including Flickr8K, Flickr30K, and MSCOCO.
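As a rough illustration of the idea described above, here is a minimal numpy sketch of caption generation with a plain RNN conditioned on an image feature vector. It is not taken from the repository: the function names (`init_params`, `greedy_caption`), the parameter names, the toy vocabulary, and the choice to feed the image in only as the initial hidden state are all assumptions made for this example. The weights are random, so the output is meaningless; a trained model would use learned weights and features from a real CNN.

```python
import numpy as np

def init_params(feat_dim, hidden, vocab_size, seed=0):
    """Random, untrained weights (hypothetical layout, for illustration only)."""
    rng = np.random.default_rng(seed)
    return {
        "Wi": rng.normal(0, 0.1, (hidden, feat_dim)),    # image feature -> hidden
        "Wx": rng.normal(0, 0.1, (hidden, vocab_size)),  # word embeddings (one column per word)
        "Wh": rng.normal(0, 0.1, (hidden, hidden)),      # hidden -> hidden recurrence
        "Wo": rng.normal(0, 0.1, (vocab_size, hidden)),  # hidden -> word scores
        "b": np.zeros(hidden),
        "bo": np.zeros(vocab_size),
    }

def greedy_caption(image_feat, params, vocab, max_len=10):
    """Pick the highest-scoring next word at each step until the END token."""
    p = params
    # Assumption: the image conditions the RNN once, via the initial hidden state.
    h = np.tanh(p["Wi"] @ image_feat + p["b"])
    word = 0  # index 0 doubles as the START and END token in this sketch
    caption = []
    for _ in range(max_len):
        x = p["Wx"][:, word]                   # embedding of the previous word
        h = np.tanh(x + p["Wh"] @ h + p["b"])  # new state from previous word and state
        scores = p["Wo"] @ h + p["bo"]         # unnormalized score for each vocabulary word
        word = int(np.argmax(scores))
        if word == 0:                          # END token: stop generating
            break
        caption.append(vocab[word])
    return caption

# Toy usage: an 8-dimensional random vector stands in for CNN features.
vocab = ["<tok>", "a", "dog", "runs", "on", "grass"]
params = init_params(feat_dim=8, hidden=16, vocab_size=len(vocab))
feat = np.random.default_rng(1).normal(size=8)
print(greedy_caption(feat, params, vocab))
```

Greedy decoding keeps the sketch short; beam search is a common alternative when caption quality matters more than simplicity.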
