
abhshkdz/neural-vqa

A Torch implementation of a neural visual question answering model combining CNN image features with LSTM language processing.

Star velocity (7d): +0.1 ★/day · Trend: steady

This repository implements the VIS+LSTM visual question answering model from the paper "Exploring Models and Data for Image Question Answering" by Ren, Kiros & Zemel. A VGG-19 CNN extracts image features and an LSTM processes the question; the two are combined to predict an answer about the image. The model is trained on MSCOCO images paired with question-answer pairs from the VQA dataset.
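
To make the data flow concrete, here is a minimal pure-Python sketch of a VIS+LSTM-style forward pass. It is not the repository's Torch code. All names (`init_params`, `vis_lstm_forward`, `W_img`, and so on) are hypothetical, the weights are random, the dimensions are toy-sized, and gate biases are omitted. It assumes the image feature is linearly projected into the word-embedding space and fed to the LSTM as the first "word", as the paper describes for VIS+LSTM, and that the final hidden state is classified over a fixed answer set.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def lstm_step(W, x, h, c):
    """One LSTM step. W maps [x; h] to stacked gates (i, f, o, g); no biases."""
    n = len(h)
    z = matvec(W, x + h)
    i = [sigmoid(v) for v in z[0:n]]              # input gate
    f = [sigmoid(v) for v in z[n:2 * n]]          # forget gate
    o = [sigmoid(v) for v in z[2 * n:3 * n]]      # output gate
    g = [math.tanh(v) for v in z[3 * n:4 * n]]    # candidate cell values
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def init_params(img_dim, vocab_size, embed_dim, hidden_dim, num_answers, seed=0):
    """Random toy parameters (hypothetical layout, for illustration only)."""
    rng = random.Random(seed)
    return {
        "W_img": rand_matrix(embed_dim, img_dim, rng),      # image -> embedding space
        "embed": rand_matrix(vocab_size, embed_dim, rng),   # one row per question word
        "W_lstm": rand_matrix(4 * hidden_dim, embed_dim + hidden_dim, rng),
        "W_out": rand_matrix(num_answers, hidden_dim, rng), # hidden -> answer scores
    }


def vis_lstm_forward(params, image_feat, question_ids):
    """Return a probability distribution over answers."""
    # Assumption: the projected image feature acts as the first "word".
    inputs = [matvec(params["W_img"], image_feat)]
    inputs += [params["embed"][t] for t in question_ids]
    hidden_dim = len(params["W_out"][0])
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in inputs:
        h, c = lstm_step(params["W_lstm"], x, h, c)
    return softmax(matvec(params["W_out"], h))


# Toy usage: an 8-d stand-in for the CNN feature and a 3-word question.
params = init_params(img_dim=8, vocab_size=10, embed_dim=4, hidden_dim=6, num_answers=3)
probs = vis_lstm_forward(params, [0.5] * 8, [1, 4, 2])
predicted_answer = probs.index(max(probs))
```

In a real setup the image vector would come from a pretrained VGG-19 rather than a constant list, and the weights would be learned by minimizing cross-entropy against the ground-truth answers.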
