kingyiusuen/image-to-latex
A PyTorch-based model that takes images of LaTeX math equations and generates corresponding LaTeX source code using a ResNet-18 encoder and Transformer decoder.

This repository implements an image-to-markup system that converts images of LaTeX formulas into editable LaTeX code. A ResNet-18 CNN encoder with 2D positional encoding extracts visual features from the input image, and a Transformer decoder turns those features into LaTeX markup. The model is trained on the arXiv LaTeX dataset of approximately 100K rendered math equations, has approximately 3 million parameters, and comes with a Streamlit web interface for end-to-end inference.
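
To illustrate what "2D positional encoding" means for the encoder's feature map, here is a minimal framework-free sketch. It assumes the common sinusoidal scheme in which half of the channels encode the row index and the other half the column index; the repository's actual implementation may differ. The function names `sinusoid_1d` and `positional_encoding_2d` are hypothetical and chosen only for this example.

```python
import math


def sinusoid_1d(length, dim):
    """Return a length x dim table of standard sinusoidal position encodings."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(length):
        row = []
        for i in range(0, dim, 2):
            # Each channel pair shares one frequency: sin on the even slot, cos on the odd.
            angle = pos / (10000 ** (i / dim))
            row.extend([math.sin(angle), math.cos(angle)])
        table.append(row)
    return table


def positional_encoding_2d(height, width, d_model):
    """Return a height x width x d_model nested list.

    Assumption for this sketch: the first d_model/2 channels depend only on
    the row index, the last d_model/2 only on the column index.
    """
    if d_model % 4 != 0:
        raise ValueError("d_model must be divisible by 4")
    half = d_model // 2
    rows = sinusoid_1d(height, half)
    cols = sinusoid_1d(width, half)
    # Concatenate the row code and the column code at every grid cell.
    return [[rows[y] + cols[x] for x in range(width)] for y in range(height)]


pe = positional_encoding_2d(height=3, width=5, d_model=8)
print(len(pe), len(pe[0]), len(pe[0][0]))
```

In a model like the one described, an encoding of this shape would typically be added elementwise to the CNN feature map before it is flattened into a sequence for the Transformer decoder, so that each feature vector carries both its vertical and horizontal position.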