Author

Avinash More

Publication Date

Spring 2018

Degree Type

Master's Project

Department

Computer Science

Abstract

Many research papers in mathematics, computer science, and physics are written in LaTeX, and technical papers and articles in these areas often involve mathematical equations. Typesetting such equations in LaTeX takes longer than writing the same equations by hand on paper. In this report, we show that the time-consuming process of producing LaTeX from images of equations can be automated. Neural networks perform well on related problems such as handwritten digit recognition, so we adapted these well-studied approaches to the LaTeX problem. Because training a neural network requires large amounts of good-quality data, we propose both a convolutional neural network architecture for recognizing LaTeX equations and a way to generate labeled datasets of mathematical equation images paired with their corresponding LaTeX expressions. Given an image of an equation involving numbers, letters, mathematical symbols, and matrices, our model predicts the corresponding LaTeX. We achieved an accuracy of more than 90% in predicting LaTeX for these complex equations of up to 35 characters.
