Publication Date
Fall 2019
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Chris Pollett
Second Advisor
Robert Chun
Third Advisor
Kong Li
Keywords
Image Localization, CNNs, RNNs, Neural Nets, Captioning, OCR
Abstract
Image localization refers to translating the text present in images from one language to another. The aim of this project is to develop a methodology for translating the text in image captions from English to Hindi while taking the context of the images into account. A lot of work has been done in this field [22], but our aim was to explore whether accuracy can be further improved by considering the additional information conveyed by the images, beyond the text alone. We explored deep learning with neural networks for this project. In particular, we used Recurrent Neural Networks (RNNs), which are well suited to sequence translation and therefore meet the needs of this project, which involves text sequences. This technique of image localization would be beneficial in many fields. For example, to make text data accessible to everyone, it should be translated into the many languages spoken by people across the world. This would support growth in rural areas and in countries where English is not spoken by giving people access to data in their local languages. It could also benefit tourists, who would then be able to understand the signboards and posters in a foreign country. With accurate translation, old manuscripts can also be translated to English, enabling further research on them.
Recommended Citation
Gupta, Riti, "Image-Based Localization of User-Interfaces" (2019). Master's Projects. 892.
DOI: https://doi.org/10.31979/etd.czzq-74qs
https://scholarworks.sjsu.edu/etd_projects/892