Publication Date

Spring 2023

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Chris Pollett

Second Advisor

William Andreopoulos

Third Advisor

Thomas Austin

Keywords

Hierarchical attention model (HAN), machine translation (MT), neural machine translation (NMT)


Abstract

Machine translation (MT) aims to translate texts with minimal human involvement, and machine learning methods are pivotal to its success. Sentence-level and paragraph-level translation has been well explored over the past decade, notably with the Transformer and its variants, but less research has been done at the document level. Document-level translation is useful in settings ranging from reading news in another language to understanding foreign research.

This project uses a hierarchical attention (HAN) mechanism to abstract context information, making document-level translation possible. It also applies the Big Bird attention mask with the aim of reducing memory usage. In the experiments, the HAN models produced readable translations with an average BLEU score of 0.75 (0.67 for full-attention HAN and 0.82 for Big Bird attention), whereas the Transformer model failed to comprehend the large input and scored 0.22 on the same dataset.
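
To illustrate the idea behind a Big Bird style attention mask, the following is a minimal sketch and not the project's implementation. It assumes a token-level mask (the published Big Bird design operates on blocks of tokens), and the function name `big_bird_mask` and its parameters `window`, `num_global`, and `num_random` are hypothetical names chosen for this example.

```python
import numpy as np


def big_bird_mask(seq_len, window=3, num_global=2, num_random=2, seed=0):
    """Build a boolean (seq_len, seq_len) mask; True means query i may attend to key j.

    Simplified, token-level sketch of the three Big Bird components:
    sliding-window, global, and random attention.
    """
    rng = np.random.default_rng(seed)
    idx = np.arange(seq_len)

    # Sliding window: each token attends to neighbours within window // 2 positions.
    half = window // 2
    mask = np.abs(idx[:, None] - idx[None, :]) <= half

    # Global tokens: the first num_global tokens attend to, and are attended by, all tokens.
    g = min(num_global, seq_len)
    mask[:g, :] = True
    mask[:, :g] = True

    # Random attention: each query additionally attends to a few randomly chosen keys.
    k = min(num_random, seq_len)
    for i in range(seq_len):
        mask[i, rng.choice(seq_len, size=k, replace=False)] = True

    return mask


mask = big_bird_mask(seq_len=64)
density = mask.mean()  # fraction of query-key pairs that are kept
print(f"kept {density:.1%} of the full attention matrix")
```

A dense boolean mask like this one saves no memory by itself. The potential saving comes from an implementation that computes and stores only the kept query-key pairs, whose count grows roughly linearly with sequence length rather than quadratically.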