Publication Date
Spring 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Nada Attar
Second Advisor
William Andreopoulos
Third Advisor
Dhanush Babu
Keywords
Large language models, gender bias
Abstract
Large language models (LLMs) play a significant role in modern human-computer interaction. They have recently exploded in popularity and are now widely used for a variety of tasks. However, concerns persist about potential biases within these models. This project investigates gender bias in four popular LLMs: GPT-3.5, GPT-4, Gemini, and Llama. The first part of our study analyzes bias using ambiguous sentences in three languages: English, Malayalam, and Tamil. By embedding specific professions in our test sentences, we evaluate whether the LLMs associate occupations with commonly held gender stereotypes. Including two low-resource languages extends prior research, which was conducted exclusively in English, and allows us to compare how the biases differ across the three languages. In the second part of our study, we use GPT-3.5, Gemini, and Llama to generate letters of evaluation for various professions and for both genders. We then analyze the generated letters for differences between the two genders, examining factors such as word count, vocabulary count, lexical diversity, readability, and lexical content. In addition, we generate personalities for good and bad employees and ask the LLMs to write letters of evaluation for these employees while assigning each a gender. Our findings for part one suggest that strong gender biases exist in all the LLMs across all three languages. In part two, we find differences in the lexical content of letters written for males and females. The findings also suggest the LLMs assign the male gender to bad employees more often than to good employees. This study helps us better understand the biases in large language models and better equips us to use AI in ways that mitigate bias.
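The letter-analysis metrics named in the abstract (word count, vocabulary count, lexical diversity, readability) can be sketched as simple text statistics. The function below is a minimal illustration only, not the project's actual pipeline: it uses the type-token ratio as a common lexical-diversity measure and average sentence length as a crude readability proxy, both of which are assumptions rather than the study's exact definitions.

```python
import re

def lexical_metrics(text):
    """Compute simple lexical statistics for a generated letter.

    Illustrative sketch only: the study's exact metric definitions
    (e.g. its readability formula) may differ.
    """
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    word_count = len(words)
    vocab_count = len(set(words))          # distinct word forms
    return {
        "word_count": word_count,
        "vocab_count": vocab_count,
        # Type-token ratio: one common lexical-diversity measure.
        "lexical_diversity": vocab_count / word_count if word_count else 0.0,
        # Crude readability proxy: average sentence length in words.
        "avg_sentence_length": word_count / len(sentences) if sentences else 0.0,
    }
```

Comparing these values for letters generated about male versus female subjects would surface the kinds of gender differences the study examines.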
Recommended Citation
Kumar, Athira, "EXPLORING GENDER BIAS IN LARGE LANGUAGE MODELS: CROSS-LINGUISTIC COMPARISONS AND EVALUATION LETTERS ANALYSIS" (2024). Master's Projects. 1411.
DOI: https://doi.org/10.31979/etd.whch-w6jc
https://scholarworks.sjsu.edu/etd_projects/1411