Publication Date
Spring 2024
Degree Type
Master's Project
Degree Name
Master of Science in Data Science (MSDS)
Department
Computer Science
First Advisor
Faranak Abri
Second Advisor
Nada Attar
Third Advisor
Fabio Di Troia
Keywords
Deception Detection, Natural Language Processing (NLP), Large Language Models (LLM), Audio Data Processing, Multimodal
Abstract
Researchers have recently shown increased interest in automatically detecting deceptive behavior, driven by the many potential applications of deception detection, especially in criminology. To contribute to this research, this project investigates textual and audio data extracted from spoken and written words. We evaluated and compared traditional linguistic models with advanced Large Language Models (LLMs) using Natural Language Processing (NLP) techniques. Additionally, various feature selection techniques were applied to assess the importance of linguistic features. We conducted extensive experiments to evaluate the effectiveness of both conventional and deep NLP models on textual data, and the deep models were also applied to audio data. Findings suggest that the best-performing models for each data type are the Bidirectional Long Short-Term Memory (BiLSTM) network for textual data and ResNet50 for audio data. These two models were then combined into a late fusion model that outperforms text and audio models from previous research on both accuracy and F1 score. The late fusion model achieved an accuracy of 90.9% and an F1 score of 91.07%.
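The abstract does not say how the text and audio models were fused, so the following is only a minimal sketch of one common late fusion scheme: a weighted average of each model's predicted probability of deception, followed by a threshold. The function names, the weight, the threshold, and the sample probabilities are all hypothetical and are not taken from the project; the numbers are illustrative and are not the project's results.

```python
from typing import List, Sequence


def late_fusion(
    text_probs: Sequence[float],
    audio_probs: Sequence[float],
    text_weight: float = 0.5,   # assumed weight; the project's value is unknown
    threshold: float = 0.5,     # assumed decision threshold
) -> List[int]:
    """Fuse per-sample deception probabilities from two independent models.

    Each model is trained and run separately; only their output
    probabilities are combined, which is what makes the fusion "late".
    Returns 1 (deceptive) or 0 (truthful) for each sample.
    """
    if len(text_probs) != len(audio_probs):
        raise ValueError("both models must score the same samples")
    audio_weight = 1.0 - text_weight
    fused = [
        text_weight * t + audio_weight * a
        for t, a in zip(text_probs, audio_probs)
    ]
    return [1 if p >= threshold else 0 for p in fused]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive (deceptive) class: 2TP / (2TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up probabilities standing in for the outputs of a text model
# (e.g. a BiLSTM) and an audio model (e.g. ResNet50) on four samples.
text_probs = [0.9, 0.2, 0.6, 0.4]
audio_probs = [0.7, 0.4, 0.3, 0.8]
y_true = [1, 0, 1, 1]

y_pred = late_fusion(text_probs, audio_probs)
print(y_pred, accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

Averaging probabilities is only one option; other late fusion designs include majority voting or training a small classifier on the two models' outputs, and the project may have used any of these.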
Recommended Citation
Nguyen, Tien, "Deception Detection Models from Speech" (2024). Master's Projects. 1415.
DOI: https://doi.org/10.31979/etd.6ah3-mev7
https://scholarworks.sjsu.edu/etd_projects/1415