Author

Tien Nguyen

Publication Date

Spring 2024

Degree Type

Master's Project

Degree Name

Master of Science in Data Science (MSDS)

Department

Computer Science

First Advisor

Faranak Abri

Second Advisor

Nada Attar

Third Advisor

Fabio Di Troia

Keywords

Deception Detection, Natural Language Processing (NLP), Large Language Models (LLMs), Audio Data Processing, Multimodal

Abstract

Researchers have recently shown increased interest in automatically detecting deception, largely because of its many potential applications, especially in criminology. To contribute to deception detection research, this project investigates textual and audio data extracted from spoken and written words. We evaluated and compared traditional linguistic models with advanced Large Language Models (LLMs) using Natural Language Processing (NLP) techniques, and applied various feature selection techniques to assess the importance of linguistic features. We conducted extensive experiments to evaluate the effectiveness of both conventional and deep NLP models on textual data; the deep models were also applied to audio data. The findings suggest that the best-performing model for textual data is the Bidirectional Long Short-Term Memory (BiLSTM) network and the best-performing model for audio data is ResNet50. These two models were then combined into a late fusion model that outperforms other text and audio models from previous research on both accuracy and F1 score, achieving an accuracy of 90.9% and an F1 score of 91.07%.

Available for download on Sunday, June 01, 2025
