Comprehensive Analysis of Large Language Models on CNN-based Deepfake Detection

Publication Date

3-9-2026

Document Type

Conference Proceeding

Publication Title

2026 International Conference on Computing, Networking and Communications (ICNC 2026)

DOI

10.1109/ICNC68183.2026.11417000

First Page

90

Last Page

94

Abstract

Deepfake techniques are widely used for various purposes, including video augmentation; however, attackers exploit them for malicious goals such as fraud, scams, and phishing attacks. Although the Convolutional Neural Network (CNN) is one of the most promising deep learning methods for detecting deepfakes, it relies solely on visual feature information and lacks dynamic feedback. This paper proposes an effective multimodal deepfake detection system using Large Language Models (LLMs) and CNNs to identify deepfake video frames and improve detection rates. The proposed system combines visual and language modalities and generates different types of inputs by adding more realistic features and semantic context through prompt engineering with LLMs. Our approach analyzes deepfake videos frame by frame, generates LLM responses using task-specific prompts, transforms these responses into embeddings, and integrates them with CNN-derived embeddings for evaluation through deep learning. Experimental results show that incorporating LLM features into CNN-based models improves overall accuracy by 6.2% to 15.4%. These findings highlight the potential of LLMs in deepfake detection and demonstrate the effectiveness of a multimodal approach for advancing digital forensics.
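
The per-frame fusion step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's method: the abstract does not say how the two embeddings are integrated, so the sketch uses plain concatenation; `toy_text_embedding` is a hypothetical stand-in for a real text encoder applied to an LLM response; and the logistic scorer with hand-supplied weights stands in for the trained deep learning classifier.

```python
import hashlib
import math
from typing import List, Sequence


def toy_text_embedding(text: str, dim: int = 8) -> List[float]:
    """Hypothetical stand-in for a text encoder (NOT the paper's encoder).

    Hashes each whitespace-separated token into one of `dim` buckets,
    then L2-normalises the resulting count vector.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def fuse(cnn_embedding: Sequence[float], llm_embedding: Sequence[float]) -> List[float]:
    """Assumed fusion rule: concatenate the visual and language embeddings."""
    return list(cnn_embedding) + list(llm_embedding)


def fake_probability(fused: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Logistic score over the fused vector (placeholder for a trained classifier)."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused embedding length")
    z = sum(x * w for x, w in zip(fused, weights)) + bias
    # Numerically stable sigmoid for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Hypothetical inputs: a 4-dimensional CNN embedding for one frame and
    # an LLM response to a task-specific prompt about the same frame.
    cnn_vec = [0.1, 0.2, 0.3, 0.4]
    llm_vec = toy_text_embedding("unnatural blinking and blurred jawline")
    fused_vec = fuse(cnn_vec, llm_vec)
    # Untrained (all-zero) weights, so the score carries no information yet.
    print(len(fused_vec), fake_probability(fused_vec, [0.0] * len(fused_vec)))
```

In a real system the CNN embedding would come from a trained vision backbone, the text embedding from a sentence or LLM encoder, and the weights from training on labelled frames; concatenation is only one of several plausible ways to combine the two modalities.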

Keywords

Computer vision, Convolutional neural networks, Deep learning, Deepfake detection, Large language models

Department

Computer Engineering
