Faculty Research, Scholarly, and Creative Activity

Unraveling the Enigma of Classification of Synthetic and Genuine Information Using Machine Learning and Explainable AI

Vishnu S. Pendyala, San Jose State University

Publication Date

6-5-2025

Document Type

Conference Proceeding

Publication Title

Communications in Computer and Information Science

Volume

2434 CCIS

DOI

10.1007/978-3-031-84602-1_16

First Page

229

Last Page

242

Abstract

There is a preponderance of AI-generated text everywhere today. Literature shows that there has been considerable success in detecting such text. This paper uses explainable AI (XAI) techniques to get insights into the workings of the machine learning models used to classify synthetic text. In detecting such text, this work analyzes synthetic and genuine information from visualization and explainability perspectives. The text is converted into vector embeddings using Robustly Optimized BERT Pretraining Approach (RoBERTa). Variational Autoencoders (VAEs) are used for visualization and Support Vector Machine (SVM) is used for classification. Local Interpretable Model-Agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), and Integrated Gradients are used to explain the classification. The experiments are done on two different types of datasets. Despite the machine learning model achieving outstanding accuracy similar to the previous work in the literature, it was determined that there is no clear explanation of why the representation learning or the classification works so outstandingly well. The explainability techniques used show that the model focuses on words that do not clearly indicate that the text is synthetically generated. Visualization in two dimensions shows that the vector embeddings of both classes of text overlap significantly and that there is no clear separation of the representations learned.
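As a hedged illustration of the kind of word-level attribution the abstract refers to, the sketch below computes exact Shapley values, the quantity that SHAP approximates, for a tiny toy scorer. It is not the paper's pipeline: the `score` function, its `WEIGHTS`, and the example tokens are hypothetical stand-ins for a trained classifier, and no RoBERTa embeddings, SVM, or SHAP library calls are involved.

```python
from itertools import combinations
from math import factorial

# Hypothetical per-word weights for a toy "synthetic-text" scorer.
# These values are invented for illustration only.
WEIGHTS = {"delve": 0.5, "tapestry": 0.3, "the": 0.0}


def score(words):
    """Toy model: sum of per-word weights plus one interaction bonus."""
    s = sum(WEIGHTS.get(w, 0.0) for w in words)
    if "delve" in words and "tapestry" in words:
        s += 0.2  # interaction between the two words
    return s


def shapley_values(tokens, value_fn):
    """Exact Shapley value of each token under value_fn.

    Assumes tokens are unique. Cost grows exponentially with the number
    of tokens, which is why practical tools rely on approximations.
    """
    n = len(tokens)
    phi = {}
    for i, tok in enumerate(tokens):
        others = tokens[:i] + tokens[i + 1:]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes the token.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                base = frozenset(subset)
                # Marginal contribution of adding the token to the coalition.
                total += weight * (value_fn(base | {tok}) - value_fn(base))
        phi[tok] = total
    return phi


tokens = ["delve", "tapestry", "the"]
phi = shapley_values(tokens, score)
for tok in tokens:
    print(tok, round(phi[tok], 3))
```

The attributions sum to the difference between the score of the full token set and the score of the empty set, and the interaction bonus is split evenly between the two interacting words. A word with zero weight and no interactions receives zero attribution.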

Keywords

Artificial Intelligence, Explainable AI, Intelligent systems, Knowledge representation, Natural Language Processing

Comments

This is a post-peer-review, pre-copy edit of a chapter published by Springer, Cham in Garg, D., Pendyala, V., Gupta, S.K., Najafzadeh, M. (eds) Advanced Computing. IACC 2024. Communications in Computer and Information Science, vol 2434. https://doi.org/10.1007/978-3-031-84602-1_16

Department

Applied Data Science

Recommended Citation

Vishnu S. Pendyala. "Unraveling the Enigma of Classification of Synthetic and Genuine Information Using Machine Learning and Explainable AI" Communications in Computer and Information Science (2025): 229-242. https://doi.org/10.1007/978-3-031-84602-1_16

Download

Available for download on Friday, June 05, 2026
