The Accuracy of Domain Specific and Descriptive Analysis Generated by Large Language Models

Publication Date

1-1-2024

Document Type

Conference Proceeding

Publication Title

Proceedings - 2024 IEEE 48th Annual Computers, Software, and Applications Conference, COMPSAC 2024

DOI

10.1109/COMPSAC61105.2024.00274

First Page

1739

Last Page

1746

Abstract

Large language models (LLMs) have attracted considerable attention for their ability to generate high-quality responses to human input: they can compose not only textual scripts such as emails and essays but also executable programming code. In contrast, the automated reasoning capability of these LLMs in performing statistically driven descriptive analysis, particularly on user-specific data, has not yet been fully explored, especially when they act as personal assistants to users with limited background knowledge in an application domain who wish to carry out basic as well as advanced statistical and domain-specific analysis. More importantly, the performance of these LLMs on domain-specific data analysis tasks has not been compared and discussed in detail. Additionally, LLMs in isolation are often insufficient for creating powerful applications; their real potential emerges when they are combined with other sources of computation, such as LangChain. This study therefore explores whether LLMs can be used as generative AI-based personal assistants that help users with minimal background knowledge in an application domain infer key data insights. To demonstrate the performance of the LLMs, the study reports a case study in which descriptive statistical analysis and Natural Language Processing (NLP)-based investigations are performed on a set of phishing emails, with the objective of comparing the accuracy of the results generated by LLMs to those produced by human analysts. The experimental results show that LangChain and the Generative Pre-trained Transformer (GPT-4) excel at numerical reasoning tasks, i.e., temporal statistical analysis, and achieve competitive correlation with human judgments on feature engineering tasks, while struggling to some extent on tasks where domain-specific knowledge reasoning is required.
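The temporal statistical analysis mentioned in the abstract can be illustrated with a minimal sketch of the kind of descriptive computation involved. The sample timestamps and the helper function below are hypothetical illustrations, not drawn from the study's dataset or its actual pipeline:

```python
from collections import Counter
from datetime import datetime

def hourly_distribution(timestamps):
    """Count how many emails arrived during each hour of the day.

    `timestamps` is a list of ISO 8601 date-time strings; the result maps
    an hour (0-23) to the number of emails received in that hour.
    """
    hours = [datetime.fromisoformat(ts).hour for ts in timestamps]
    return Counter(hours)

# Hypothetical phishing-email arrival times for illustration only.
sample = [
    "2023-05-01T09:15:00",
    "2023-05-01T09:47:00",
    "2023-05-02T14:03:00",
]

dist = hourly_distribution(sample)
# Two emails arrived during the 09:00 hour and one during the 14:00 hour.
```

A statistic of this kind, produced by an analyst with a few lines of code, is what the study compares against the answers an LLM-based assistant (e.g., GPT-4 via LangChain) returns when asked the same question in natural language.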

Funding Number

2319802

Funding Sponsor

National Science Foundation

Keywords

Generative Pre-trained Transformer, LangChain, Large Language Models, Natural Language Processing, Phishing Emails

Department

Computer Science
