Publication Date

Fall 2019

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Leonard Wesley

Second Advisor

Philip Heller

Third Advisor

Skyler Payne

Keywords

Next Generation Sequencing, Mutations, Homopolymer Regions, Pancreatic Cancer, Evidential Reasoning, Support Vector Classifier

Abstract

This study examines how an evidential reasoning approach can serve as a diagnostic tool for the early detection of pancreatic cancer. The evidential reasoning model combines the output of a linear Support Vector Classifier (SVC) with factors such as smoking history, health history, biopsy location, and the NGS technology used to predict the likelihood of the disease. The SVC was trained on genomic data from pancreatic cancer patients obtained from the National Cancer Institute (NCI) Genomic Data Commons (GDC). To test the evidential reasoning model, we compiled a variety of synthetic data covering combinations of different factors and monitored how the evidential interval for pancreatic cancer changed with the inputs provided. When the input changed from a non-smoker and non-drinker to an individual with a highly active smoking and drinking history, the evidential interval for pancreatic cancer increased and the machine learning prediction of pancreatic cancer was supported. Similarly, with the machine learning prediction for pancreatic cancer held high, the evidential interval increased significantly when the sequencing-read quality input changed from high guanine-cytosine (GC) content and many homopolymer regions to moderate GC content and few homopolymer regions. This indicates that the initial reads carried a higher likelihood of sequencing error, which made the machine learning output less reliable. The experiment shows that an evidence-based approach has the potential to contribute as a diagnostic tool for screening high-risk groups. Future work should focus on improving the machine learning model by using a larger pancreatic cancer genomic database. Next steps will involve programmatically analyzing real sequencing reads for irregular GC content and long homopolymer regions.
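The abstract does not state the model's mass assignments or combination rule, so the following is only a minimal sketch of how an evidential interval could respond to read quality. It assumes a Dempster-Shafer formulation with a two-element frame (cancer, healthy). The mass values, the `read_reliability` thresholds, and all function names are hypothetical illustrations, not the author's implementation.

```python
from itertools import product

CANCER, HEALTHY = "cancer", "healthy"
THETA = frozenset({CANCER, HEALTHY})  # frame of discernment (assumed)


def gc_fraction(read: str) -> float:
    """Fraction of bases in the read that are G or C."""
    read = read.upper()
    return sum(base in "GC" for base in read) / len(read) if read else 0.0


def longest_homopolymer(read: str) -> int:
    """Length of the longest run of a single repeated base."""
    longest = run = 0
    previous = None
    for base in read.upper():
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def read_reliability(read: str, gc_limit: float = 0.65, run_limit: int = 6) -> float:
    """Hypothetical rule: trust a read less when GC content or runs are high."""
    reliability = 1.0
    if gc_fraction(read) > gc_limit:
        reliability -= 0.3
    if longest_homopolymer(read) >= run_limit:
        reliability -= 0.3
    return reliability


def discount(mass: dict, reliability: float) -> dict:
    """Shafer discounting: shift (1 - reliability) of the mass onto THETA."""
    out = {focal: reliability * m for focal, m in mass.items() if focal != THETA}
    out[THETA] = 1.0 - reliability + reliability * mass.get(THETA, 0.0)
    return out


def combine(m1: dict, m2: dict) -> dict:
    """Dempster's rule of combination for two mass functions."""
    joint, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            joint[common] = joint.get(common, 0.0) + x * y
        else:
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {focal: m / (1.0 - conflict) for focal, m in joint.items()}


def evidential_interval(mass: dict, hypothesis: frozenset) -> tuple:
    """Return (belief, plausibility) for the hypothesis."""
    belief = sum(m for focal, m in mass.items() if focal <= hypothesis)
    plausibility = sum(m for focal, m in mass.items() if focal & hypothesis)
    return belief, plausibility


# Illustrative masses only: a confident SVC output and a weaker lifestyle cue.
svc = {frozenset({CANCER}): 0.8, THETA: 0.2}
lifestyle = {frozenset({CANCER}): 0.4, THETA: 0.6}

for label, read in [("clean", "ACGTACGTTAGCATCG"), ("noisy", "GGGGGGGCCGCGCGGC")]:
    fused = combine(discount(svc, read_reliability(read)), lifestyle)
    print(label, evidential_interval(fused, frozenset({CANCER})))
```

Under these assumed values, the belief in pancreatic cancer is higher for the clean read than for the GC-rich, homopolymer-heavy read, because discounting moves part of the classifier's mass to the full frame before combination.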

Recommended Citation

Sharagi, Omid, "Toward Early Detection Of Pancreatic Cancer: An Evidence-Based Approach" (2019). Master's Projects. 896.
DOI: https://doi.org/10.31979/etd.2tzh-x2j9
https://scholarworks.sjsu.edu/etd_projects/896

Included in

Artificial Intelligence and Robotics Commons, Other Computer Sciences Commons
