Author

Publication Date

Spring 2026

Degree Type

Master's Project

Degree Name

Master of Science in Bioinformatics (MSBI)

Department

Computer Science

First Advisor

Dr. Wendy Lee

Second Advisor

Dr. Fabia Di Troia

Third Advisor

Dr. William Andreopoulos

Keywords

cfDNA, Fragmentomics, exon 1, Shannon entropy

Abstract

Cell-free DNA (cfDNA) fragmentomics has emerged as a promising non-invasive approach for detecting and characterizing cancer by analyzing DNA fragmentation patterns. These fragmentation patterns differ between tumor-derived and healthy DNA and can be used to distinguish the two. This paper uses machine learning models to detect cancer by evaluating the Shannon entropy of fragment lengths at first coding exon (exon 1) regions and correlating it with targeted cancer genes. The approach was applied to two datasets comprising prostate, breast, and lung cancer samples. Our results indicate that exon 1 fragmentation entropy captures biologically relevant differences in cancer-specific chromatin organization and can provide molecular insights into cancer-specific genes. The model achieved an area under the curve (AUC) greater than 0.79 across all cancer types. This indicates good classification performance; however, performance could be further improved by combining additional fragmentation signals, such as cfDNA methylation, end motifs, and copy-number profiles, and by enlarging the training and validation datasets. Further evaluation of fragmentation features is required to understand their potential for early cancer detection using liquid biopsy approaches.
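
As an illustration of the central feature named in the abstract, the Shannon entropy of a set of fragment lengths can be sketched as follows. This is a minimal sketch, not the project's actual pipeline: the function name `fragment_length_entropy`, the use of exact (unbinned) lengths, and the base-2 logarithm are all assumptions, since the abstract does not specify how lengths are binned or which region definitions are used.

```python
import math
from collections import Counter


def fragment_length_entropy(lengths):
    """Shannon entropy, in bits, of the empirical fragment-length distribution.

    `lengths` is an iterable of fragment lengths (in base pairs) observed in
    one region, e.g. one exon 1 interval. Hypothetical helper for illustration.
    """
    counts = Counter(lengths)
    total = sum(counts.values())
    if total == 0:
        # No fragments overlap the region: define the entropy as zero.
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        # p * log2(1/p) is the contribution of one distinct length.
        entropy += p * math.log2(1 / p)
    return entropy


# A region where every fragment has the same length has zero entropy;
# four equally frequent lengths give log2(4) = 2 bits.
uniform_region = fragment_length_entropy([167, 167, 167, 167])
diverse_region = fragment_length_entropy([150, 160, 170, 180])
```

Under these assumptions, a higher value means fragment lengths in the region are more evenly spread across distinct sizes, and a value near zero means one length dominates.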
