Publication Date

Fall 2021

Degree Type

Master's Project

Degree Name

Master of Science in Bioinformatics (MSBI)


Computer Science

First Advisor

Wendy Lee


Cancer is a complex disease that arises from interactions between cell-intrinsic alterations and the tumor microenvironment. The connection between epigenetics and genomic structure plays a key role in chromatin interaction, which promotes enhancer-promoter interactions for transcriptional activity. Alterations of chromatin states in oncogenic signaling pathways can cause cancer cell-intrinsic changes and send inappropriate instructions to the normal cell cycle, leading to abnormal cell growth. The resulting phenotypic changes are correlated with underlying changes in higher-order chromatin structure, such as topologically associating domains (TADs) and compartments. In cancer cells, TAD structure is usually altered to facilitate communication between enhancers and promoters, along with a higher density of histone modifications, thus increasing transcriptional super-enhancer activity within certain boundary strengths. Strong insulation scores and boundaries indicate high boundary strength (boundary IV), which allows more intra-TAD interactions. A high level of the activating histone mark H3K27ac near promoters increases transcriptional activity and gene expression. Therefore, the spatial chromosomal structures formed by TADs, together with epigenetic markers, are key regulators of chromatin interactions in oncogenic activities from carcinogenesis to metastasis. The results indicate that an XGBoost multi-class classifier achieved the highest accuracy, 81.13%, in classifying normal and cancer cell lines based on chromatin interactions, followed by Random Forest at 73.76% and a TabNet classifier at 73.50%. The detection model could be further improved with high-quality data sources and more meaningful features for clinical applications in early-stage cancer detection and prognosis.