Publication Date
Spring 2021
Degree Type
Thesis - Campus Access Only
Degree Name
Master of Science (MS)
Department
Computer Engineering
Advisor
Carlos Rojas
Keywords
Autoencoders, Comparing Hi-C Data, Deep Learning, Denoising, Hi-C Data, Synthetic Hi-C Data
Subject Areas
Computer engineering; Computer science; Bioinformatics
Abstract
The rapidly growing volume of three-dimensional genome-wide data produced by chromosome conformation capture presents many challenges for computational biology in understanding the genome. We use methods such as the high-throughput chromosome conformation capture (Hi-C) technique to understand the role that three-dimensional organizational structures play in gene expression. In recent years, the field has learned about spatial structures such as A/B compartments, topologically associating domains (TADs), and chromatin loops. By studying cell lines exposed to various biological conditions, we can understand the role of 3D structure. However, the sequencing process behind Hi-C data introduces noise that prevents the effective comparison of cell lines. Methods such as distance-centric and linear models help identify differences between pairs of Hi-C data sets, but they do not account for the noise introduced during sequencing, so their results are biased by it. We propose a novel method that uses convolutional autoencoders to reduce the noise in Hi-C data and so helps detect areas of interest between pairs of Hi-C data sets. The proposed deep learning framework can compare diseased and normal genomes of two different cell types while reducing noise that could alter the comparison. Our preliminary experiments, which analyze various similarity measures, provide evidence for the advantage of using a convolutional autoencoder for Hi-C comparisons.
Recommended Citation
Krishnamurthy, Sughosh, "A Deep Learning Method For Comparing Hi-C Data" (2021). Master's Theses. 5183.
DOI: https://doi.org/10.31979/etd.7qqa-x3up
https://scholarworks.sjsu.edu/etd_theses/5183