Publication Date
Spring 2016
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Sami Khuri
Second Advisor
Chris Pollett
Third Advisor
Vidya Rangasayee
Keywords
Multiple Sequence Alignment, Profile Hidden Markov Model
Abstract
The human genome contains various patterns and sequences that are of biological significance. Capturing these patterns can help resolve open questions about the genome, such as how genomes evolve, how genetic mutations cause disease, how viruses mutate to cause new diseases, and how these diseases can be cured. All of these applications fall within the study of bioinformatics.
One of the most common tasks in bioinformatics is the simultaneous alignment of a number of biological sequences, widely known as Multiple Sequence Alignment. Multiple sequence alignments help in grouping together organisms with the same evolutionary history. They also help in learning the properties of a new sequence by aligning it with previously studied homologous sequences.
This project covers a probabilistic modeling method for performing multiple sequence alignment (MSA). The use of hidden Markov models in MSA significantly improves computational speed, especially for sequences that contain overlapping regions. We used the Baum-Welch expectation-maximization algorithm to train the hidden Markov models and the Viterbi algorithm to align the sequences. Our results are comparable to those obtained by publicly available packages such as ClustalW and Clustal Omega.
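To make the alignment step concrete, the following is a minimal sketch of Viterbi decoding on a generic hidden Markov model, written in Python. It is not the project's implementation. The two-state toy model (hypothetical states "M" for match and "I" for insert), its probabilities, and the function name `viterbi` are assumptions chosen only for illustration. A full profile HMM would also include delete states and position-specific parameters.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_path) for a non-empty observation sequence."""
    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # score[s] = best log-probability of any state path ending in s so far
    score = {s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}
    back = []  # one dict per later position: state -> best predecessor

    for symbol in obs[1:]:
        new_score, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: score[r] + log(trans_p[r][s]))
            new_score[s] = (score[prev] + log(trans_p[prev][s])
                            + log(emit_p[s][symbol]))
            pointers[s] = prev
        score = new_score
        back.append(pointers)

    # Trace back from the best final state to recover the path.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


# Hypothetical toy model: all numbers below are illustrative assumptions.
states = ["M", "I"]
start_p = {"M": 0.9, "I": 0.1}
trans_p = {"M": {"M": 0.8, "I": 0.2}, "I": {"M": 0.6, "I": 0.4}}
emit_p = {
    "M": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "I": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
}

log_prob, path = viterbi("AAAA", states, start_p, trans_p, emit_p)
print(path, log_prob)
```

In this sketch the parameters are fixed by hand. In the approach described above they would instead be estimated from the unaligned sequences with Baum-Welch before decoding.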
Recommended Citation
Rakhonde, Shubhangi, "Multiple Sequence Alignment with Profile Hidden Markov Models" (2016). Master's Projects. 495.
DOI: https://doi.org/10.31979/etd.vub9-j9hc
https://scholarworks.sjsu.edu/etd_projects/495