Publication Date


Degree Type

Master's Project

Degree Name

Master of Science (MS)


Computer Science


This thesis addresses the problem of predicting alternative splice sites in human genes. The most common way to identify alternative splice sites are the use of expressed sequence tags and microarray data. Since genes only produce alternative proteins under certain conditions, these methods are limited to detecting only alternative splice sites in genes whose alternative protein forms are expressed under the tested conditions. I have introduced three multiclass support vector machines that predict upstream and downstream alternative 3’ splice sites, upstream and downstream alternative 5’ splice sites, and the 3’ splice site of skipped and cryptic exons. On a test set extracted from the Alternative Splice Annotation Project database, I was able to correctly classify about 68% of the splice sites in the alternative 3’ set, about 62% of the splice sites in the alternative 5’ set, and about 66% in the exon skipping set.