Publication Date

Spring 2013

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Chemistry

Advisor

Brooke Lustig

Keywords

Query-Based Qualitative Predictors, Sequence Homology, Solvent Accessibility Prediction

Subject Areas

Chemistry

Abstract

Characterization of relative solvent accessibility (RSA) plays a major role in classifying a given protein residue as being on the surface or buried. This information is useful for studying protein structure and protein-protein interactions, and it is usually the first step in predicting three-dimensional (3D) protein structures.

Various complicated and time-consuming methods, such as machine learning, have been applied to solvent-accessibility prediction. In this thesis, we present a simple application of linear regression that uses sequence homology values for each residue together with qualitative predictors for the query residue, one for each of the 20 amino acids. The qualitative predictors describe the actual query residue type (e.g., Gly), as opposed to the measures of sequence homology for the aligned subject residues. First, a fit was generated by applying linear regression to training sets with a variety of sequence homology parameters, including various sequence entropies, and the residue qualitative predictors. The coefficients obtained from the training sets were then applied to the test set to produce predicted RSA values. Prediction accuracies were calculated by comparing the predicted RSA values with NACCESS RSA values (derived from X-ray crystal structures). The use of qualitative predictors yielded significant prediction accuracy.
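
The train-then-apply workflow described above can be illustrated with a minimal sketch. It is not the thesis code: it assumes a single precomputed sequence entropy per residue as the only homology term, encodes the query residue type as 20 indicator (one-hot) columns, and fits by ordinary least squares with NumPy. The function names and the 20% burial threshold used for the two-state accuracy are hypothetical choices made for illustration.

```python
import numpy as np

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def design_matrix(residues, entropies):
    """Build one row per residue: [entropy, 20 residue-type indicators].

    Assumption: a single entropy value stands in for the several
    sequence homology parameters mentioned in the abstract.
    """
    X = np.zeros((len(residues), 1 + len(AMINO_ACIDS)))
    X[:, 0] = entropies
    for row, aa in enumerate(residues):
        # Qualitative predictor: 1.0 in the column of the query residue type.
        X[row, 1 + AMINO_ACIDS.index(aa)] = 1.0
    return X


def fit_rsa_model(residues, entropies, rsa):
    """Least-squares fit of RSA on the training set; returns 21 coefficients.

    No separate intercept is used because the 20 indicator columns
    already act as per-residue-type intercepts.
    """
    X = design_matrix(residues, entropies)
    coef, *_ = np.linalg.lstsq(X, np.asarray(rsa, dtype=float), rcond=None)
    return coef


def predict_rsa(coef, residues, entropies):
    """Apply training-set coefficients to test-set residues."""
    return design_matrix(residues, entropies) @ coef


def two_state_accuracy(predicted, reference, threshold=20.0):
    """Fraction of residues given the same buried/exposed class.

    Assumption: a residue counts as exposed when RSA >= threshold (in %).
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean((predicted >= threshold) == (reference >= threshold)))
```

In use, `fit_rsa_model` would be called once on the training residues, `predict_rsa` on the test residues, and `two_state_accuracy` on the predicted values against the reference RSA values. Treating the residue type as indicator columns gives each amino acid its own baseline RSA while the homology term shares a single slope.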
