Publication Date

Spring 2010

Degree Type


Degree Name

Master of Science (MS)






Keywords

Homology-based, non-strongly hydrophobic residues, packing density, protein surface accessibility prediction, small residues, strongly hydrophobic core

Subject Areas

Biology, Bioinformatics; Chemistry, Biochemistry


Residues on the surface of proteins are involved in a number of functions, especially ligand-protein interactions that are important for drug design. Residues in the core of the protein provide stability and help maintain protein structure. Hence, there is a need for a binary characterization of protein residues based on their surface accessibility (surface accessible or buried). Such a classification can aid in the directed study of either residue type.

A number of methods for predicting surface accessible protein residues have been proposed in the past. However, most of these methods are computationally complex and time consuming. In this thesis, we propose a simple method based on protein sequence homology parameters for the binary classification of protein residues as surface accessible or “buried”. To aid in the classification of protein residues, we chose three highly conservative homology-based parameter filter thresholds. The filter thresholds predicted and evaluated are: residue sequence entropy ≥ 0.15, fraction of strongly hydrophobic residues < 0.5, and fraction of small residues < 0.15. Applying these filter thresholds to the residues is expected to predict the “buried” residues with better accuracy than the surface accessible residues.
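The three thresholds can be read as a simple per-residue filter. The sketch below is illustrative only and is not taken from the thesis: the names `ResidueProfile`, `passes_all_filters`, and `classify` are hypothetical. It also makes two assumptions the abstract does not state: that the three conditions are combined with a logical AND, and that a residue satisfying all three is labeled surface accessible while every other residue is labeled buried.

```python
from dataclasses import dataclass

# Threshold values as quoted in the abstract.
MIN_SEQUENCE_ENTROPY = 0.15
MAX_FRACTION_STRONGLY_HYDROPHOBIC = 0.5
MAX_FRACTION_SMALL = 0.15


@dataclass(frozen=True)
class ResidueProfile:
    """Homology-based parameters for one residue position (hypothetical container)."""
    sequence_entropy: float
    fraction_strongly_hydrophobic: float
    fraction_small: float


def passes_all_filters(profile: ResidueProfile) -> bool:
    """Return True when the residue satisfies all three thresholds.

    ASSUMPTION: the thresholds are combined with a logical AND.
    """
    return (
        profile.sequence_entropy >= MIN_SEQUENCE_ENTROPY
        and profile.fraction_strongly_hydrophobic < MAX_FRACTION_STRONGLY_HYDROPHOBIC
        and profile.fraction_small < MAX_FRACTION_SMALL
    )


def classify(profile: ResidueProfile) -> str:
    """Map the filter outcome to a binary label.

    ASSUMPTION: passing all filters means "surface accessible";
    anything else is treated as "buried".
    """
    return "surface accessible" if passes_all_filters(profile) else "buried"


if __name__ == "__main__":
    variable_position = ResidueProfile(0.40, 0.10, 0.05)
    conserved_hydrophobic_position = ResidueProfile(0.05, 0.90, 0.05)
    print(classify(variable_position))
    print(classify(conserved_hydrophobic_position))
```

If the thesis combines the filters differently, or maps them to the two classes in the opposite direction, only `passes_all_filters` and `classify` would need to change; the threshold constants stay as quoted.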

These filter thresholds were selected from the frequency distributions and the aggregate correlation plots of the various homology-based parameters. An analysis of the plots suggests the presence of a strongly hydrophobic core between packing densities 14–22, where the presence of strongly hydrophobic residues is at its maximum and that of small and non-strongly hydrophobic residues is at its minimum. However, the densest portion of the protein (density 26–35) appears to be occupied by a combination of small and non-strongly hydrophobic residues, with a negligible presence of strongly hydrophobic residues.