Publication Date
2009
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
Abstract
In this paper, a novel technique for parallelizing data-classification problems is applied to finding genes in sequences of DNA. The technique combines ensemble classification methods, such as Bagging and Select Best, with MapReduce, which distributes the classifier training and prediction. A novel sequence classification voting algorithm is evaluated within the Bagging method and compared against the Select Best method.
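As a rough illustration of the general idea only, the sketch below trains several toy classifiers on bootstrap samples (the Bagging step, arranged as a map phase) and combines their per-base predictions by simple majority vote (arranged as a reduce phase). Everything here is an assumption made for illustration: the function names, the labels "coding" and "noncoding", the lookup-table base learner, and the plain majority vote are hypothetical stand-ins, not the project's classifiers or its novel voting algorithm, and the code runs in a single process rather than on a MapReduce cluster.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement (the resampling step of Bagging)."""
    return [rng.choice(data) for _ in data]


def train_lookup_classifier(sample):
    """Toy base learner (hypothetical): map each base to its most common label."""
    per_base = {}
    for base, label in sample:
        per_base.setdefault(base, Counter())[label] += 1
    table = {base: counts.most_common(1)[0][0] for base, counts in per_base.items()}
    # Fall back to the overall most common label for bases unseen in this sample.
    default = Counter(label for _, label in sample).most_common(1)[0][0]
    return lambda base: table.get(base, default)


def map_train(data, n_models, seed=0):
    """'Map' phase: each task trains one classifier on its own bootstrap sample."""
    rng = random.Random(seed)
    samples = [bootstrap_sample(data, rng) for _ in range(n_models)]
    return list(map(train_lookup_classifier, samples))


def reduce_vote(models, sequence):
    """'Reduce' phase: majority vote over all models for every position."""
    labels = []
    for base in sequence:
        votes = Counter(model(base) for model in models)
        labels.append(votes.most_common(1)[0][0])
    return labels


# Made-up training data: (base, label) pairs, not taken from the project.
training = (
    [("A", "coding")] * 10
    + [("T", "coding")] * 10
    + [("G", "noncoding")] * 10
    + [("C", "noncoding")] * 10
)
models = map_train(training, n_models=5)
predicted = reduce_vote(models, "ATGC")
```

Because each classifier depends only on its own sample, the training calls are independent of one another, which is the property that lets a framework such as MapReduce run them in parallel. A Select Best variant would instead keep the single classifier that scores highest on held-out data rather than voting.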
Recommended Citation
Jahnke, Glenn, "MRCRAIG: MapReduce and Ensemble Classifiers for Parallelizing Data Classification Problems" (2009). Master's Projects. 143.
DOI: https://doi.org/10.31979/etd.8fvj-43n5
https://scholarworks.sjsu.edu/etd_projects/143