Publication Date

Summer 2011

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Computer Science

First Advisor

T. Y. Lin

Second Advisor

Robert Chun

Third Advisor

Soon Tee Teoh


Learning Algorithms Automata


This report documents the theory, implementation, and test results of our project. The project builds on an automata-based learning system that models authors’ writing characteristics with automata. Continuing previous work by Dr. T. Y. Lin and Ms. S. X. Zhang, we analyze the ALERGIA algorithm and begin a study of common patterns. Although every author has an individual writing style, reflected in features such as sentence length and word frequency, different authors’ styles always share some similarities. We hypothesize that strings common to many authors obscured the expected test results, much as noise obscures a radio signal. This report presents the design and implementation of the common-pattern search, together with its test results. It also describes our implementation of the ALERGIA algorithm, based on the paper “Learning Stochastic Regular Grammars by Means of a State Merging Method” by Rafael C. Carrasco and Jose Oncina [2]. The code is written in Java 6 using the Eclipse Helios release.
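
As a rough illustration of the state-merging idea behind ALERGIA, the sketch below shows the Hoeffding-style frequency comparison that the Carrasco and Oncina paper uses to decide whether two states behave differently. It is a minimal sketch, not the project's code: the class name, method names, and the confidence parameter value are hypothetical. The full algorithm applies the same comparison to termination counts and to every outgoing symbol, and then recurses on successor states, none of which is shown here.

```java
// Hypothetical helper, written for illustration only (Java 6 compatible).
final class HoeffdingCompatibility {
    private HoeffdingCompatibility() {}

    // Largest gap between two observed frequencies that is still treated
    // as compatible, given n1 and n2 observations and confidence alpha.
    static double bound(int n1, int n2, double alpha) {
        return Math.sqrt(0.5 * Math.log(2.0 / alpha))
                * (1.0 / Math.sqrt(n1) + 1.0 / Math.sqrt(n2));
    }

    // True when the frequencies f1/n1 and f2/n2 differ by more than the
    // bound, i.e. the two states should not be merged on this evidence.
    static boolean different(int f1, int n1, int f2, int n2, double alpha) {
        if (n1 == 0 || n2 == 0) {
            return false; // assumption: no observations means no evidence of a difference
        }
        double gap = Math.abs((double) f1 / n1 - (double) f2 / n2);
        return gap > bound(n1, n2, alpha);
    }

    public static void main(String[] args) {
        // 50/100 vs 55/100: gap 0.05 is below the bound, so compatible.
        System.out.println(different(50, 100, 55, 100, 0.05));   // prints false
        // 10/1000 vs 500/1000: gap 0.49 exceeds the bound, so different.
        System.out.println(different(10, 1000, 500, 1000, 0.05)); // prints true
    }
}
```

A smaller alpha widens the bound, so fewer state pairs are judged different and more merges happen; a larger alpha makes the test stricter about merging.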