Master of Science (MS)
T. Y. Lin
In terms of Kolmogorov complexity, a finite set x of strings has a pattern if x can be output by a Turing machine whose length is less than the minimum of all |x|; such a Turing machine, which need not be unique, is called a pattern of the finite set of strings. To find a pattern of a given finite set of strings (assuming such a pattern exists), the ALERGIA algorithm is used to approximate the pattern (a Turing machine) by a finite automaton. Because each finite automaton defines a partition of the formal language Σ*, the ALERGIA algorithm can be viewed as a Granular Rough Computing based approximation: any subset of Σ*, such as a set of DNA sequences, can be approximated by equivalence classes. Based on this view, this thesis analyzes and improves the ALERGIA algorithm via minimization of the deterministic finite automaton. Hypothesis testing indicates that the minimization does improve ALERGIA, so the new method should be highly usable in pattern recognition and data mining.
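The minimization step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which minimization procedure the thesis uses; the code below assumes Moore-style partition refinement, a complete transition function, and states that are all reachable and sortable. The function names (minimize_dfa, accepts) and the example automaton are hypothetical and chosen only for illustration. The blocks it computes are the equivalence classes of states, which is one way to see how an automaton induces a partition.

```python
def minimize_dfa(states, alphabet, delta, start, accepting):
    """Merge equivalent states of a complete DFA by partition refinement.

    Assumptions (not from the thesis): delta is total, every state is
    reachable from start, and states can be sorted.
    """
    # Initial partition: accepting versus non-accepting states.
    block = {s: (s in accepting) for s in states}
    while True:
        n_before = len(set(block.values()))
        # A state's signature is its own block plus the blocks of its successors.
        sig = {
            s: (block[s],) + tuple(block[delta[(s, a)]] for a in alphabet)
            for s in states
        }
        # Give each distinct signature a fresh integer block id.
        ids = {}
        block = {s: ids.setdefault(sig[s], len(ids)) for s in sorted(states)}
        # Refinement only splits blocks, so an unchanged count means stability.
        if len(set(block.values())) == n_before:
            break
    min_delta = {(block[s], a): block[delta[(s, a)]] for s in states for a in alphabet}
    min_accepting = {block[s] for s in accepting}
    return set(block.values()), min_delta, block[start], min_accepting


def accepts(delta, start, accepting, word):
    """Run a DFA on word and report whether it ends in an accepting state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


# Hypothetical example: strings over {a, b} that end in 'a'.
# States 0 and 2 behave identically, so they should be merged.
alphabet = ["a", "b"]
delta = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 2,
}
m_states, m_delta, m_start, m_accepting = minimize_dfa(
    {0, 1, 2}, alphabet, delta, 0, {1}
)
```

In this toy automaton the three original states collapse to two blocks, and the minimized automaton accepts the same strings as the original. ALERGIA itself builds a stochastic automaton from sample frequencies; this sketch covers only the structural merging of equivalent states and ignores transition probabilities.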
Qi, Xuanyi, "Analysis on ALERGIA Algorithm: Pattern Recognition by Automata Theory" (2016). Master's Projects. 491.