Publication Date

Fall 2010

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Department

Computer Science

First Advisor

Mark Stamp

Second Advisor

Robert Chun

Third Advisor

Teng Moh


Keywords

intrusion detection systems, hidden Markov model


Abstract

In modern computer systems, usernames and passwords are by far the most common form of authentication. A security system that relies only on password protection is defenseless when the passwords of legitimate users are compromised: a masquerader can impersonate a legitimate user simply by using a compromised password.

An intrusion detection system (IDS) can provide an additional level of protection for a security system by inspecting user behavior. In terms of detection technique, there are two types of IDS: signature-based and anomaly-based. An anomaly-based intrusion detection technique consists of two steps: (1) creating a normal behavior model for legitimate users during the training process, and (2) analyzing user behavior against the model during the detection process.

In this project, we concentrate on masquerade detection, a specific application of anomaly-based intrusion detection. We first explored suitable techniques for building a normal behavior model for masquerade detection. After studying two existing modeling techniques, N-gram frequency and hidden Markov models (HMMs), we developed a novel approach based on profile hidden Markov models (PHMMs). We then analyzed all three approaches using the classical Schonlau data set. To find the best detection results, we also conducted a sensitivity analysis on the modeling parameters. We found that our proposed PHMMs do not outperform the corresponding HMMs. We conjectured that the Schonlau data set lacks the position information required by the PHMMs. To test this conjecture, we generated several data sets with position information. Our experimental results show that when training data is insufficient, the PHMMs yield considerably better detection results than the corresponding HMMs, since the generated position information is significantly helpful for the PHMMs.
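
The two-step anomaly-based process described above (train a normal behavior model, then score new behavior against it) can be illustrated with a minimal N-gram sketch in Python. This is not the project's implementation: the function names, the "fraction of unseen N-grams" score, and the threshold value are all assumptions chosen for illustration, and the command sequences are made up rather than taken from the Schonlau data set.

```python
from collections import Counter


def ngrams(commands, n):
    """Return all n-grams (as tuples) of a command sequence."""
    return [tuple(commands[i:i + n]) for i in range(len(commands) - n + 1)]


def train_profile(training_commands, n=2):
    """Training step: count each n-gram in the legitimate user's history."""
    return Counter(ngrams(training_commands, n))


def anomaly_score(profile, block, n=2):
    """Detection step: fraction of n-grams in the block never seen in training.

    Hypothetical scoring rule for illustration; 0.0 means fully familiar,
    1.0 means entirely unfamiliar.
    """
    grams = ngrams(block, n)
    if not grams:
        return 0.0
    unseen = sum(1 for g in grams if g not in profile)
    return unseen / len(grams)


def is_masquerade(profile, block, n=2, threshold=0.5):
    """Flag a block whose anomaly score exceeds an assumed threshold."""
    return anomaly_score(profile, block, n) > threshold


# Made-up command history for a legitimate user.
profile = train_profile(["ls", "cd", "ls", "cat", "ls", "cd"])

familiar_block = ["ls", "cd", "ls"]
unfamiliar_block = ["nc", "wget", "chmod"]

print(anomaly_score(profile, familiar_block))
print(anomaly_score(profile, unfamiliar_block))
print(is_masquerade(profile, unfamiliar_block))
```

In practice the threshold would be tuned on held-out data, which is the kind of parameter sensitivity analysis the abstract mentions. An HMM or PHMM would replace the N-gram counts with a probabilistic model of the sequence and the unseen-fraction score with a likelihood.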