Reconstructing Classification to Enhance Machine-Learning Based Network Intrusion Detection by Embracing Ambiguity

Publication Date

1-1-2021

Document Type

Conference Proceeding

Publication Title

Communications in Computer and Information Science

Volume

1383 CCIS

DOI

10.1007/978-3-030-72725-3_13

First Page

169

Last Page

187

Abstract

Network intrusion detection systems (IDS) have efficiently identified the profiles of normal network activities, extracted intrusion patterns, and constructed generalized models to evaluate known and unknown attacks using a wide range of machine learning approaches. Despite the effectiveness of machine learning-based IDS, reducing the high rate of false alarms caused by data misclassification remains challenging. In this paper, we propose a new classification method that uses multiple decision mechanisms to identify misclassified data and then sorts the data into three classes: malicious, benign, and ambiguous. The ambiguous dataset contains the majority of the misclassified data and, because the model lacks confidence in classifying it, is the most informative for improving the model and anomaly detection. We evaluate our approach on recent real-world network traffic data, the Kyoto2006+ dataset, and show that the ambiguous dataset contains 77.2% of the previously misclassified data. Re-evaluating the ambiguous dataset effectively reduces the false prediction rate with minimal overhead and improves accuracy by 15%.
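
The three-way split described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the decision mechanisms or how they are combined, so everything below is an assumption made for illustration: the unanimity rule (a sample is labeled malicious or benign only when every classifier agrees, and ambiguous otherwise), the function names `triage` and `split_dataset`, and the three toy threshold classifiers with their made-up features. It is not the authors' method.

```python
from typing import Callable, Dict, List, Sequence

# Assumption: a "decision mechanism" is any function that maps a feature
# vector to 0 (benign) or 1 (malicious). The paper's actual mechanisms
# are not given in the abstract.
Classifier = Callable[[Sequence[float]], int]


def triage(sample: Sequence[float], classifiers: List[Classifier]) -> str:
    """Label a sample by unanimous vote; any disagreement means 'ambiguous'."""
    if not classifiers:
        raise ValueError("at least one classifier is required")
    votes = [clf(sample) for clf in classifiers]
    if all(v == 1 for v in votes):
        return "malicious"
    if all(v == 0 for v in votes):
        return "benign"
    return "ambiguous"


def split_dataset(samples, classifiers: List[Classifier]) -> Dict[str, list]:
    """Partition samples into malicious, benign, and ambiguous buckets."""
    buckets: Dict[str, list] = {"malicious": [], "benign": [], "ambiguous": []}
    for s in samples:
        buckets[triage(s, classifiers)].append(s)
    return buckets


# Hypothetical stand-in classifiers: simple thresholds on invented features
# (duration, bytes transferred, error rate). Real models would be trained.
def by_duration(x: Sequence[float]) -> int:
    return 1 if x[0] > 5.0 else 0


def by_bytes(x: Sequence[float]) -> int:
    return 1 if x[1] > 1000.0 else 0


def by_error_rate(x: Sequence[float]) -> int:
    return 1 if x[2] > 0.5 else 0


CLASSIFIERS = [by_duration, by_bytes, by_error_rate]
```

Under this sketch, only the samples in the ambiguous bucket would be passed on for re-evaluation, which is one way to read the abstract's claim that re-evaluation adds minimal overhead.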

Funding Number

18-086

Funding Sponsor

National Science Foundation

Keywords

Ensemble classifiers, Machine learning, Network intrusion detection

Department

Computer Engineering
