Publication Date

Fall 2025

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

Mark Stamp

Second Advisor

Jelena Gligorijevic

Third Advisor

Katerina Potika

Keywords

Concept Drift, Model Retraining, Malware Detection, One-Class Support Vector Machines, Minibatch K-Means

Abstract

Concept drift refers to changes over time in the statistical properties of data compared to the data used to train a learning model. Machine learning models for malware detection are particularly susceptible to performance degradation due to concept drift, as attackers continually modify existing malware. We consider two unsupervised machine learning approaches to automated concept drift detection: One-Class Support Vector Machines (OCSVM) and Minibatch K-Means (MK-Means). We compare these techniques to Maximum Mean Discrepancy (MMD), a statistical technique for detecting distribution shift. We conduct experiments comparing four models (MLP, RF, SVM, XGB) on the KronoDroid malware dataset across three scenarios: static (no retraining), drift-aware (retraining when drift is detected), and periodic (constant retraining). In most cases, drift-aware retraining based on OCSVM, MK-Means, or MMD performs almost as well as periodic retraining while retraining far fewer models.
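
As a rough illustration of the drift-aware scenario described in the abstract, the following sketch computes a biased estimate of squared MMD with a Gaussian (RBF) kernel and flags a retraining point whenever a new batch differs enough from the current reference batch. It is not the project's implementation: the function names, the kernel bandwidth `gamma`, the `threshold`, and the synthetic Gaussian batches are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(X, Y, gamma=0.5):
    """Biased estimate of squared MMD between samples X and Y."""
    def mean_kernel(A, B):
        total = sum(rbf_kernel(a, b, gamma) for a in A for b in B)
        return total / (len(A) * len(B))
    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2.0 * mean_kernel(X, Y)


def drift_detected(reference, batch, threshold=0.1, gamma=0.5):
    """Flag drift when the MMD estimate exceeds an assumed threshold."""
    return mmd_squared(reference, batch, gamma) > threshold


def drift_aware_schedule(batches, threshold=0.1, gamma=0.5):
    """Return the indices of batches at which retraining would be triggered.

    The first batch is the initial reference (training) data. When drift is
    detected, the drifting batch becomes the new reference, standing in for
    a retrained model.
    """
    reference = batches[0]
    retrain_at = []
    for i, batch in enumerate(batches[1:], start=1):
        if drift_detected(reference, batch, threshold, gamma):
            retrain_at.append(i)
            reference = batch
    return retrain_at


def make_batch(rng, mean, n=100, dim=2):
    """Synthetic batch: n points drawn from an isotropic unit-variance Gaussian."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


# Hypothetical data stream: three batches around 0, then three shifted to 2.
rng = random.Random(0)
batches = [make_batch(rng, m) for m in (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)]
schedule = drift_aware_schedule(batches)
print("retrain at batch indices:", schedule)
```

In this toy stream, periodic retraining would retrain at every batch, whereas the drift-aware schedule retrains only where the MMD estimate crosses the threshold. In practice the threshold would need calibrating, for example with a permutation test on the reference data.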

Available for download on Saturday, December 19, 2026
