Publication Date
Spring 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Fabio Di Troia
Second Advisor
Robert Chun
Third Advisor
William B. Andreopoulos
Keywords
Maliciousness of URLs, SHAP, LIME
Abstract
No system has ever reached the level of proliferation that the Internet now enjoys. It is the most widely deployed distributed system in the world, yet its growth has brought an ever-growing wave of malicious activity that threatens every user and organization in cyberspace. Malicious URLs are a major vulnerability that leaves users exposed as they browse, and cybersecurity experts build increasingly complex models to stem this tide and shield users from cybercrime. Understanding how these models reach their decisions is of key importance: only through that understanding can robust protections for users and platforms be built. Machine learning models are often described as black boxes because their inner workings are hidden from view. This paper addresses that problem by investigating the interpretability of machine learning models, with a specific focus on their use in detecting malicious URLs. Five models are considered (MLP, deep models, RandomForest Classifier, SVM, and XGBoost), with attention paid to identifying the most effective one, and each is examined with the SHAP and LIME techniques in the hope that this dual approach will illuminate different aspects of each model's operation. Through an in-depth analysis and comparison of the two methods (SHAP and LIME), the work aims to shed more light on how these models differ in their workings and on where their explanations can support precise cybersecurity decisions.
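As a rough illustration of the idea behind SHAP mentioned in the abstract, the sketch below computes exact Shapley values for a single URL by enumerating feature subsets. It is not the project's method or code: the feature set (`url_features`), the hand-written linear scorer (`toy_score`), the baseline URL, and all coefficients are hypothetical stand-ins for a trained model. The SHAP library approximates this kind of attribution efficiently for real models, while LIME instead fits a simple surrogate model around one prediction.

```python
from itertools import combinations
from math import factorial
from urllib.parse import urlparse


def url_features(url):
    """Hypothetical lexical features; not the project's actual feature set."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": float(len(url)),
        "num_dots": float(host.count(".")),
        "has_ip_host": float(host.replace(".", "").isdigit()),
        "uses_https": float(parsed.scheme == "https"),
    }


def toy_score(x):
    """Hypothetical linear scorer standing in for a trained classifier."""
    return (0.01 * x["length"] + 0.2 * x["num_dots"]
            + 1.5 * x["has_ip_host"] - 0.5 * x["uses_https"])


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a subset take baseline values."""
    names = list(x)
    n = len(names)

    def value(subset):
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


# Explain one suspicious-looking URL relative to a benign-looking baseline.
x = url_features("http://192.168.10.5/login/update.php")
baseline = url_features("https://example.com/")
phi = shapley_values(toy_score, x, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:12s} {contribution:+.3f}")
```

The contributions sum to the difference between the score of the explained URL and the score of the baseline, which is the additivity property that makes this style of attribution easy to read. Exhaustive enumeration grows exponentially with the number of features, so it is only practical for toy examples like this one.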
Recommended Citation
Nair, Ayush, "Explaining the Maliciousness of URLs using SHAP and LIME" (2024). Master's Projects. 1383.
DOI: https://doi.org/10.31979/etd.mhmn-6syz
https://scholarworks.sjsu.edu/etd_projects/1383