Publication Date
Fall 2025
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Fabio Di Troia
Second Advisor
William Andreopoulos
Third Advisor
Ravi Teja
Keywords
Encrypted Traffic Detection, HTTPS Malware, TLS Metadata, X.509 Features, Machine Learning, Zeek Logs
Abstract
Encrypted HTTPS traffic now dominates the Internet, and malware increasingly uses TLS to conceal command-and-control activity. Because payloads cannot be inspected, detection must rely on metadata such as TLS handshake fields and certificate attributes, which prior work has shown can still reveal malicious behavior. This research evaluates whether malicious HTTPS connections can be detected using only metadata from Zeek logs. Using the CTU-SME-11 dataset, we build a reproducible preprocessing pipeline and a 33-feature connection-level representation capturing flow statistics, TLS behavior, and certificate validity characteristics. We evaluate XGBoost, multilayer perceptrons, and several CNN variants, including 1D and 2D grid-based embeddings, using a stratified capture-level split and 5-fold capture-aware cross-validation to prevent leakage. Results show strong discriminative performance, with XGBoost achieving the highest ROC-AUC and PR-AUC, and CNN-based models, particularly an 8×8 architecture, achieving the strongest malicious-class F1-scores. These findings show that metadata-based models can accurately detect encrypted malicious traffic and motivate future work on generalization, calibration, and explainability.
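The capture-aware cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the project's code: the function name `capture_aware_folds`, the greedy size-balancing strategy, and the capture labels in the usage note are all assumptions made for illustration, and label stratification is omitted for brevity. The one property it demonstrates is that every connection from a given capture falls on the same side of each split, so no capture contributes to both training and testing.

```python
from collections import defaultdict


def capture_aware_folds(capture_ids, n_folds=5):
    """Yield (train_indices, test_indices) with no capture shared across sides.

    capture_ids[i] is the (hypothetical) capture label of connection i.
    """
    # Count how many connections each capture contributes.
    sizes = defaultdict(int)
    for cid in capture_ids:
        sizes[cid] += 1

    # Greedy balancing (an assumption, not the source's method):
    # place the largest captures first, each into the currently smallest fold.
    fold_sizes = [0] * n_folds
    fold_of = {}
    for cid, size in sorted(sizes.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        target = min(range(n_folds), key=lambda f: fold_sizes[f])
        fold_of[cid] = target
        fold_sizes[target] += size

    # Every connection follows its capture into that capture's fold.
    for fold in range(n_folds):
        test = [i for i, cid in enumerate(capture_ids) if fold_of[cid] == fold]
        train = [i for i, cid in enumerate(capture_ids) if fold_of[cid] != fold]
        yield train, test


if __name__ == "__main__":
    # Hypothetical capture labels, one per connection.
    caps = ["a", "a", "b", "c", "c", "c", "d", "e", "e", "f"]
    for train, test in capture_aware_folds(caps, n_folds=5):
        shared = {caps[i] for i in train} & {caps[i] for i in test}
        print(len(train), len(test), "shared captures:", len(shared))
```

Splitting by connection instead of by capture would let near-identical flows from the same host and session appear on both sides of a split, which tends to inflate reported scores.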
Recommended Citation
Pasari, Suyash, "Detecting Malicious Encrypted Network Traffic Using Deep Learning and CNN-Based Feature Representations" (2025). Master's Projects. 1598.
DOI: https://doi.org/10.31979/etd.chtp-zqkt
https://scholarworks.sjsu.edu/etd_projects/1598