Publication Date

Spring 2016

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Computer Science

First Advisor

Mark Stamp

Second Advisor

Thomas Austin

Third Advisor

Fabio Di Troia


image-based bulk email spam SVM PCA


Image spam is unsolicited bulk email, where the message is embedded in an image. This technique is used to evade text-based spam lters. In this research, we analyze and compare two novel approaches for detecting spam images. Our rst approach focuses on the extraction of a broad set of image features and selection of an optimal subset using a Support Vector Machine (SVM). Our second approach is based on Principal Component Analysis (PCA), where we determine eigenvectors for a set of spam images and compute scores by projecting images onto the resulting eigenspace. Both approaches provide high accuracy with low computational complexity. Further, we develop a new spam image dataset that should prove valuable for improving image spam detection capabilities.