Publication Date

Spring 2016

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Mark Stamp

Second Advisor

Thomas Austin

Third Advisor

Fabio Di Troia

Keywords

image-based bulk email spam, SVM, PCA

Abstract

Image spam is unsolicited bulk email in which the message is embedded in an image. This technique is used to evade text-based spam filters. In this research, we analyze and compare two novel approaches for detecting spam images. Our first approach focuses on the extraction of a broad set of image features and selection of an optimal subset using a Support Vector Machine (SVM). Our second approach is based on Principal Component Analysis (PCA), where we determine eigenvectors for a set of spam images and compute scores by projecting images onto the resulting eigenspace. Both approaches provide high accuracy with low computational complexity. Further, we develop a new spam image dataset that should prove valuable for improving image spam detection capabilities.
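The PCA-based approach described in the abstract can be illustrated with a minimal sketch. It is not the project's implementation. The abstract does not give the scoring rule, so the one used here is an assumption: an eigenfaces-style score equal to the minimum Euclidean distance between an image's projection and the projections of the training spam images. The function names `fit_eigenspace` and `score`, the choice of `k`, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_eigenspace(train, k):
    """Build a k-dimensional eigenspace from flattened training images.

    train: array of shape (n_images, n_pixels), one flattened image per row.
    Returns the mean image, the top-k eigenvectors (as rows), and the
    projection weights of every training image.
    """
    mean = train.mean(axis=0)
    centered = train - mean
    # The right singular vectors of the centered data are the eigenvectors
    # of its covariance matrix, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]
    weights = centered @ basis.T
    return mean, basis, weights


def score(image, mean, basis, weights):
    """Assumed scoring rule: distance to the nearest training projection.

    A low score means the image lies close to the training spam images
    in the eigenspace.
    """
    w = (image - mean) @ basis.T
    return float(np.min(np.linalg.norm(weights - w, axis=1)))


# Synthetic stand-in for spam images: a shared template plus small noise.
rng = np.random.default_rng(0)
template = rng.random(64)
train = template + 0.1 * rng.standard_normal((20, 64))

mean, basis, weights = fit_eigenspace(train, k=5)
seen = score(train[0], mean, basis, weights)
shifted = score(train[0] + 10 * basis[0], mean, basis, weights)
print(f"training image: {seen:.3g}, shifted image: {shifted:.3g}")
```

In this sketch a training image scores essentially zero, while an image displaced along the leading eigenvector scores much higher. Any decision threshold on such a score would have to be tuned on labeled data.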
