Image spam analysis and detection

Publication Date

February 2018

Document Type

Article

Publication Title

Journal of Computer Virology and Hacking Techniques

Volume

14

Issue

1

DOI

10.1007/s11416-016-0287-x

First Page

39

Last Page

52

Abstract

Image spam is unsolicited bulk email in which the message is embedded in an image. Spammers use such images to evade text-based filters. In this research, we analyze and compare two methods for detecting spam images. First, we consider principal component analysis (PCA), where we determine eigenvectors corresponding to a set of spam images and compute scores by projecting images onto the resulting eigenspace. The second approach focuses on extracting a broad set of image features and selecting an optimal subset using support vector machines (SVM). Both of these detection strategies provide high accuracy with low computational complexity. Further, we develop a new spam image dataset that cannot be detected using either our PCA or our SVM approach. This new dataset should prove valuable for improving image spam detection capabilities.
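
The PCA step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit_eigenspace`, `score`), the use of NumPy's SVD to obtain the eigenvectors, the nearest-neighbor distance used as the score, and the synthetic random data standing in for real images are all assumptions made for illustration.

```python
import numpy as np


def fit_eigenspace(train, k):
    """Build a k-dimensional eigenspace from flattened training images.

    train: array of shape (n_images, n_pixels), one flattened image per row.
    Returns the mean image, the top-k eigenvectors, and the training weights.
    """
    mean = train.mean(axis=0)
    centered = train - mean
    # The rows of vt are the eigenvectors of the covariance matrix of the
    # centered data, ordered by decreasing eigenvalue.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                   # shape (k, n_pixels)
    weights = centered @ basis.T     # shape (n_images, k)
    return mean, basis, weights


def score(image, mean, basis, weights):
    """Project an image onto the eigenspace and score it.

    Assumed scoring rule: Euclidean distance from the projected image to the
    nearest projected training image. Lower means more similar to training.
    """
    w = (image - mean) @ basis.T
    return float(np.min(np.linalg.norm(weights - w, axis=1)))


# Synthetic stand-in data (hypothetical): 20 "spam images" of 64 pixels each.
rng = np.random.default_rng(0)
spam_train = rng.normal(0.0, 1.0, size=(20, 64))
mean, basis, weights = fit_eigenspace(spam_train, k=5)

# A training image projects onto its own weight vector, so its score is
# essentially zero; an image pushed far along the first eigenvector is not.
seen_score = score(spam_train[0], mean, basis, weights)
shifted_score = score(spam_train[0] + 100.0 * basis[0], mean, basis, weights)
```

In a setup like this, a threshold on the score would separate images that resemble the training spam from those that do not; how that threshold is chosen is outside the scope of the sketch.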
