Master of Science (MS)
image neural net attacks, adversarial retraining techniques
With recent advancements in the field of artificial intelligence, deep learning has created a niche in the technology space and is actively used in autonomous and IoT systems globally. Unfortunately, these deep learning models are susceptible to adversarial attacks, which can severely compromise their integrity. Research has shown that many state-of-the-art models are vulnerable to well-crafted adversarial examples. These adversarial examples are perturbed versions of clean data to which a small amount of noise has been added. The perturbations are imperceptible to the human eye but can easily fool the targeted model. The exposed vulnerabilities of these models call into question their usability in safety-critical real-world applications such as autonomous driving and medicine. In this work, I document the effectiveness of six gradient-based adversarial attacks on the ResNet image-recognition model. Defending against these adversaries is a difficult problem, and adversarial retraining has been one of the most widely used defense techniques. Adversarial retraining aims to train a more robust model that can proactively handle adversarial examples. I demonstrate the limitations of the traditional adversarial retraining technique, which is effective against some adversaries but fails against more sophisticated attacks. I present a new ensemble defense strategy based on adversarial retraining that withstands six adversarial attacks on the CIFAR-10 dataset with accuracy of at least 89.31% and as high as 96.24%.
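The gradient-based attacks discussed above generally follow the pattern of the Fast Gradient Sign Method (FGSM): perturb the input in the direction of the sign of the loss gradient with respect to the input, bounded by a small budget eps. Below is a minimal, hedged sketch of that idea on a toy logistic classifier; the toy model, weights, and eps value are illustrative stand-ins, not the thesis's actual setup (which attacks ResNet on CIFAR-10).

```python
import numpy as np

# Toy "trained" linear classifier standing in for a deep model (assumption:
# the real work attacks ResNet; this keeps the example self-contained).
rng = np.random.default_rng(0)
w = rng.normal(size=8)   # fixed weights
b = 0.1                  # fixed bias

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xent(x, y):
    # Binary cross-entropy loss of the toy model on input x with label y.
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def grad_loss_wrt_input(x, y):
    # For the logistic model, d(loss)/dx = (p - y) * w, computed analytically
    # here instead of via autodiff to avoid framework dependencies.
    p = sigmoid(w @ x + b)
    return (p - y) * w

def fgsm(x, y, eps=0.25):
    # FGSM: step along the sign of the input gradient, so the perturbation
    # is bounded by eps in the L-infinity norm.
    return x + eps * np.sign(grad_loss_wrt_input(x, y))

x = rng.normal(size=8)   # a clean sample
y = 1.0                  # its true label
x_adv = fgsm(x, y)       # adversarial version of x

# The perturbation is small (max-norm <= eps) yet increases the loss,
# which is how imperceptible noise degrades the model's prediction.
```

Adversarial retraining, in turn, augments the training set with such perturbed samples (here, outputs of `fgsm`) and retrains so the model learns to classify them correctly; the ensemble strategy in this work combines retraining against multiple attack types.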
Mani, Nag, "On Adversarial Attacks on Deep Learning Models" (2019). Master's Projects. 742.