Publication Date

Spring 2019

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Department

Computer Science

First Advisor

Melody Moh

Second Advisor

Teng Moh

Third Advisor

Chris Pollett


Keywords

audio captchas, BIM, DeepFool

Abstract


CAPTCHA is a web-based authentication method that websites use to distinguish humans (valid users) from bots (attackers). Audio captchas are an accessible alternative for users with visual impairments, such as blind, color-blind, or near-sighted users. In this project, I analyzed the security of audio captchas against attacks that employ machine learning and deep learning models. I analyzed audio captchas of varying lengths (5, 7, and 10) and varying levels of background noise (none, medium, and high). Audio captchas with no or medium background noise were easily broken, with 99% to 100% accuracy, whereas audio captchas with high noise were relatively more secure, with a breaking accuracy of 85%. I also propose that adversarial example attacks can be turned to the advantage of audio captchas, that is, used to defend them against attackers. I explored two adversarial example attack algorithms, the Basic Iterative Method (BIM) and DeepFool, to create new adversarial audio captchas. Finally, I analyzed the security of these adversarial audio captchas by simulating Level I and Level II defense scenarios. A Level I defense is a defense against pre-trained models that have never seen adversarial examples, whereas a Level II defense is a defense against models that have been retrained on adversarial examples. My experiments show that a Level I defense can prevent nearly 100% of attacks from pre-trained models. They also show that a Level II defense increases the security of audio captchas by 57% to 67%. Real-world scenarios such as multiple retries are also studied, and related defense mechanisms are suggested.
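
To illustrate the Basic Iterative Method mentioned in the abstract, the following is a minimal sketch under stated assumptions, not the project's implementation. It applies BIM to a toy logistic classifier over a three-sample "audio" vector. The model, its weights, the function names (`predict_prob`, `loss_gradient`, `bim`), and the parameter values (`eps`, `alpha`, `steps`) are all hypothetical choices made for illustration. BIM repeatedly takes a small step in the direction of the sign of the loss gradient and clips the result so that it stays within `eps` of the original input.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def predict_prob(w, b, x):
    """Toy logistic model (an assumption): probability that x is class 1."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict_prob(w, b, x)
    return [(p - y) * wi for wi in w]


def bim(w, b, x, y, eps=0.3, alpha=0.05, steps=10):
    """Basic Iterative Method: iterated signed-gradient steps with clipping.

    eps bounds the total per-sample perturbation, alpha is the step size.
    Samples are assumed to be normalized to the range [-1, 1].
    """
    x_adv = list(x)
    for _ in range(steps):
        grad = loss_gradient(w, b, x_adv, y)
        # Step in the direction that increases the loss for the true label.
        x_adv = [xi + alpha * sign(g) for xi, g in zip(x_adv, grad)]
        # Keep every sample within eps of the original input.
        x_adv = [min(max(xa, x0 - eps), x0 + eps) for xa, x0 in zip(x_adv, x)]
        # Keep every sample inside the assumed valid amplitude range.
        x_adv = [min(max(xa, -1.0), 1.0) for xa in x_adv]
    return x_adv


# Hypothetical toy model and input, chosen only for the demonstration.
weights = [2.0, -1.0, 0.5]
bias = 0.0
clean = [0.3, -0.2, 0.1]
label = 1

adversarial = bim(weights, bias, clean, label)
print("clean probability:", predict_prob(weights, bias, clean))
print("adversarial probability:", predict_prob(weights, bias, adversarial))
```

In this toy setting the clean input is classified as class 1 while the perturbed input falls on the other side of the decision boundary, even though no sample moves by more than `eps`. A real audio captcha defense would apply the same update to the gradients of a trained speech recognition model rather than a hand-written linear one.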