Publication Date

Spring 2024

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

Ching-Seh Wu

Second Advisor

Navrati Saxena

Third Advisor

Robert Chun

Keywords

Bird Species Identification, Deep Learning, Audio Classification, MixIT, Source Separation, Noise Reduction, Transformers, Audio Spectrogram Transformer, EfficientNet

Abstract

The identification of bird species using deep learning techniques presents a novel approach in bioacoustics, significantly advancing our understanding of and capabilities in bird species recognition from audio recordings. Automated recording devices deployed for remote wildlife sensing highlight the value of audio over visual data for monitoring ecological patterns in birds, offering a more cost-effective, non-invasive, and practical solution. However, processing and classifying the audio remains challenging because bird audio is complex, characterized by diverse vocalizations and pervasive environmental noise, which makes effective classification difficult. The rapidly evolving field of machine learning, particularly deep learning, has shown promising results in processing and interpreting complex audio data. Using a sound separation technique known as Mixture Invariant Training (MixIT) enhances the potential for accurate and efficient bird species identification. Bird audio classification was performed with several deep learning models: Convolutional Neural Network (CNN) models such as EfficientNet and ResNet, and Transformer-based models such as Audio Spectrogram Transformer (AST), Vision Transformer, and Wav2Vec2 Transformer. The findings show that applying deep learning to MixIT-processed data improved accuracy by about 12 percentage points, from 70.53% to 82.49%, for the Audio Spectrogram Transformer and by about 16 percentage points, from 65.22% to 81.19%, for EfficientNet.

Recommended Citation

Kosuru, Sasanka, "BIRDSONG CLASSIFICATION USING DEEP LEARNING AND MIXIT" (2024). Master's Projects. 1376.
DOI: https://doi.org/10.31979/etd.7v78-vrv5
https://scholarworks.sjsu.edu/etd_projects/1376

Download

Available for download on Friday, May 23, 2025

Included in

Other Computer Engineering Commons

