Publication Date
Fall 2021
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Electrical Engineering
Advisor
Birsen Sirkeci
Subject Areas
Electrical engineering
Abstract
Machine learning used in the medical industry can potentially detect cancer in human cells at an early stage. However, training machine learning models, especially deep learning models, requires thousands to millions of samples in order to reach an acceptable accuracy level. It is well known that obtaining medical data is tedious; hence, in most cases, medical datasets have a limited number of data samples. One solution to this problem is transfer learning, such as using networks pretrained on another dataset. Another solution is to increase the number of training data points with data augmentation. Common data augmentation methods for images include not only simple techniques, such as transforming images using rotation and flipping, but also generative adversarial networks (GANs). However, one critical question is “Does the original dataset have enough samples to train a GAN?”. In most scenarios, the answer to this question is “No”. In this thesis, we propose a two-level data augmentation technique (simple data augmentation based on image transformations followed by a GAN) with transfer learning, which is tested on a small dataset of cancer cell images. The dataset used in this research consists of lung and colon cancer samples, each containing different types of cancers. Only part of the original dataset is used for experimentation in order to mimic a small-dataset environment. Our results show that the proposed method is able to achieve an accuracy of 94.1% even when only 150 original images are used for training. This is very close to the 97.33% accuracy achieved when all of the available training data, 12,000 samples, is used.
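As a rough illustration of the first augmentation level only (simple image transformations), the following sketch expands a small set of images using rotation and flipping, the two transformations named in the abstract. It is not the thesis implementation: the function name `simple_augment`, the 8 x 8 x 3 placeholder images, and the choice of eight rotation/flip variants per image are assumptions made for this example. The GAN stage and the transfer learning stage are not shown.

```python
import numpy as np


def simple_augment(image):
    """Return 8 rotation/flip variants of a square H x W x C image.

    Hypothetical helper: the exact transformations used in the thesis
    are not specified here beyond "rotation and flipping".
    """
    variants = []
    for k in range(4):
        # Rotate by k * 90 degrees in the image plane.
        rotated = np.rot90(image, k=k, axes=(0, 1))
        variants.append(rotated)
        # Add the horizontally flipped copy of each rotation.
        variants.append(np.flip(rotated, axis=1))
    return variants


# Placeholder data standing in for 150 original training images.
rng = np.random.default_rng(0)
originals = [rng.random((8, 8, 3)) for _ in range(150)]

# First-level augmentation: every original yields 8 variants.
augmented = [v for img in originals for v in simple_augment(img)]
print(len(augmented))  # 1200
```

Under these assumptions, 150 originals become 1,200 images. In a two-level scheme of the kind the abstract describes, an enlarged set like this would then be the input for training the GAN.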
Recommended Citation
Pudota, Nihil, "Two-Level Data Augmentation with Transfer Learning for Classification of Medical Images with Limited Data" (2021). Master's Theses. 5243.
DOI: https://doi.org/10.31979/etd.9462-4z5q
https://scholarworks.sjsu.edu/etd_theses/5243