Faculty Research, Scholarly, and Creative Activity

Analyzing and Addressing Data-driven Fairness Issues in Machine Learning Models used for Societal Problems

Vishnu S. Pendyala, San Jose State University
HyungKyun Kim, San Jose State University

Publication Date

1-21-2023

Document Type

Conference Proceeding

Publication Title

2023 International Conference on Computer, Electrical & Communication Engineering (ICCECE)

DOI

10.1109/ICCECE51049.2023.10085470

Abstract

This work systematically analyzes and addresses fairness issues that arise in machine learning models because of class imbalances in the data, specifically data used for addressing societal problems, and provides unique insights. Using a specific data set, spectral analysis is first performed to present evidence of the fairness issues and to characterize them. Subsequently, a series of class imbalance correction techniques is applied before the data is used to generate various machine learning models. The resulting models are evaluated using multiple metrics, and the results are analyzed to determine the relative merits of each approach. As the experiments described in this paper confirm, not all oversampling techniques help in correcting data-induced model biases. Based on the Kappa statistic, F-1 score, and accuracy as measured by the area under the Receiver Operating Characteristic curve, the Majority Weighted Minority Oversampling Technique (MWMOTE) addresses the fairness issues best among the approaches evaluated and also improves the performance of the models, at least for the dataset under consideration. The experiments also demonstrate that some oversampling techniques can degrade the models in terms of both performance and fairness. The results are interpreted using the evaluation metrics.
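
As an illustration only, and not the authors' code, the workflow the abstract describes (rebalance the classes, then score models with the Kappa statistic, F-1 score, and area under the ROC curve) might be sketched as follows. The sketch assumes binary labels and uses plain random oversampling as a much simpler stand-in for MWMOTE; all function names are hypothetical.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class matches
    the majority count. A simple stand-in, NOT MWMOTE."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def confusion(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(y_true, y_pred, positive=1):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    tp, fp, fn, tn = confusion(y_true, y_pred, positive)
    n = tp + fp + fn + tn
    p_o = (tp + tn) / n
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0


def roc_auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative
    (ties count half). Requires at least one example of each class."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Imbalanced toy labels: a model that always predicts the majority class
# is 80% accurate, yet its F-1 and Kappa are both zero.
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * len(y)
print(f1_score(y, always_majority), cohen_kappa(y, always_majority))
```

The toy labels at the end show why accuracy alone is not enough on imbalanced data, which is one reason to report Kappa and F-1 alongside it. MWMOTE itself weights hard-to-learn minority samples and synthesizes new points within minority clusters, which this sketch does not attempt.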

Comments

© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Department

Applied Data Science

Recommended Citation

Vishnu S. Pendyala and HyungKyun Kim. "Analyzing and Addressing Data-driven Fairness Issues in Machine Learning Models used for Societal Problems" 2023 International Conference on Computer, Electrical & Communication Engineering (ICCECE) (2023). https://doi.org/10.1109/ICCECE51049.2023.10085470
