Publication Date
Spring 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
William Andreopoulos
Second Advisor
Genya Ishigaki
Third Advisor
Elton DSouza
Keywords
Large Language Models, Environmental Health and Safety, LLaMA, Mistral, Falcon, Parameter-Efficient Fine-Tuning, QLoRA, Supervised Fine-Tuning, Safety Compliance, Risk Assessment
Abstract
This study aims to simplify Environmental Health and Safety (EHS) work by leveraging Large Language Models (LLMs). We fine-tune three LLMs (LLaMA, Mistral, and Falcon) using supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT) techniques such as QLoRA, to address domain-specific needs such as safety compliance, incident reporting, and knowledge dissemination. Each model is fine-tuned on a custom dataset compiled from various regulatory agencies, supplemented by targeted web scraping and manually collected questionnaires, so that the models reflect the latest regulations and guidelines. We then compare the fine-tuned models to identify the most effective model and fine-tuning technique for specific EHS applications. Our goal is to integrate LLMs into EHS practice and demonstrate a practical way to improve and automate EHS queries and reports, fostering safer and more compliant workplaces.
Recommended Citation
Ansari, Mohammad Adil, "Enhancing Environmental Health and Safety: Fine-Tuning Large Language Models for Domain-Specific Applications" (2024). Master's Projects. 1362.
DOI: https://doi.org/10.31979/etd.wgkk-f7cb
https://scholarworks.sjsu.edu/etd_projects/1362