Publication Date
Spring 2025
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
William Andreopoulos
Second Advisor
Thomas Austin
Third Advisor
Katerina Potika
Keywords
Biomedical NLP, Relation Extraction, Named Entity Recognition, Knowledge Graph, Large Language Models, PubMed
Abstract
The rapid release of biomedical literature makes it challenging to link information. This thesis aims to extract data from expanding datasets to identify and form meaningful relationships between biomedical entities. Large language models (LLMs) make it possible to learn from such text at a rapid pace, but creating LLMs from scratch is impractical. This thesis therefore aims to collect a small dataset of biomedical papers and use it to train LLMs to extract entities from the text and learn the relationships between these entities. The experiment will be divided into two stages and will use the EU-ADR and ChemProt datasets. Starting with named entity recognition (NER), the cleaned datasets will be passed through four LLMs. Next, to determine the best results, the data will be used to train relation extraction (RE) models, followed by graphs and visualizations of the results.
Recommended Citation
Tran, Brian, "Extraction of a Knowledge Graph of Biomedical Relationships" (2025). Master's Projects. 1537.
DOI: https://doi.org/10.31979/etd.jug5-58vb
https://scholarworks.sjsu.edu/etd_projects/1537