Publication Date
Spring 2025
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Katerina Potika
Second Advisor
Thomas Austin
Third Advisor
Fabio di Troia
Keywords
natural language techniques, nlp, graph classification, suicidal ideation, reddit, gcn, llm, large language model, dataset, SuicideWatch, deep learning, lstm, word2vec
Abstract
Suicide is the fourth leading cause of death among people aged 15-29, and more than 720,000 people die by suicide every year. During the COVID-19 pandemic, we saw an increase in people seeking mental health support on anonymous forums like Reddit. These forums allow people to express suicidal ideation without judgment and give them a support structure that not everyone has. The aim of this project is to detect suicidal ideation in Reddit posts. We propose SIRGEL (Suicidal Ideation on Reddit using Graph Embeddings and LLMs), a dual-pipeline approach that combines large language model (LLM)-based annotation with both traditional text classification and graph neural network (GNN)-based modeling. The first step is to create a dataset of Reddit posts from related subreddits. The second step is to annotate the posts as suicidal or not suicidal using LLMs. We detail the creation of this new dataset and benchmark it using baseline machine learning models. We then model the detection problem using two approaches: traditional text classification and graph representation learning. Additionally, we run experiments with different types of graph neural networks, applying optimizations to improve their results and efficiency.
Recommended Citation
Dhanjal, Ikbal Singh Gurdev Singh, "Suicidal Ideation Detection on Reddit using LLM-Annotated Data and Graph Neural Networks" (2025). Master's Projects. 1548.
DOI: https://doi.org/10.31979/etd.m3fm-p9ex
https://scholarworks.sjsu.edu/etd_projects/1548