Publication Date
Spring 2024
Degree Type
Thesis - Campus Access Only
Degree Name
Master of Science (MS)
Department
Computer Engineering
Advisor
Stas Tiomkin; Jorjeta Jetcheva; Carlos Rojas
Abstract
Denoising diffusion probabilistic models have emerged as a powerful class of density modeling techniques. Founded on non-equilibrium thermodynamics, they are capable of modeling complex data distributions and generating novel samples. Their success in high-quality sampling has prompted significant research into sampling efficiency, improved estimation, and model control. Information theory provides tools for exploring the generative dynamics of diffusion models. Specifically, we explore two domains: information-imbalanced data sets and score functions for recommendation systems. In the first, we observe an interesting phenomenon in which certain classes of training data are more likely to be reconstructed than others. We propose information-theoretic reasoning for why this phenomenon emerges across data sets and suggest potential solutions to counteract it. We then apply denoising diffusion probabilistic models to recommender systems. We introduce a Score-based Diffusion Recommender Module (SDRM) that generates synthetic data for recommendation systems, accurately capturing the sparse nature of the training data while respecting user privacy. We show that our generated samples are capable of fully replacing or augmenting the initial training data. They improve recommender model performance by an average of 4.5% in both Recall@k and NDCG@k, and they retain user privacy by achieving 99% dissimilarity.
Recommended Citation
Mello, Paul, "An Exploration of Information Processing In Diffusion Models" (2024). Master's Theses. 5517.
DOI: https://doi.org/10.31979/etd.gm7a-6rvy
https://scholarworks.sjsu.edu/etd_theses/5517