
Publication Date

Spring 2024

Degree Type

Thesis - Campus Access Only

Degree Name

Master of Science (MS)

Department

Computer Engineering

Advisor

Stas Tiomkin; Jorjeta Jetcheva; Carlos Rojas

Abstract

Denoising diffusion probabilistic models have emerged as a powerful class of density modeling techniques. Rooted in non-equilibrium thermodynamics, they can model complex data distributions and generate novel samples. Their success in high-quality sampling has driven significant research into sampling efficiency, improved estimation, and model control. Information theory provides tools for exploring the generative dynamics of diffusion models. Specifically, we explore two domains: information-imbalanced data sets and score functions for recommendation systems. Through this exploration, we observe a phenomenon in which certain classes of training data are more likely to be reconstructed than others. We offer information-theoretic reasoning for why this phenomenon emerges across data sets and propose potential solutions to counteract it. We then apply denoising diffusion probabilistic models to recommender systems. We introduce a Score-based Diffusion Recommender Module (SDRM) that generates synthetic data for recommendation systems, accurately capturing the sparse nature of the training data while respecting user privacy. We show that our generated samples can fully replace or augment the initial training data, improving recommender model performance by an average of 4.5% in both Recall@k and NDCG@k while preserving user privacy by achieving 99% dissimilarity.

Available for download on Friday, August 15, 2025
