Poster: Cloudsweeper: Leveraging Large Language Models to Personalize Sensitive Archive Search

Victor Escuerdo, University of Illinois at Chicago
Sergio Talavera, San Jose State University
Gautam Santhanu Thampy, San Jose State University
Ivan Torres, University of Illinois at Chicago
Daniel Vega Lojo, San Jose State University
Chris Kanich, University of Illinois at Chicago
Magdalini Eirinaki, San Jose State University

Abstract

As cyber threats continue to evolve, overlooked or neglected files stored in cloud services can pose significant risks to personal privacy and data security. We present Cloudsweeper, a system that aims to improve cloud storage security by giving users tools to identify and manage sensitive or unwanted files. Cloudsweeper combines Large Language Models (LLMs) with a Retrieval-Augmented Generation (RAG) architecture to provide personalized, privacy-focused search over a user's archive. The approach is intended to balance user control and privacy in the management of cloud storage archives.
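
The abstract does not describe Cloudsweeper's implementation, so the following is only a minimal sketch of the general RAG pattern it names: retrieve the archive entries most relevant to a user's query, then place them in the context given to an LLM. Everything here is an assumption made for illustration: the function names, the toy archive, the prompt wording, and the bag-of-words cosine similarity, which stands in for whatever embedding model a real system would use. The LLM call itself is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical toy archive: file name -> extracted text. Not Cloudsweeper data.
ARCHIVE = {
    "tax_2019.pdf": "Federal tax return 2019 with social security number and bank account",
    "vacation.jpg.txt": "Photo caption: beach vacation with family in summer",
    "passport_scan.txt": "Scanned passport page with passport number and date of birth",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, archive, k=2):
    """Return up to k file names ranked by similarity to the query.

    Files with zero overlap are dropped. Word counts are a stand-in
    (an assumption) for a learned embedding model.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), name) for name, text in archive.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by name
    return [name for score, name in scored[:k] if score > 0]


def build_prompt(query, archive, k=2):
    """Assemble the retrieved snippets and the query into a prompt string.

    The prompt wording is hypothetical; sending it to an LLM is left out.
    """
    hits = retrieve(query, archive, k)
    context = "\n".join(f"[{name}] {archive[name]}" for name in hits)
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "List the files that look sensitive and explain why."
    )


if __name__ == "__main__":
    print(build_prompt("documents containing my social security number", ARCHIVE))
```

Because only the retrieved snippets reach the model, a design along these lines limits how much of the archive is exposed in any single query, which is one way the retrieval step can support the privacy goal stated above.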