Publication Date
Fall 2023
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Chris Pollett
Second Advisor
Robert Chun
Third Advisor
William Andreopoulos
Keywords
Survival Analysis, Time-Series Data, Benchmarking Suite, Time-Series Databases, NoSQL Databases
Abstract
Survival analysis data is crucial for predicting future events and making informed decisions. Storing this data in databases lets researchers and analysts access and analyze it easily, which supports more accurate predictions and better decision-making, and demand for such storage is growing. While benchmarking tools are available to aid in selecting an appropriate database, there is currently no benchmarking suite designed explicitly for survival analysis data. In this report, I present the development and analysis of such a suite. The suite reports performance metrics for both read and write operations and has been applied to several popular databases, including QuestDB, TimescaleDB, Cassandra, and MongoDB. Specialized survival analysis topics, such as the log-rank test, Cox proportional hazards, and Kaplan-Meier estimation, were given significant attention. Using the suite, I compared NoSQL databases with time-series databases for storing and retrieving survival analysis data. The findings show that the NoSQL databases do not perform as well as the time-series databases: although NoSQL databases are generally useful, certain survival analysis queries are unresponsive on them. TimescaleDB performs exceptionally well across various queries, indicating its suitability for time-dependent data scenarios. The comparative analysis highlights the importance of selecting a database tailored to the specific needs of survival analysis data and indicates that specialized time-series databases have an advantage in this area.
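The abstract names Kaplan-Meier among the survival analysis topics the suite covers. As an illustration of what such a workload computes over rows of (duration, event) records, here is a minimal sketch of the standard Kaplan-Meier product-limit estimator. It is not taken from the project: the function name `kaplan_meier` and its input layout are assumptions made for this example, and the project's own queries may be structured quite differently.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed time for each subject
    events:    1 if the event occurred at that time, 0 if the subject was censored
    (Hypothetical helper for illustration; not the project's API.)
    """
    # Number of observed events at each time point.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Every subject leaves the risk set at its own time, event or censored.
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: S(t) = S(t-) * (1 - d_t / n_t)
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve
```

For example, `kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])` yields survival estimates of about 0.8, 0.6, and 0.3 at times 1, 2, and 3. The subject censored at time 2 still counts as at risk for the event at time 2, and the subject censored at time 4 adds no step to the curve.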
Recommended Citation
Patel, Aarsh, "Database Benchmarking Suite for Survival Analysis Data" (2023). Master's Projects. 1319.
DOI: https://doi.org/10.31979/etd.fvh4-xkxs
https://scholarworks.sjsu.edu/etd_projects/1319