Description
How can machine learning algorithms answer questions drawn from every corner of the World Wide Web? How are trending hashtags identified among near-infinite microblog posts, and how are unique visitors and other distinct counts computed over near-infinite website traffic? How do blogging websites avoid recommending articles a user has already read? In general, how can we answer complex queries about enormous data streams in real time without storing them in their entirety? The answer often lies in clever approximation algorithms and data "sketches" that capture essential properties in vastly reduced space. The relentless flow of data in modern systems presents significant challenges: these streams are often too large to store and too fast to process exhaustively with traditional methods. This talk introduces key sketching and approximation techniques that generate real-time insights from data streams.
Publication Date
Summer 8-19-2025
Document Type
Presentation
Creative Commons License
This work is licensed under a Creative Commons Attribution-No Derivative Works 4.0 License.
Recommended Citation
Pendyala, Vishnu S., "The Sketches of Infinite Data and Algorithms for Real-Time Data Insights" (2025). Open Educational Resources. 15.
https://scholarworks.sjsu.edu/oer/15
