"Time-Based Ensembles for Prediction of Rare Events in News Streams" by Nuno Moniz, Luís Torgo et al.

Title

Time-Based Ensembles for Prediction of Rare Events in News Streams

Authors

Nuno Moniz, University of Porto
Luís Torgo, University of Porto
Magdalini Eirinaki, San Jose State University

Document Type

Article

Publication Date

January 2016

Publication Title

Proceedings of the 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)

First Page

1066

Last Page

1073

DOI

10.1109/ICDMW.2016.0154

Abstract

Thousands of news items are published every day, reporting events worldwide. Most of these items attain a low level of popularity, and only a small set of events become highly popular on social media platforms. Predicting rare cases of highly popular news is not a trivial task due to shortcomings of standard learning approaches and evaluation metrics. So far, the standard task of predicting the popularity of news items has been tackled by one of two distinct strategies related to the publication time of news. The first strategy, a priori, focuses on predicting the popularity of news items upon their publication, when related social feedback is unavailable. The second strategy, a posteriori, focuses on predicting the popularity of news items using related social feedback. However, both strategies present shortcomings related to data availability and time of prediction. To overcome such shortcomings, we propose a hybrid strategy of time-based ensembles using models from both strategies. Using news data from Google News and popularity data from Twitter, we show that the proposed ensembles significantly improve the early and accurate prediction of rare cases of highly popular news.

Comments

This is the Accepted Version of an article published in the Proceedings of the 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW). The Version of Record is available online at https://doi.org/10.1109/ICDMW.2016.0154.
© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Recommended Citation

Nuno Moniz, Luís Torgo, and Magdalini Eirinaki. "Time-Based Ensembles for Prediction of Rare Events in News Streams" Proceedings of the 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) (2016): 1066-1073. https://doi.org/10.1109/ICDMW.2016.0154

Included in

Computer Engineering Commons, Computer Sciences Commons