Emerging topic detection on Twitter based on temporal and social terms evaluation
ABSTRACT
Twitter is a user-generated content system that allows its users to share short text messages, called tweets, for a variety of purposes, including daily conversations, URL sharing and news. Considering its worldwide network of users of every age and social condition, it acts as a low-level news flash portal whose principal advantage is its impressively short response time.
In this paper we recognize this primary role of Twitter and propose a novel topic detection technique that retrieves, in real time, the most emergent topics expressed by the community. First, we extract the contents (sets of terms) of the tweets and model the term life cycle according to a novel aging theory intended to mine the emerging terms. A term is defined as emerging if it occurs frequently in the specified time interval and was relatively rare in the past. Moreover, considering that the importance of a piece of content also depends on its source, we analyze the social relationships in the network with the well-known PageRank algorithm in order to determine the authority of the users. Finally, we leverage a navigable topic graph that connects the emerging terms with other semantically related keywords, allowing the detection of emerging topics under user-specified time constraints. We provide different case studies that show the validity of the proposed approach.
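The two scoring ideas in the abstract (user authority from PageRank, and terms that are frequent now but were rare before) can be illustrated with a minimal Python sketch. This is not the paper's aging formula, which the abstract does not give. As an assumption, a term's weight in an interval is the summed authority of the users who tweeted it, and its emergence score is the current weight minus the mean weight over earlier intervals. The function names, the whitespace tokenizer, the damping factor, and the sample data are all hypothetical, and the topic graph step is not covered.

```python
from collections import defaultdict


def pagerank(follows, damping=0.85, iterations=50):
    """Power-iteration PageRank on a follower graph.

    `follows` maps each user to the users they follow; an edge u -> v
    passes part of u's authority to v.
    """
    users = set(follows) | {v for targets in follows.values() for v in targets}
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = follows.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # A user who follows nobody spreads rank uniformly.
                share = damping * rank[u] / n
                for v in users:
                    new_rank[v] += share
        rank = new_rank
    return rank


def term_weights(tweets, authority):
    """Authority-weighted term counts for one interval of (user, text) pairs."""
    weights = defaultdict(float)
    for user, text in tweets:
        # Count each term once per tweet (assumed tokenizer: whitespace split).
        for term in set(text.lower().split()):
            weights[term] += authority.get(user, 0.0)
    return weights


def emerging_terms(intervals, authority, top_k=3):
    """Rank terms of the latest interval by current weight minus past mean."""
    history = [term_weights(tweets, authority) for tweets in intervals]
    current, past = history[-1], history[:-1]
    scores = {}
    for term, weight in current.items():
        past_mean = sum(p.get(term, 0.0) for p in past) / max(len(past), 1)
        scores[term] = weight - past_mean
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical data: three users and two time intervals.
follows = {"a": ["c"], "b": ["c"], "c": ["a"]}
authority = pagerank(follows)
intervals = [
    [("a", "coffee time"), ("b", "coffee morning")],
    [("c", "earthquake now"), ("a", "earthquake coffee")],
]
top = emerging_terms(intervals, authority, top_k=1)
```

In this toy data, "coffee" appears in both intervals and is penalized by its past weight, while "earthquake" is new and is tweeted by the two most authoritative users, so it ranks first. A ratio-based or decay-based score could be substituted for the difference without changing the overall structure.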
AUTHORS
Mario Cataldi
Luigi Di Caro
Claudio Schifanella
PUBLICATION
| Title | MDMKDD '10 Proceedings of the Tenth International Workshop on Multimedia Data Mining |
| Article No. | 4 |
| Publication Date | 2010-07-25 |
| Sponsors | SIGKDD (ACM Special Interest Group on Knowledge Discovery in Data); SIGMOD (ACM Special Interest Group on Management of Data) |
| Publisher | ACM, New York, NY, USA ©2010 |
| ISBN | 978-1-4503-0220-3 |
| DOI | 10.1145/1814245.1814249 |
| Conference | KDD: Knowledge Discovery and Data Mining |
| Overall Acceptance Rate | 7 of 12 submissions, 58% |
Table of Contents

Large scale fingerprint mining
Aaron K. Baughman, Stefan Van Der Stockt, Arnold Greenland
Article No.: 1
doi>10.1145/1814245.1814246
Support Vector Machines (SVM) project feature vectors into a linear or non-linear state space using kernel function(s) and attempts to maximize the margin between classes. The projection of feature vectors into a high dimensional hyperspace structure ...

Relevance feature mapping for content-based image retrieval
Guang-Tong Zhou, Kai Ming Ting, Fei Tony Liu, Yilong Yin
Article No.: 2
doi>10.1145/1814245.1814247
This paper presents a ranking framework for content-based image retrieval using relevance feature mapping. Each relevance feature measures the relevance of an image to some profile underlying the image database. The framework is a two-stage process. ...

Bag of visual words revisited: an exploratory study on robust image retrieval exploiting fuzzy codebooks
Marian Kogler, Mathias Lux
Article No.: 3
doi>10.1145/1814245.1814248
Visual information retrieval systems have gained importance due to the increasing amount of available digital multimedia data. Local features employing a bag of words approach from text retrieval have outperformed global features and have enhanced retrieval ...

Emerging topic detection on Twitter based on temporal and social terms evaluation
Mario Cataldi, Luigi Di Caro, Claudio Schifanella
Article No.: 4
doi>10.1145/1814245.1814249

Large scale image clustering with support vector machine based on visual keywords
Tian-Tian Chang, Horace H. S. Ip, Jun Feng
Article No.: 5
doi>10.1145/1814245.1814250
Support Vector Machine Clustering (SVMC) is a model-based clustering method designed primarily for solving 2-class clustering problems. In this paper, we generalize the SVMC method to multi-class clustering via two different strategies, namely One-Against-All ...

Example-based event retrieval in video archive using rough set theory and video ontology
Kimiaki Shirahama, Kuniaki Uehara
Article No.: 6
doi>10.1145/1814245.1814251
In this paper, we develop a method for retrieving events of interest in a video archive. To this end, we address the following two issues. First, due to camera techniques, locations and so on, shots of an event contain significantly different features. ...

DisIClass: discriminative frequent pattern-based image classification
Sangkyum Kim, Xin Jin, Jiawei Han
Article No.: 7
doi>10.1145/1814245.1814252
Owing to the rapid mounting of massive image data, image classification has attracted lots of research efforts. Several diverse research disciplines have been confluent on this important theme, looking for more powerful solutions. In this paper, we propose ...

Measuring performance of web image context extraction
Sadet Alcic, Stefan Conrad
Article No.: 8
doi>10.1145/1814245.1814253
Images on the Web appear with textual contents providing meaningful information to their semantics. Methods that automatically determine and extract the Web image context from an HTML document are widely used in different applications. However, the performance ...

Web-scale computer vision using MapReduce for multimedia data mining
Brandyn White, Tom Yeh, Jimmy Lin, Larry Davis
Article No.: 9
doi>10.1145/1814245.1814254
This work explores computer vision applications of the MapReduce framework that are relevant to the data mining community. An overview of MapReduce and common design patterns are provided for those with limited MapReduce background. We discuss both the ...

Approximate variable-length time series motif discovery using grammar inference
Yuan Li, Jessica Lin
Article No.: 10
doi>10.1145/1814245.1814255
The problem of identifying frequently occurring patterns, or motifs, in time series data has received a lot of attention in the past few years. Most existing work on finding time series motifs require that the length of the patterns be known in advance. ...