Abstract
DanMu, an emerging type of user-generated comment, has become increasingly popular in recent years, and many online video platforms, such as Tudou.com, now provide a DanMu function. Unlike traditional online reviews, such as those at Youtube.com, which sit outside the video, a DanMu is a scrolling marquee comment that is overlaid directly on the video and synchronized to a specific playback time. Such comments are displayed as streams of moving subtitles on the video screen. Viewers can easily write DanMus while watching a video, and each written DanMu is immediately overlaid onto the video and displayed to its writer and to other viewers. DanMu systems thus enable users to communicate with each other much more directly, creating a real-time sharing experience. Although DanMu has several unique features and has had a great impact on online video systems, to the best of our knowledge no prior work has provided a comprehensive study of it. In this article, as a pilot study, we analyze the unique characteristics of DanMu from various perspectives. Specifically, we first illustrate some unique distributions of DanMus by comparing them with traditional reviews (TReviews) that we collected from a real DanMu-enabled online video system. Second, we discover two interesting patterns in DanMu data, a herding effect and multiple-burst phenomena, that are significantly different from those in TReviews and reveal important insights about the growth of DanMus on a video. To explore the antecedents of both the herding effect and the multiple-burst phenomena, we further propose to detect leading DanMus within bursts, because those leading DanMus contribute the most to both patterns. We propose a framework for detecting leading DanMus that effectively combines the multiple factors contributing to them.
Based on the identified characteristics of DanMu, we then propose to predict the distribution of future DanMus (i.e., the growth of DanMus), which is important for many DanMu-enabled online video systems; for example, the predicted DanMu distribution could serve as an indicator of video popularity. This prediction task has two aspects: one is to predict which videos future DanMus will be posted for, and the other is to predict which segments of a video future DanMus will be posted on. We develop two models to solve these problems. Finally, extensive experiments are conducted on a real-world dataset to validate all methods developed in this article.
Index Terms
Exploring the Emerging Type of Comment for Online Videos: DanMu