Abstract
Microblogs are increasingly exploited for predicting the prices and traded volumes of stocks in financial markets. However, it has been demonstrated that much of the content shared on microblogging platforms is created and publicized by bots and spammers. Yet the presence (or lack thereof) and the impact of fake stock microblogs have never been systematically investigated. Here, we study 9M tweets related to stocks of the five main financial markets in the US. By comparing tweets with financial data from Google Finance, we highlight important characteristics of Twitter stock microblogs. More importantly, we uncover a malicious practice, referred to as cashtag piggybacking, perpetrated by coordinated groups of bots and likely aimed at promoting low-value stocks by exploiting the popularity of high-value ones. Among the findings of our study is that as many as 71% of the authors of suspicious financial tweets are classified as bots by a state-of-the-art spambot-detection algorithm. Furthermore, 37% of them were suspended by Twitter a few months after our investigation. Our results call for the adoption of spam- and bot-detection techniques in all studies and applications that exploit user-generated content for predicting the stock market.
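To make the idea of cashtag piggybacking concrete, the following is a minimal sketch of how a tweet co-mentioning high-value and low-value stocks might be flagged. It is not the detection method used in the study: the cashtag pattern, the function names, and the ticker sets (including the made-up ticker `XYZQ`) are assumptions introduced purely for illustration.

```python
import re

# Assumption: a cashtag is "$" followed by 1-6 letters. Real ticker formats vary.
CASHTAG_RE = re.compile(r"\$([A-Za-z]{1,6})\b")


def extract_cashtags(text):
    """Return the set of upper-cased tickers mentioned as cashtags in a tweet."""
    return {ticker.upper() for ticker in CASHTAG_RE.findall(text)}


def looks_like_piggybacking(text, high_value, low_value):
    """Flag a tweet that mentions at least one high-value and one low-value ticker."""
    tags = extract_cashtags(text)
    return bool(tags & high_value) and bool(tags & low_value)


# Hypothetical ticker sets, for illustration only.
HIGH_VALUE = {"AAPL", "GOOG"}
LOW_VALUE = {"XYZQ"}

if __name__ == "__main__":
    for tweet in ("$AAPL $GOOG $XYZQ to the moon", "$AAPL earnings call today"):
        print(tweet, "->", looks_like_piggybacking(tweet, HIGH_VALUE, LOW_VALUE))
```

A co-mention check like this would only surface candidate tweets. Deciding whether their authors are bots would still require a separate bot-detection step, as the abstract suggests.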