Abstract
It is often difficult to separate the highly capable “experts” from the average worker in crowdsourced systems. This is especially true for challenging application domains that require extensive domain knowledge. Stock analysis is one such domain, where even highly paid, well-educated domain experts are prone to mistakes. In such a challenging problem space, the “wisdom of the crowds” property that many crowdsourced applications rely on may not hold.
In this article, we study the problem of evaluating and identifying experts in the context of SeekingAlpha and StockTwits, two crowdsourced investment services that have recently begun to encroach on a space dominated for decades by large investment banks. We seek to understand the quality and impact of content on collaborative investment platforms by empirically analyzing complete datasets of SeekingAlpha articles (9 years) and StockTwits messages (4 years). We develop sentiment analysis tools and correlate contributed content with the historical performance of the relevant stocks. While SeekingAlpha articles and StockTwits messages correlate only minimally with stock performance in aggregate, a subset of experts contribute more valuable (predictive) content. We show that these authors can be easily identified through user interactions, and that investments based on their analysis significantly outperform broader markets. This shows that even in challenging application domains, there is a secondary or indirect wisdom of the crowds.
Finally, we conduct a user survey that sheds light on users’ views of SeekingAlpha content and stock manipulation. We also attempt to identify potential stock manipulation by detecting authors who control multiple identities.
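As a rough illustration of what correlating contributed content with stock performance can look like, the sketch below computes a Pearson correlation coefficient between per-article sentiment labels and the subsequent returns of the stocks discussed. It is a minimal sketch, not the authors' pipeline: the choice of Pearson correlation, the +1/-1 sentiment encoding, the function name `pearson`, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (not from the paper):
# sentiment is +1 for a bullish article and -1 for a bearish one;
# returns are the stock's fractional price change over some later window.
sentiment = [1, 1, -1, 1, -1, -1]
returns = [0.02, -0.01, -0.03, 0.01, 0.02, -0.02]

r = pearson(sentiment, returns)
print(f"sentiment/return correlation: {r:.3f}")
```

A coefficient near zero across all authors, alongside a clearly positive one for a selected subset of authors, would match the pattern the abstract describes. The real analysis also depends on choices this sketch leaves out, such as the return window and how sentiment is extracted from text.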
Index Terms
Value and Misinformation in Collaborative Investing Platforms