Detection of Political Manipulation in Online Communities through Measures of Effort and Collaboration

Published: 12 June 2015
Abstract

Online social media allow users to interact with one another by sharing opinions, and these opinions have a critical impact on the way readers think and behave. Accordingly, an increasing number of manipulators deliberately spread messages to influence the public, often in an organized manner. In particular, political manipulation (manipulation of opponents to win political advantage) can result in serious consequences: antigovernment riots can break out, leading to candidates’ defeat in an election. A few approaches have been proposed to detect such manipulation based on the level of social interaction (i.e., manipulators actively post opinions but infrequently befriend and reply to other users). However, several studies have shown that the interactions can be forged at a low cost and thus may not be effective measures of manipulation.

To go one step further, we collect a dataset of real, large-scale political manipulation, consisting of opinions posted on Internet forums. These opinions are labeled as written by either manipulators or nonmanipulators. Using this collection, we demonstrate that manipulators inevitably work hard, in teams, to quickly influence a large audience. A high level of collaborative effort is therefore a strong indicator of manipulation. For example, a group of manipulators may jointly post numerous opinions with a consistent theme and selectively recommend the same, well-organized opinion to promote its rank. We show that the effort measures, when combined with a supervised learning algorithm, successfully identify more than 95% of the manipulators. We believe that the proposed method will help system administrators to accurately detect manipulators in disguise, significantly decreasing the intensity of manipulation.
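The idea of users selectively recommending the same opinions can be illustrated with a minimal sketch. This is not the paper's actual measure: the function name `co_recommendation_scores`, the `(user, opinion_id)` input format, and the use of Jaccard similarity are all assumptions made for illustration. The sketch scores each user by how closely their set of recommended opinions overlaps with that of any other user.

```python
from collections import defaultdict
from itertools import combinations


def co_recommendation_scores(recommendations):
    """Return, for each user, the highest Jaccard similarity between the
    set of opinions they recommended and that of any other user.

    `recommendations` is an iterable of (user, opinion_id) pairs.
    Hypothetical input format; the paper's dataset layout is not given here.
    """
    # Collect the set of opinions each user recommended.
    by_user = defaultdict(set)
    for user, opinion_id in recommendations:
        by_user[user].add(opinion_id)

    # Start every user at 0.0, then raise the score for each overlapping pair.
    scores = {user: 0.0 for user in by_user}
    for a, b in combinations(by_user, 2):
        shared = len(by_user[a] & by_user[b])
        total = len(by_user[a] | by_user[b])  # never 0: each user has >= 1 opinion
        similarity = shared / total
        scores[a] = max(scores[a], similarity)
        scores[b] = max(scores[b], similarity)
    return scores


if __name__ == "__main__":
    # Toy data (invented): u1 and u2 recommend exactly the same opinions.
    toy = [
        ("u1", "o1"), ("u1", "o2"),
        ("u2", "o1"), ("u2", "o2"),
        ("u3", "o1"), ("u3", "o3"),
    ]
    for user, score in co_recommendation_scores(toy).items():
        print(user, round(score, 2))
```

A score like this would be only one candidate feature. The abstract states that the effort measures are combined with a supervised learning algorithm, so a value of this kind would be an input to a classifier rather than a decision rule on its own.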

      Reviews

      Bálint Molnár

      Online social media is a popular research topic for both computer and social scientists as it can be investigated with robust analytic tools and reflects a certain slice of the mass behavior of society. It is interesting to question whether the actors of politics try to exploit the opportunities provided by online social media. The manipulators attempt to influence the opinions of readers participating in political debates within social networks. The question is how the manipulators can be detected using the published material. The paper develops a method for discerning collaborative efforts in a political discussion. For this purpose, 64 attributes of published texts are selected, and are later reduced to a smaller subset to distinguish between manipulators and non-manipulators in a more efficient way. Using machine learning and data mining technologies, a detection method is devised and implemented. As a proof of concept, a large data set from Korean social media is used for validating the proposed method. The paper meticulously describes the applied numerical parameters to compute statistical values that can be interpreted for classification purposes. The services of an open-source data mining package (Weka) are used for classifying the actors of political discussions. A comparison with existing methods is carried out to analyze the compliance of the proposed method for the claimed purpose. The paper is worth reading for researchers working on social media networks or investigating political manipulation.

      Online Computing Reviews Service
