Using the Crowd to Prevent Harmful AI Behavior

Published: 15 October 2020

Abstract

To prevent harmful AI behavior, people need to specify constraints that forbid undesirable actions. Unfortunately, this is a complex task, since writing rules that distinguish harmful from non-harmful actions tends to be quite difficult in real-world situations. As a result, such decisions have historically been made by a small group of powerful AI companies and developers, with limited community input. In this paper, we study how to enable a crowd of people who are not AI experts to work together to communicate high-quality, reliable constraints to AI systems. We first focus on understanding how humans reason about temporal dynamics in the context of AI behavior, finding through experiments on a novel game-based testbed that participants tend to adopt a long-term notion of harm, even in uncertain situations that do not affect them directly. Building on this insight, we explore task design for long-term constraint specification, developing new filtering approaches and new methods of promoting user reflection. Next, we develop a novel rule-based interface that allows people to craft rules in an accessible fashion without programming knowledge. We test our approaches on a real-world AI problem in the domain of education, and find that our new filtering mechanisms and interfaces significantly improve constraint quality and human efficiency. We also demonstrate how these systems can be applied to other real-world AI problems (e.g., in social networks).
