Learning-based controlled concurrency testing

Published: 13 November 2020

Abstract

Concurrency bugs are notoriously hard to detect and reproduce. Controlled concurrency testing (CCT) techniques aim to offer a solution, where a scheduler explores the space of possible interleavings of a concurrent program looking for bugs. Since the set of possible interleavings is typically very large, these schedulers employ heuristics that prioritize the search to “interesting” subspaces. However, current heuristics are typically tuned to specific bug patterns, which limits their effectiveness in practice.
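
To make the idea of a controlled scheduler concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the paper: the toy program (two threads performing a non-atomic increment) and the names `worker`, `run_schedule`, and `random_search` are all hypothetical. The scheduler decides which thread takes the next step, so every interleaving is explicit and can be replayed; the search heuristic here is a plain random walk.

```python
import random

def worker(shared):
    """Toy thread: a non-atomic increment with one scheduling point."""
    local = shared["counter"]      # step 1: read the shared counter
    yield                          # scheduling point: another thread may run here
    shared["counter"] = local + 1  # step 2: write back (a lost update is possible)

def run_schedule(choose, num_threads=2):
    """Run the toy program once; `choose` picks the next thread to step."""
    shared = {"counter": 0}
    threads = {tid: worker(shared) for tid in range(num_threads)}
    trace = []
    while threads:
        tid = choose(sorted(threads))  # only unfinished threads are enabled
        trace.append(tid)
        try:
            next(threads[tid])         # advance the chosen thread by one step
        except StopIteration:
            del threads[tid]           # the thread has finished
    return shared["counter"], trace

def random_search(max_iterations=1000, seed=0):
    """Explore interleavings at random until the counter ends up wrong."""
    rng = random.Random(seed)
    for iteration in range(1, max_iterations + 1):
        final, trace = run_schedule(rng.choice)
        if final != 2:                 # bug: one increment was lost
            return iteration, trace
    return None
```

Because the scheduler owns every scheduling decision, a failing trace returned by `random_search` can be fed back into `run_schedule` to reproduce the same interleaving deterministically.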

In this paper, we present QL, a learning-based CCT framework where the likelihood of an action being selected by the scheduler is influenced by earlier explorations. We leverage the classical Q-learning algorithm to explore the space of possible interleavings, allowing the exploration to adapt to the program under test, unlike previous techniques. We have implemented and evaluated QL on a set of microbenchmarks, complex protocols, as well as production cloud services. In our experiments, we found QL to consistently outperform the state-of-the-art in CCT.
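
The following Python sketch shows one way a scheduler could let earlier explorations influence later choices using the classical tabular Q-learning update. It is a hedged illustration, not the paper's implementation: the toy program, the state abstraction (shared counter plus per-thread step counts), the constant negative reward, the softmax action selection, the values of `ALPHA`, `GAMMA`, and `REWARD`, and the names `QScheduler`, `run_once`, and `explore` are all assumptions made for this example.

```python
import math
import random
from collections import defaultdict

# Assumed hyperparameters, chosen only for illustration.
ALPHA, GAMMA, REWARD = 0.3, 0.7, -1.0

def worker(shared):
    """Toy thread: a non-atomic increment with one scheduling point."""
    local = shared["counter"]
    yield
    shared["counter"] = local + 1

class QScheduler:
    def __init__(self, seed=0):
        self.q = defaultdict(float)    # (state, action) -> Q-value, kept across runs
        self.rng = random.Random(seed)

    def choose(self, state, enabled):
        # Softmax selection: actions with higher Q-values are more likely.
        weights = [math.exp(self.q[(state, a)]) for a in enabled]
        return self.rng.choices(enabled, weights=weights)[0]

    def update(self, state, action, next_state, next_enabled):
        # Classical Q-learning update with a constant (assumed) negative reward,
        # which makes already-taken choices less attractive in later runs.
        best_next = max((self.q[(next_state, a)] for a in next_enabled), default=0.0)
        old = self.q[(state, action)]
        self.q[(state, action)] = (1 - ALPHA) * old + ALPHA * (REWARD + GAMMA * best_next)

def run_once(scheduler, num_threads=2):
    """Execute the toy program once under the learning scheduler."""
    shared = {"counter": 0}
    threads = {tid: worker(shared) for tid in range(num_threads)}
    steps = {tid: 0 for tid in range(num_threads)}

    def snapshot():
        # Hypothetical state abstraction: shared value plus per-thread progress.
        return (shared["counter"], tuple(sorted(steps.items())))

    state = snapshot()
    while threads:
        tid = scheduler.choose(state, sorted(threads))
        try:
            next(threads[tid])
        except StopIteration:
            del threads[tid]
        steps[tid] += 1
        next_state = snapshot()
        scheduler.update(state, tid, next_state, sorted(threads))
        state = next_state
    return shared["counter"]

def explore(iterations=50, seed=0):
    """Reuse one Q-table across iterations so the search adapts to the program."""
    scheduler = QScheduler(seed)
    outcomes = [run_once(scheduler) for _ in range(iterations)]
    return scheduler, outcomes
```

In this sketch the Q-table persists across calls to `run_once`, so each scheduling choice made in one iteration lowers the chance of repeating it from the same abstract state in the next, which nudges the exploration toward interleavings it has not yet tried.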

Supplemental Material

Auxiliary Presentation Video

This video describes our work on leveraging learning-based techniques for controlled concurrency testing. Our paper is part of the Research Track at OOPSLA 2020.
