Abstract
Byzantine fault-tolerant algorithms promise agreement on a correct value, even if a subset of processes can deviate from the algorithm arbitrarily. While these algorithms provide strong guarantees in theory, in practice, protocol bugs and implementation mistakes may still cause them to go wrong. This paper introduces ByzzFuzz, a simple yet effective method for automatically finding errors in implementations of Byzantine fault-tolerant algorithms through randomized testing. ByzzFuzz detects fault-tolerance bugs by injecting randomly generated network and process faults into executions of the implementation under test. To navigate the space of possible process faults, ByzzFuzz introduces small-scope message mutations, which alter the contents of protocol messages by applying small changes to the original message either in value (e.g., by incrementing the round number) or in time (e.g., by repeating a proposal value from a previous message). We find that small-scope mutations, combined with insights from the testing and fuzzing literature, are effective at uncovering protocol logic and implementation bugs in real-world fault-tolerant systems.
We implemented ByzzFuzz and applied it to test the production implementations of two popular blockchain systems, Tendermint and Ripple, and an implementation of the seminal PBFT protocol. ByzzFuzz detected several bugs in the implementation of PBFT and a potential liveness violation in Tendermint, and it materialized two theoretically described vulnerabilities in Ripple’s XRP Ledger Consensus Algorithm. Moreover, we discovered a previously unknown fault-tolerance bug in the production implementation of Ripple, which the developers have confirmed and fixed.
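The two kinds of small-scope mutation mentioned above, a small change in value and a small change in time, can be illustrated with a minimal Python sketch. This is not ByzzFuzz's actual implementation: the `Message` type, its fields, and the function names are hypothetical, chosen only to mirror the two examples given in the abstract (incrementing the round number, and repeating a proposal value from a previous message).

```python
import random
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Message:
    """Hypothetical protocol message; the fields are assumptions for illustration."""
    sender: int
    round_number: int
    value: str


def mutate_in_value(msg: Message) -> Message:
    """Small change in value: increment the round number by one."""
    return replace(msg, round_number=msg.round_number + 1)


def mutate_in_time(msg: Message, history: List[Message], rng: random.Random) -> Message:
    """Small change in time: reuse the proposal value of an earlier message.

    If no earlier message has been observed, the message is returned unchanged.
    """
    if not history:
        return msg
    earlier = rng.choice(history)
    return replace(msg, value=earlier.value)


def small_scope_mutation(msg: Message, history: List[Message], rng: random.Random) -> Message:
    """Pick one of the two mutation kinds at random and apply it."""
    if rng.random() < 0.5:
        return mutate_in_value(msg)
    return mutate_in_time(msg, history, rng)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a test run can be replayed
    seen = [Message(sender=0, round_number=2, value="block-A")]
    original = Message(sender=1, round_number=3, value="block-B")
    print(small_scope_mutation(original, seen, rng))
```

Because each mutation stays close to a message the protocol actually produced, the mutated message remains structurally valid, so it has a chance of exercising the receiver's protocol logic rather than being rejected outright as malformed.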
Index Terms
Randomized Testing of Byzantine Fault Tolerant Algorithms