DOI: 10.1145/3064176.3064213
research-article

Hybrids on Steroids: SGX-Based High Performance BFT

Online: 23 April 2017

ABSTRACT

With the advent of trusted execution environments provided by recent general-purpose processors, a class of replication protocols has become more attractive than ever: protocols based on a hybrid fault model tolerate arbitrary faults yet cost significantly less than their traditional Byzantine relatives, because they employ a small subsystem that is trusted to fail only by crashing. Unfortunately, existing proposals come at a price of their own. We are not aware of any hybrid protocol that is backed by a comprehensive formal specification, which complicates reasoning about correctness and implications. Moreover, current protocols of that class have to be performed largely sequentially. Hence, they are not well prepared for the very multi-core processors that bring this fault model to a broad audience. In this paper, we present Hybster, a new hybrid state-machine replication protocol that is highly parallelizable and formally specified. With over 1 million operations per second using only four cores, the evaluation of our Intel SGX-based prototype implementation shows that Hybster makes hybrid state-machine replication a viable option even for today's very demanding critical services.
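To make the hybrid fault model more concrete, the following is a minimal sketch of the general idea, not Hybster's actual design. It assumes the replica bounds commonly cited for this protocol family (3f + 1 replicas for traditional Byzantine fault tolerance, 2f + 1 for hybrid protocols built on a trusted component), neither of which is stated in the abstract. The `TrustedCounter` class, its methods, and the shared-key setup are hypothetical names introduced only for illustration. In a real system this logic would run inside a trusted execution environment such as an SGX enclave.

```python
import hashlib
import hmac
from typing import Tuple


def min_replicas(f: int, hybrid: bool) -> int:
    """Replicas commonly cited as necessary to tolerate f arbitrary faults.

    Assumption (not from the abstract): 3f + 1 for traditional BFT,
    2f + 1 for hybrid protocols that rely on a trusted subsystem.
    """
    return 2 * f + 1 if hybrid else 3 * f + 1


class TrustedCounter:
    """Hypothetical trusted subsystem, assumed to fail only by crashing.

    It binds every message to a unique, strictly increasing counter value,
    so an otherwise untrusted replica cannot certify two different messages
    under the same value.
    """

    def __init__(self, key: bytes) -> None:
        self._key = key      # secret shared among trusted subsystems (assumption)
        self._value = 0      # monotonic counter, never reset or reused

    def certify(self, message: bytes) -> Tuple[int, bytes]:
        """Assign the next counter value to `message` and return (value, tag)."""
        self._value += 1
        return self._value, self._tag(self._key, self._value, message)

    @staticmethod
    def verify(key: bytes, value: int, message: bytes, tag: bytes) -> bool:
        """Check that `tag` certifies `message` under counter `value`."""
        expected = TrustedCounter._tag(key, value, message)
        return hmac.compare_digest(expected, tag)

    @staticmethod
    def _tag(key: bytes, value: int, message: bytes) -> bytes:
        # HMAC over the counter value followed by the message.
        data = value.to_bytes(8, "big") + message
        return hmac.new(key, data, hashlib.sha256).digest()
```

A replica would attach the returned `(value, tag)` pair to each protocol message. Receivers that observe a gap or a repeated value can then detect that something was withheld or replayed. This is the intuition behind why a small trusted part can lower the number of replicas required.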

Published in

EuroSys '17: Proceedings of the Twelfth European Conference on Computer Systems
April 2017
648 pages
ISBN: 9781450349383
DOI: 10.1145/3064176

Copyright © 2017 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 110 of 657 submissions (17%)