research-article

Decentralized Fault-Tolerant Event Correlation

Published: 07 August 2014

Abstract

Despite the anticipated use of event correlation techniques for monitoring critical complex infrastructures or dealing with disasters in the physical world, little work exists on making event correlation systems themselves tolerant to failures. Existing systems either provide no guarantees on event deliveries, do not support multicast and thus provide no guarantees across individual processes, or rely on centralized components or strong assumptions about the infrastructure.

The FAIDECS system attempts to reconcile strong guarantees with practical performance in the presence of process crash failures. To that end, FAIDECS uses an overlay network whose guarantees are aligned with those of its proposed correlation language. However, the proposed language lacks expressiveness, and the system itself supports only very specific, rigid semantics, incapable of supporting even fundamental features such as sliding windows.

After providing a comprehensive overview of the FAIDECS model and system, this article bridges the gap between strong guarantees and more established correlation languages and systems in several steps. First, we propose alternative semantics for several modules of the FAIDECS matching engine and revisit guarantees. Second, we pinpoint which guarantees are contradicted by which combinations of semantic options. Third, we investigate four correlation languages—StreamSQL, EQL, CEL, and TESLA—showing which semantic options their respective features correspond to in our model, and thus, ultimately, which guarantees of FAIDECS are maintained by which language features.

      • Published in

        ACM Transactions on Internet Technology  Volume 14, Issue 1
        Special Issue on Event Recognition
        July 2014
        161 pages
        ISSN:1533-5399
        EISSN:1557-6051
        DOI:10.1145/2659232
        • Editor:
        • Munindar P. Singh

        Copyright © 2014 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 7 August 2014
        • Accepted: 1 April 2014
        • Revised: 1 March 2014
        • Received: 1 October 2013

        Qualifiers

        • research-article
        • Research
        • Refereed
