Abstract
Recognising patterns that correlate multiple events over time is becoming increasingly important in applications that exploit the Internet of Things, ranging from urban transportation and surveillance monitoring to business workflows. In many real-world scenarios, however, event timestamps may be recorded erroneously, and events may be dropped from a stream due to network failures or load-shedding policies. In this work, we present SimpMatch, a novel simplex-based algorithm for the probabilistic evaluation of event queries using constraints over event orderings in a stream. Our approach avoids learning probability distributions for time-points or occurrence intervals. Instead, we employ the abstraction of segmented intervals and compute the probability of a sequence of such segments using the notion of order statistics. The algorithm runs in time linear in the number of lost events and achieves high accuracy: it yields exact results if event generation follows a Poisson process and provides a good approximation otherwise. We demonstrate empirically that SimpMatch enables efficient and effective reasoning over event streams, outperforming state-of-the-art methods for probabilistic evaluation of event queries by up to two orders of magnitude.
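To illustrate why order statistics fit the Poisson case, the following minimal sketch relies on a standard property: given that a homogeneous Poisson process produced n events in a window, the sorted event times are distributed like the order statistics of n independent uniform draws over that window. The sketch is not SimpMatch itself. The function names (`order_stat_cdf`, `prob_kth_event_in_segment`) and the single-segment setting are assumptions made for illustration only.

```python
from math import comb


def order_stat_cdf(k: int, n: int, x: float) -> float:
    """P(U_(k) <= x) for the k-th smallest of n i.i.d. Uniform(0, 1) draws.

    The k-th smallest value is at most x exactly when at least k of the
    n draws are at most x, which is a binomial tail probability.
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    x = min(max(x, 0.0), 1.0)  # clamp to the unit interval
    return sum(comb(n, j) * x**j * (1.0 - x) ** (n - j) for j in range(k, n + 1))


def prob_kth_event_in_segment(
    k: int, n: int, start: float, end: float, horizon: float
) -> float:
    """Probability that the k-th of n events in [0, horizon] falls in [start, end].

    Hypothetical helper: assumes the n event times behave like uniform
    order statistics on [0, horizon] (the Poisson-process case).
    """
    if horizon <= 0 or not 0 <= start <= end <= horizon:
        raise ValueError("need 0 <= start <= end <= horizon and horizon > 0")
    return order_stat_cdf(k, n, end / horizon) - order_stat_cdf(k, n, start / horizon)


if __name__ == "__main__":
    # Two lost events in a 10-second window: chance the earlier one
    # occurred within the first 5 seconds.
    print(prob_kth_event_in_segment(k=1, n=2, start=0.0, end=5.0, horizon=10.0))
```

Each call sums at most n binomial terms. Nothing in this sketch reproduces the segment-sequence computation or the linear-time bound claimed for SimpMatch.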
Index Terms
Interval-based Queries over Lossy IoT Event Streams