Abstract
In the industrial Internet of Things domain, applications are moving from the Cloud into the Edge, closer to the devices producing and consuming data. This means that applications move from the scalable and homogeneous Cloud environment into a potentially constrained heterogeneous Edge network. Making Edge applications reliable enough to fulfill Industry 4.0 use cases remains an open research challenge. Maintaining operation of an Edge system requires advanced management techniques to mitigate the failure of devices. This article tackles this challenge with a twofold approach: (1) a policy-enabled failure detector that enables adaptable failure detection and (2) an allocation component for the efficient selection of failure mitigation actions. The parameters and performance of the failure detection approach are evaluated, and the performance of an energy-efficient allocation technique is measured. Finally, a vision for a complete system and an example use case are presented.
Index Terms
Optimally Self-Healing IoT Choreographies