Abstract
A new generation of cyber-physical systems has emerged with a large number of devices that continuously generate and consume massive amounts of data in a distributed and mobile manner. Accurate and near real-time decisions based on such streaming data are in high demand in many areas of optimization for such systems. Edge data analytics brings processing power close to the data sources, reduces the network delay for data transmission, allows large-scale distributed training, and consequently helps meet real-time requirements. Nevertheless, the multiplicity of data sources leads to multiple distributed machine learning models that may suffer from sub-optimal performance due to inconsistency in their states. In this work, we tackle the insularity, concept drift, and connectivity issues in edge data analytics to minimize its accuracy handicap without losing its timeliness benefits. To this end, we propose an efficient model synchronization mechanism for distributed and stateful data analytics. Staleness Control for Edge Data Analytics (SCEDA) keeps the synchronization frequency highly adaptable in the face of an unpredictable environment by addressing the trade-off between the generality and timeliness of the model. By making use of online reinforcement learning, SCEDA has low computational overhead, automatically adapts to changes, and does not require additional data monitoring.
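The abstract does not say which reinforcement learning algorithm, state space, or reward SCEDA uses. As a rough illustration of the general idea only (learning online how often to synchronize), here is a minimal sketch that assumes tabular Q-learning over a small set of candidate synchronization intervals. The names `SyncIntervalAgent`, `toy_reward`, and `train`, the two-level drift state, and the reward formula are all hypothetical and are not taken from the paper.

```python
import random


class SyncIntervalAgent:
    """Tabular Q-learning over a small, hypothetical set of sync intervals."""

    def __init__(self, intervals, n_states, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.intervals = list(intervals)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # One row of action values per (assumed) drift state.
        self.q = [[0.0] * len(self.intervals) for _ in range(n_states)]

    def choose(self, state):
        """Epsilon-greedy choice of an interval index."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.intervals))
        row = self.q[state]
        return max(range(len(row)), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])

    def best_interval(self, state):
        """Interval with the highest learned value in the given state."""
        row = self.q[state]
        return self.intervals[max(range(len(row)), key=row.__getitem__)]


def toy_reward(interval, drift):
    """Assumed reward: stale models cost more under drift, syncing costs bandwidth."""
    staleness_penalty = drift * interval
    sync_cost = 1.0 / interval
    return -(staleness_penalty + sync_cost)


def train(agent, drift_by_state, steps_per_state):
    """Toy training loop; each drift regime is assumed to persist."""
    for state, drift in enumerate(drift_by_state):
        for _ in range(steps_per_state):
            action = agent.choose(state)
            reward = toy_reward(agent.intervals[action], drift)
            # Simplification: the regime does not change, so next_state == state.
            agent.update(state, action, reward, state)


if __name__ == "__main__":
    agent = SyncIntervalAgent([1, 5, 10], n_states=2)
    train(agent, drift_by_state=[0.01, 1.0], steps_per_state=5000)
    print("low drift  ->", agent.best_interval(0))
    print("high drift ->", agent.best_interval(1))
```

In this toy setting the agent should come to prefer infrequent synchronization when drift is low and frequent synchronization when drift is high, which mirrors the generality versus timeliness trade-off described above. A real deployment would derive the state and reward from observed model accuracy and network conditions rather than from a fixed formula.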
Staleness Control for Edge Data Analytics
SIGMETRICS '20: Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems