Abstract
In urban Internet of Things (IoT) environments, data generated in real time can be processed by analytical applications in either online or offline mode. From the perspective of runtime environment management, both modes can hardly be supported in a unified framework under multiple constraints such as latency, utility, and quality of service (QoS). Meanwhile, from the perspective of application-specific optimization, it is difficult for current infrastructure to allocate sufficient resources efficiently to the tasks of an application while simultaneously considering multiple factors such as data size, velocity, and locality. In this article, two task allocation methods are proposed for batch and stream analytics to improve resource utility with an auto-scaling guarantee when an analytical application is submitted or sudden workloads appear. Taking the highway domain as an example, the task allocation methods are implemented in a novel combined framework. Using both real-world and simulated data, extensive experiments show that our methods can improve utility efficiency with effective offload support.
Task Allocation in Hybrid Big Data Analytics for Urban IoT Applications