Abstract
The combination of the Internet of Things (IoT) and Edge Computing (EC) can support the delivery of novel applications that facilitate end-users’ activities. Data collected by the numerous devices present in the IoT infrastructure can be hosted on a set of EC nodes, where they become the subject of processing tasks that produce analytics. Analytics are derived as the result of queries defined by end-users or applications. Such queries can be executed on the available EC nodes to limit the latency of responses. In this article, we propose a meta-ensemble learning scheme that supports decision making for the allocation of queries to the appropriate EC nodes. Our learning model bases its decisions on the characteristics of both queries and nodes. We describe a matching process between queries and nodes that is applied after deriving the contextual information for each characteristic adopted in our meta-ensemble scheme. We rely on widely known ensemble models, combine them, and add a further processing layer to increase performance. The aim is to derive a subset of EC nodes that will host each incoming query. Apart from describing the proposed model, we report on its evaluation and the corresponding results. Through a large set of experiments and a numerical analysis, we aim to reveal the pros and cons of the proposed scheme.
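To make the allocation idea in the abstract more concrete, the following is a minimal sketch and not the authors' model. It assumes each query and node is described by a few characteristics scaled to [0, 1]. The characteristic names (`cpu`, `memory`, `data_coverage`), the shortfall-based matching rule, the three simple aggregators standing in for the ensemble models, and the averaging meta layer are all hypothetical choices made for illustration only.

```python
from statistics import mean, median

# Stand-ins for the base ensemble models (an assumption, not from the article):
# each one aggregates the per-characteristic match scores into a single value.
BASE_AGGREGATORS = (mean, median, min)


def match_scores(query, node):
    """Score each characteristic in [0, 1]; only a shortfall of the node is penalized."""
    return [1.0 - max(0.0, query[k] - node[k]) for k in sorted(query)]


def meta_score(query, node):
    """Hypothetical meta layer: average the outputs of the base aggregators."""
    scores = match_scores(query, node)
    base_outputs = [aggregate(scores) for aggregate in BASE_AGGREGATORS]
    return mean(base_outputs)


def allocate(query, nodes, k=2):
    """Return the names of the k nodes with the highest meta score."""
    ranked = sorted(nodes, key=lambda name: meta_score(query, nodes[name]), reverse=True)
    return ranked[:k]


# Illustrative values only: what the query needs versus what each node offers.
query = {"cpu": 0.6, "memory": 0.8, "data_coverage": 0.5}
nodes = {
    "n1": {"cpu": 0.9, "memory": 0.9, "data_coverage": 0.7},
    "n2": {"cpu": 0.2, "memory": 0.5, "data_coverage": 0.4},
    "n3": {"cpu": 0.6, "memory": 0.7, "data_coverage": 0.9},
}

print(allocate(query, nodes, k=2))
```

The sketch returns a subset of nodes per query, mirroring the stated aim of the scheme. A real system would replace the toy aggregators with trained ensemble models and a learned combination layer.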
An Intelligent Edge-centric Queries Allocation Scheme based on Ensemble Models