eDiaPredict: An Ensemble-based Framework for Diabetes Prediction

Published: 14 June 2021

Abstract

Medical systems increasingly incorporate computational intelligence in healthcare. Machine learning techniques are applied to predict the onset and recurrence of disease and to identify biomarkers for survivability analysis based on a patient's health conditions. Early prediction of diseases such as diabetes is essential because the number of diabetic patients in all age groups is increasing rapidly. Identifying the underlying reasons for the onset of diabetes at an early stage has become a challenging task for medical practitioners. The continuously growing volume of diabetic patient data has necessitated the application of efficient machine learning algorithms that learn from trends in the underlying data and recognize critical conditions in patients. In this article, an ensemble-based framework named eDiaPredict is proposed. It uses ensemble modeling, combining different machine learning algorithms (XGBoost, Random Forest, Support Vector Machine, Neural Network, and Decision Tree) to predict diabetes status among patients. The performance of eDiaPredict is evaluated using accuracy, sensitivity, specificity, Gini index, precision, area under the curve, area under the convex hull, minimum error rate, and minimum weighted coefficient. The effectiveness of the proposed approach is demonstrated on the PIMA Indian diabetes dataset, on which it achieves an accuracy of 95%.
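The abstract does not say how eDiaPredict combines its base learners, so the following is only a minimal sketch of one common option, hard majority voting, together with four of the evaluation measures named above (accuracy, sensitivity, specificity, precision). The function names `majority_vote` and `binary_metrics`, the toy labels, and the per-model predictions are all hypothetical and are not taken from the article; real base learners would be trained on the PIMA data rather than hard-coded.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by hard voting."""
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        # With an odd number of binary voters a tie is impossible.
        combined.append(votes.most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = diabetic, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # true negative rate
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical toy data: 8 patients, 5 base learners (assumed, not from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_outputs = [
    [1, 0, 1, 0, 0, 1, 1, 0],  # stand-in for XGBoost
    [1, 0, 0, 1, 0, 0, 1, 0],  # stand-in for Random Forest
    [1, 1, 1, 1, 0, 0, 0, 0],  # stand-in for Support Vector Machine
    [0, 0, 1, 1, 0, 0, 1, 1],  # stand-in for Neural Network
    [1, 0, 1, 1, 1, 0, 1, 0],  # stand-in for Decision Tree
]

y_ensemble = majority_vote(model_outputs)
ensemble_scores = binary_metrics(y_true, y_ensemble)
first_model_scores = binary_metrics(y_true, model_outputs[0])

print("ensemble:", ensemble_scores)
print("first base learner:", first_model_scores)
```

In this toy setup each base learner makes one or two mistakes, but they fall on different patients, so the vote corrects them. That is the usual motivation for ensembling; it says nothing about the results reported in the article.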

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 2s
June 2021
349 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3465440

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 14 June 2021
• Accepted: 1 August 2020
• Revised: 1 July 2020
• Received: 1 January 2020
Qualifiers

• research-article
• Refereed