Abstract
Medical systems increasingly incorporate modern computational intelligence in healthcare. Machine learning techniques are applied to predict the onset and recurrence of disease and to identify biomarkers for survivability analysis based on a patient's health conditions. Early prediction of diseases like diabetes is essential, as the number of diabetic patients across all age groups is increasing rapidly. Identifying the underlying causes of diabetes at an early stage has become a challenging task for medical practitioners. The continuously growing volume of diabetic patient data calls for efficient machine learning algorithms that learn from trends in the underlying data and recognize critical conditions in patients. In this article, an ensemble-based framework named eDiaPredict is proposed. It combines several machine learning algorithms, namely XGBoost, Random Forest, Support Vector Machine, Neural Network, and Decision Tree, in an ensemble to predict diabetes status among patients. The performance of eDiaPredict is evaluated using metrics including accuracy, sensitivity, specificity, Gini index, precision, area under the curve, area under the convex hull, minimum error rate, and minimum weighted coefficient. The effectiveness of the proposed approach is demonstrated on the PIMA Indian diabetes dataset, where it achieves an accuracy of 95%.
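To make the ensemble idea and a few of the listed metrics concrete, the following is a minimal sketch in plain Python. It is not the eDiaPredict implementation: the abstract does not say how the base models' outputs are combined, so simple majority voting is assumed here, and the five prediction lists and the labels are made-up toy values standing in for the outputs of five trained classifiers. The function names `majority_vote` and `binary_metrics` are hypothetical.

```python
def majority_vote(predictions_per_model):
    """Combine 0/1 predictions from several models by simple majority.

    Assumption: the abstract does not specify the combination rule;
    majority voting is used here purely for illustration.
    """
    n_models = len(predictions_per_model)
    # zip(*...) groups the votes that all models cast for the same sample.
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*predictions_per_model)]


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and precision."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Toy data (hypothetical): 1 = diabetic, 0 = non-diabetic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_predictions = [
    [1, 0, 1, 0, 0, 0, 1, 1],  # stand-in for model 1
    [1, 0, 0, 1, 0, 1, 1, 0],  # stand-in for model 2
    [1, 1, 1, 1, 0, 0, 0, 0],  # stand-in for model 3
    [0, 0, 1, 1, 0, 0, 1, 1],  # stand-in for model 4
    [1, 0, 1, 0, 1, 0, 1, 1],  # stand-in for model 5
]

ensemble_pred = majority_vote(model_predictions)
metrics = binary_metrics(y_true, ensemble_pred)
print(ensemble_pred)
print(metrics)
```

On this toy data each individual model makes two mistakes, while the majority vote makes only one, which illustrates why combining diverse classifiers can improve on its members. An odd number of voters avoids ties for binary labels.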
eDiaPredict: An Ensemble-based Framework for Diabetes Prediction