Prediction of Diabetes Using Machine Learning Techniques

Authors

  • F Akinbohun, Department of Computer Science, Rufus Giwa Polytechnic, Owo, Nigeria
  • IP Adegun, Department of Computer Science, Federal University of Technology Akure, Nigeria

Keywords:

Diabetes Mellitus, Consistency-based Feature Selection, Correlation-based Feature Selection, Machine learning, prediction

Abstract

Diabetes Mellitus is a metabolic disorder that occurs when the body cannot produce sufficient insulin. Its prevalence has surged worldwide, creating a need for improved methods of early and accurate prediction. Machine learning (ML) techniques have proven effective for this task, and this study applies them to the prediction of diabetes. To improve learning efficiency and prediction performance, feature selection was employed: this process keeps only the features that contribute most to the predicted variable and discards the rest of the feature set. Three machine learning algorithms (Support Vector Machine, Random Forest and Decision Tree) were applied to the Pima Indians diabetes dataset. Consistency-based and correlation-based feature selection techniques were applied to the dataset to improve prediction performance and reduce dimensionality. The experimental results show that the performance of all three models improved when feature selection was used. For instance, Support Vector Machine had an accuracy of 81.74% after feature selection, as opposed to 79.13% before its application. Random Forest had an accuracy of 80.08% using the consistency-based feature selection method, as opposed to 77.78% before its application.
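
As an illustration of the correlation-based feature selection idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a Pearson-correlation variant of the CFS merit score with a greedy forward search, and it runs on hypothetical synthetic data rather than the Pima Indians dataset. The function names (`pearson`, `cfs_merit`, `cfs_forward_select`) and the four synthetic features are illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(columns, target, subset):
    """CFS merit k*r_cf / sqrt(k + k*(k-1)*r_ff), using absolute correlations.

    r_cf is the mean feature-target correlation, r_ff the mean
    feature-feature correlation within the subset.
    """
    k = len(subset)
    r_cf = sum(abs(pearson(columns[i], target)) for i in subset) / k
    if k == 1:
        return r_cf
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, target):
    """Greedy forward search: add the feature that raises merit most; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        scored = [(cfs_merit(columns, target, selected + [i]), i) for i in remaining]
        merit, idx = max(scored)
        if merit <= best:
            break
        selected.append(idx)
        remaining.remove(idx)
        best = merit
    return selected, best


# Hypothetical synthetic data (an assumption for illustration, not the Pima dataset):
# f0 and f2 drive the label, f1 is a near-copy of f0 (redundant), f3 is pure noise.
rng = random.Random(0)
n = 1000
f0 = [rng.gauss(0, 1) for _ in range(n)]
f1 = [a + 0.05 * rng.gauss(0, 1) for a in f0]
f2 = [rng.gauss(0, 1) for _ in range(n)]
f3 = [rng.gauss(0, 1) for _ in range(n)]
target = [1 if a + b > 0 else 0 for a, b in zip(f0, f2)]
columns = [f0, f1, f2, f3]

selected, merit = cfs_forward_select(columns, target)
print("selected feature indices:", sorted(selected), "merit:", round(merit, 3))
```

In a workflow like the one the abstract describes, the selected columns would then be passed to each classifier, and the resulting accuracy compared against training on the full feature set. The merit score rewards features that correlate with the label and penalises features that correlate with each other, which is why a redundant or noisy column tends to be left out.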

Published

2012-11-17