Research Article Open Access

Improving Diabetes Risk Prediction Using Ensemble Boosting and SMOTE-Based Class Balancing

Kittipol Wisaeng1, Pankom Sriboonlue1 and Benchalak Muangmeesri2
  • 1 Department of Technology and Business Information System Unit, MSU Research Laboratory of Blockchain and Artificial Intelligence for Interdisciplinary Innovation, Mahasarakham Business School, Mahasarakham University, Mahasarakham, Thailand
  • 2 Department of Engineering Management, Suan Sunandha Rajabhat University, 1 U-Thong nok Road, Dusit, Bangkok 10300, Thailand

Abstract

Accurate diabetes prediction is vital for early intervention, optimized resource allocation, and minimizing long-term complications. This study presents a comparative evaluation of traditional and advanced machine learning models for diabetes classification using a structured clinical dataset. Seven baseline algorithms were assessed against five advanced ensemble methods: CatBoost, LightGBM, XGBoost, Voting Ensemble, and Stacking Ensemble. To improve model training, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to balance the classes, and the features were normalized. Model performance was evaluated using accuracy, precision, recall, and the F1 score. Results show that the advanced models substantially outperformed the traditional ones, with CatBoost achieving the highest F1 score of 0.7625. Feature importance analysis identified glucose, BMI, and age as the most influential indicators, consistent with clinical evidence. These findings demonstrate the potential of ensemble learning and boosting strategies for building interpretable, scalable, and effective diagnostic support tools in healthcare settings.
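
For readers unfamiliar with the two techniques named in the abstract, the following is a minimal sketch of the idea behind SMOTE (interpolating between a minority-class sample and one of its nearest minority-class neighbours) and of how precision, recall, and the F1 score are computed from binary predictions. It is not the authors' implementation: the function names `smote_sketch` and `precision_recall_f1`, the parameter defaults, and the toy data are all assumptions made for illustration, and the code uses only the Python standard library.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (illustrative sketch only).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Rank the remaining minority samples by squared Euclidean distance.
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, positive class = 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: four minority-class points with two features each.
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
new_points = smote_sketch(minority, n_new=5)

# Hypothetical labels and predictions, unrelated to the study's results.
precision, recall, f1 = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the evaluation data; whether the study followed that protocol is not stated in the abstract.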

Journal of Computer Science
Volume 22 No. 1, 2026, 61-74

DOI: https://doi.org/10.3844/jcssp.2026.61.74

Submitted On: 17 June 2025 Published On: 1 February 2026

How to Cite: Wisaeng, K., Sriboonlue, P. & Muangmeesri, B. (2026). Improving Diabetes Risk Prediction Using Ensemble Boosting and SMOTE-Based Class Balancing. Journal of Computer Science, 22(1), 61-74. https://doi.org/10.3844/jcssp.2026.61.74

Keywords

  • Diabetes Prediction
  • Ensemble Learning
  • Voting Classifier
  • SMOTE