Leveraging Ensemble Models for Optimizing Predictive Accuracy of Low Birthweight Risk in Kenya.

Opiyo, Victor; Anyika, Emma

dc.contributor.author	Opiyo, Victor
dc.contributor.author	Anyika, Emma
dc.date.accessioned	2026-01-15T11:59:40Z
dc.date.available	2026-01-15T11:59:40Z
dc.date.issued	2025-10-27
dc.identifier.uri	https://doi.org/10.11648/j.ajai.20250902.22
dc.identifier.uri	https://repository.cuk.ac.ke/handle/123456789/1885
dc.description	A research article published in the American Journal of Artificial Intelligence.	en_US
dc.description.abstract	Low birth weight (LBW) is a prevalent public health challenge in low- and middle-income countries, including Kenya, where approximately 11.5% of newborns are affected. LBW is linked to heightened infant mortality, infections, and long-term developmental issues. While machine learning (ML), particularly ensemble learning, has demonstrated potential in improving LBW risk prediction, its application in resource-limited settings like Kenya remains underexplored. Prior research has largely focused on developed countries with limited adoption in sub-Saharan Africa, highlighting a crucial gap this study aims to address. This research develops and evaluates ensemble machine learning models to predict LBW risk using nationally representative data from the 2022 Kenya Demographic and Health Survey. The study integrates traditional clinical indicators with advanced computational methods, employing base classifiers such as Support Vector Machines and Logistic Regression alongside ensemble methods including Random Forest, Gradient Boosting, and Extreme Gradient Boosting. Meta-ensemble approaches such as bagging, voting, and stacking were also assessed. Data preprocessing included treatment of missing values, encoding categorical variables, and addressing class imbalance through the Synthetic Minority Over-sampling Technique (SMOTE). Models were trained and validated using stratified cross-validation and independent testing, with evaluation metrics comprising ROC AUC, accuracy, F1-score, Matthews Correlation Coefficient, and Brier score, emphasizing both discrimination and calibration. Results indicate that Random Forest outperformed other models, achieving a high ROC AUC of 0.957 and PR AUC of 0.971, with excellent calibration (Brier score 0.089), evidencing its strong predictive capability for LBW risk in the Kenyan context. 
Important predictors identified were gestational age, maternal height and weight, antenatal care utilization, and socioeconomic factors, consistent with known biological and contextual determinants. Ethical considerations regarding patient privacy, algorithmic fairness, and transparency were incorporated to promote responsible AI use in healthcare. The findings demonstrate that tailored ensemble learning models provide robust, interpretable, and practical tools for LBW prediction in low-resource settings. This work fills a critical research gap by applying advanced ML methods to Kenyan maternal-child health data, offering potential to enhance clinical decision-making and improve maternal and neonatal outcomes. The study underscores the importance of contextualized AI solutions and ethical governance for sustainable healthcare innovation.	en_US
dc.language.iso	en	en_US
dc.publisher	Science Publishing Group	en_US
dc.relation.ispartofseries	2025, Vol. 9, No. 2;pp. 217-228
dc.subject	Low Birth Weight.	en_US
dc.subject	Ensemble Learning.	en_US
dc.subject	Machine Learning.	en_US
dc.subject	Predictive Modelling.	en_US
dc.subject	Kenya.	en_US
dc.title	Leveraging Ensemble Models for Optimizing Predictive Accuracy of Low Birthweight Risk in Kenya.	en_US
dc.type	Article	en_US
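
The abstract reports the Brier score (calibration) and the Matthews Correlation Coefficient (classification quality on imbalanced classes) among its evaluation metrics. As a minimal sketch of what those two metrics compute, assuming binary labels coded 1 for LBW and 0 otherwise, the following plain-Python functions may help. The function names and the four-record toy data are illustrative assumptions only; they are not the authors' code and are not taken from the Kenya Demographic and Health Survey.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and the 0/1 outcome.

    Lower is better; 0.0 means every probability matched its outcome exactly.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0 or 1).

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = low birth weight, 0 = normal weight.
    labels = [1, 0, 1, 0]
    probabilities = [0.9, 0.2, 0.6, 0.4]
    # Threshold the probabilities at 0.5 to get hard class predictions.
    predictions = [1 if p >= 0.5 else 0 for p in probabilities]

    print("Brier score:", brier_score(labels, probabilities))
    print("MCC:", matthews_corrcoef(labels, predictions))
```

The Brier score is computed from predicted probabilities, while the MCC needs hard class predictions, so the sketch thresholds the probabilities at 0.5 first. That threshold is an assumption of this example; the abstract does not state which threshold the study used.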