DSpace Repository

Leveraging ensemble models for optimizing predictive accuracy of low birthweight risk in Kenya.


dc.contributor.author Otieno Opiyo, Victor
dc.date.accessioned 2026-07-01T09:13:41Z
dc.date.available 2026-07-01T09:13:41Z
dc.date.issued 2025
dc.identifier.uri https://repository.cuk.ac.ke/handle/123456789/1958
dc.description A project submitted to the Department of Computer Science & Information Technology in the School of Computing and Mathematics in partial fulfilment of the requirements for the award of the degree of Master of Science in Data Science of the Co-operative University of Kenya. en_US
dc.description.abstract Low birth weight (LBW) remains a significant public health concern in Kenya, affecting approximately 11.5% of infants and contributing to high infant mortality and poor long-term health. Accurate prediction of LBW risk is crucial for timely intervention and better neonatal health outcomes. The objective of this study was to develop and evaluate ensemble machine learning models that predict LBW risk using nationally representative data from the 2022 Kenya Demographic and Health Survey (KDHS). A comprehensive preprocessing pipeline handled missing values, encoded categorical variables, and addressed class imbalance using the Synthetic Minority Over-sampling Technique (SMOTE). Base models (Support Vector Machine and Logistic Regression) and ensemble models (Random Forest, Gradient Boosting, and Extreme Gradient Boosting) were trained and compared. Meta-ensemble methods (bagging, voting, and stacking classifiers) were also evaluated. Models were assessed using stratified cross-validation, and performance was evaluated on an independent test set using ROC AUC, F1-score, and Brier score. The Random Forest classifier achieved the highest ROC AUC (0.957) with reasonable calibration (Brier score of 0.089), outperforming both the base and meta-ensemble models. The key predictors identified were gestational age, maternal anthropometrics (height and weight), and antenatal care attendance, which is consistent with their biological and contextual relevance to LBW risk in Kenya. The study highlights the significance of contextualized AI solutions and ethical governance in sustainable healthcare innovation. These results indicate that ensemble learning methods, applied to a specific target population, can improve LBW risk prediction in low-resource regions.
Developing interpretable and stable models can guide clinical decision-making and focused interventions, with the long-term objective of improving maternal and neonatal health outcomes in Kenya and other contexts. en_US
dc.language.iso en en_US
dc.publisher Cuk en_US
dc.title Leveraging ensemble models for optimizing predictive accuracy of low birthweight risk in Kenya. en_US
dc.type Thesis en_US
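
The abstract reports ROC AUC, F1-score, and Brier score as evaluation metrics. As a rough illustration of what those three numbers measure, the sketch below computes them in plain Python on a small made-up set of labels and predicted probabilities. It is not the thesis code: the function names, the 0.5 decision threshold, and the toy data are all assumptions chosen for illustration, and none of the values relate to the KDHS results.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and the 0/1 outcome.

    Lower is better; 0.0 means perfectly confident and correct predictions.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive case is scored above a random
    negative case, counting ties as half (pairwise definition of AUC)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative case")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> float:
    """Harmonic mean of precision and recall after thresholding probabilities.

    The 0.5 default threshold is an assumption, not a value from the thesis.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 1 = low birth weight, 0 = normal birth weight.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]

if __name__ == "__main__":
    print("ROC AUC:", roc_auc(toy_labels, toy_probs))
    print("F1-score:", f1_score(toy_labels, toy_probs))
    print("Brier score:", brier_score(toy_labels, toy_probs))
```

ROC AUC depends only on how cases are ranked, F1-score depends on the chosen threshold, and the Brier score depends on the probability values themselves, which is why a study may report all three side by side.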

