<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
<title>Department of Computing Science and Information Technology (DCSIT)</title>
<link href="https://repository.cuk.ac.ke/handle/123456789/617" rel="alternate"/>
<subtitle/>
<id>https://repository.cuk.ac.ke/handle/123456789/617</id>
<updated>2026-06-03T03:16:39Z</updated>
<dc:date>2026-06-03T03:16:39Z</dc:date>
<entry>
<title>Privacy preserving data governance in cross border Telemedicine using federated learning and differential Privacy in Kenya.</title>
<link href="https://repository.cuk.ac.ke/handle/123456789/1946" rel="alternate"/>
<author>
<name>Michael, Meyo Otieno</name>
</author>
<id>https://repository.cuk.ac.ke/handle/123456789/1946</id>
<updated>2026-05-28T12:43:45Z</updated>
<published>2025-01-01T00:00:00Z</published>
<summary type="text">Privacy preserving data governance in cross border Telemedicine using federated learning and differential Privacy in Kenya.
Michael, Meyo Otieno
This thesis presents an auditable, privacy-preserving learning workflow for Kenyan cross-border telemedicine. Hospitals train models locally and share only model signals, so raw EHRs remain in-country. Using synthetic Synthea EHRs, 3,459 records are partitioned across seven hospitals in Kenya, Tanzania, and Uganda to compare a centralized baseline, federated learning (FL), and FL with client-side differential privacy (DP). Random Forests are trained per site; probability-level fusion forms a global prediction without parameter averaging. The threat model covers a black-box external adversary and an honest-but-curious coordinator. We quantify privacy risk with membership-inference AUC and a model-inversion attack, and we log ε, δ, clipping C, noise σ, model hashes, rounds, and attack scores in an ε-register for audit. FL improves utility while maintaining localization: accuracy rises from 0.616 to 0.682 and F1 from 0.706 to 0.772, with positive-class recall reaching 0.844. Adding DP at ε = 0.30 reduces model-inversion success from 0.696 (centralized) to 0.638 (FL+DP), a drop of 0.058 in absolute terms (about 8% relative), with membership-inference AUC near 0.50 (≈ random). Utility depends on the privacy budget and remains tunable: at ε = 0.30, accuracy is near 0.530 and F1 near 0.593. The originality is practical: DP-bounded FL is paired with an attacker simulator and an ε-register that turns privacy into an operational, auditable control aligned with Kenya’s Data Protection Act and GDPR transfer principles. The dataset is synthetic and not clinically validated for East African representativeness, so the results indicate technical feasibility only; a regulated hospital pilot is the next step.
A thesis submitted to the Department of Computer Science and Information Technology in the School of Mathematics and Computing in partial fulfillment of the requirements for the award of the degree of Master of Science in Cyber Security of the Co-operative University of Kenya.
</summary>
<dc:date>2025-01-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Ensemble machine learning model for predicting used car prices in Kenya.</title>
<link href="https://repository.cuk.ac.ke/handle/123456789/1945" rel="alternate"/>
<author>
<name>Moses, Onserio James</name>
</author>
<id>https://repository.cuk.ac.ke/handle/123456789/1945</id>
<updated>2026-05-28T12:27:03Z</updated>
<published>2025-01-01T00:00:00Z</published>
<summary type="text">Ensemble machine learning model for predicting used car prices in Kenya.
Moses, Onserio James
Most Kenyan car owners prefer used vehicles due to their affordability, leading to a booming used car market. However, the absence of an objective pricing mechanism has led to inconsistent and subjective pricing, with prices varying significantly from seller to seller. This research aimed to provide a data-driven solution by incorporating key vehicle attributes. Using the Design Science Research (DSR) methodology, the research implemented Random Forest (RF), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Gradient Boosting (GB), and Linear Regression as base models, with permutation-based feature explanation to enhance accuracy and interpretability. The individual models were trained and evaluated using 5-fold cross-validation. Random Forest performed best, with a mean absolute error (MAE) of 0.1174, and Linear Regression performed worst, with an MAE of 0.2635. For performance optimization, the four best baseline models (RF, SVM, KNN, and GB) were combined using a Stacking Regressor, which achieved an R-squared score of 0.9725, an MAE of 0.1137, and a mean squared error (MSE) of 0.2171, an improvement in predictive performance over the individual models. Feature importance analysis identified mileage, car age, annual insurance, engine size, and usage type (Kenyan/Foreign) as the most influential features. These findings are significant because they demonstrate that machine learning can provide an objective, reliable, and data-driven pricing mechanism for the Kenyan used car market. The results offer practical value to car buyers, sellers, dealerships, and insurance companies by reducing pricing disparities, improving transparency, and supporting more informed decision-making. The developed ensemble model can therefore be applied as a practical tool for accurate price estimation, contributing to fairness, efficiency, and better market regulation in Kenya’s used car industry.
A project submitted to the Department of Computer Science and Information Technology in the School of Computing and Mathematics in partial fulfillment of the requirements for the award of the degree of Master of Science in Data Science of the Co-operative University of Kenya.
</summary>
<dc:date>2025-01-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>A hybrid machine learning model for predicting diseases in coffee production.</title>
<link href="https://repository.cuk.ac.ke/handle/123456789/1944" rel="alternate"/>
<author>
<name>Munyao, Joseph Kioko</name>
</author>
<id>https://repository.cuk.ac.ke/handle/123456789/1944</id>
<updated>2026-05-28T09:51:16Z</updated>
<published>2025-01-01T00:00:00Z</published>
<summary type="text">A hybrid machine learning model for predicting diseases in coffee production.
Munyao, Joseph Kioko
In Kenya, coffee farming faced various challenges, including widespread pests and diseases that endangered both the quality and yield of coffee. Traditional farming mechanisms were unable to provide timely interventions, which led to economic losses for farmers. The main objective of this study was to develop a hybrid machine learning model for the accurate prediction of coffee diseases in Kenya. The developed hybrid model combined Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to enhance the accuracy of disease prediction in coffee crops. CNN extracted spatial features from leaf images, while LSTM captured temporal patterns from environmental and agronomic data. This approach enabled early and precise detection of diseases in Kenyan coffee farms. By using this solution, coffee farmers were able to improve disease management, thereby optimizing coffee yields. The use of this technique was aimed at facilitating the early detection of major potential threats to coffee production in Kenya. The study further outlined the methodologies, outcomes, and long-term implications for local farming and the wider coffee industry in Kenya.
A project submitted to the Department of Computer Science and IT in the School of Computing and Mathematics in partial fulfillment of the requirements for the award of the degree of Master of Science in Data Science of the Co-operative University of Kenya.
</summary>
<dc:date>2025-01-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>A predictive intelligence model for early detection of non-communicable diseases: case study Kitui county.</title>
<link href="https://repository.cuk.ac.ke/handle/123456789/1943" rel="alternate"/>
<author>
<name>Chepngetich, Nicole</name>
</author>
<id>https://repository.cuk.ac.ke/handle/123456789/1943</id>
<updated>2026-05-28T09:28:39Z</updated>
<published>2025-01-01T00:00:00Z</published>
<summary type="text">A predictive intelligence model for early detection of non-communicable diseases: case study Kitui county.
Chepngetich, Nicole
Non-communicable diseases (NCDs) continue to claim lives in Kenya, particularly in low-resource settings like Kitui. In this study, a predictive intelligence model was built from the clinical, demographic, and behavioral information in 68,601 patient records to support early identification of NCDs. Three machine learning models, Logistic Regression, Random Forest, and XGBoost, were trained and tested with 5-fold cross-validation. Random Forest and XGBoost were more accurate (93% each) than Logistic Regression (74%). A hybrid soft-voting ensemble of Random Forest and XGBoost further improved the balance of the classification, giving 0.93 accuracy, 0.81 precision, 0.82 recall, and 0.81 F1-score. The most significant predictors included systolic blood pressure, BMI, and fasting blood sugar. SHAP analysis improved interpretability by showing the effect of the predictors on individual risk scores. The results show that hybrid ML models can reliably assist in the early detection of NCD cases in resource-constrained environments.
A project submitted to the School of Mathematics and Computing in partial fulfillment of the requirements for the award of Master of Science in Data Science of the Co-operative University of Kenya.
</summary>
<dc:date>2025-01-01T00:00:00Z</dc:date>
</entry>
</feed>
