DSpace Repository

A Hybrid Machine Learning Model for Detecting and Preventing Corruption in Kenya’s Public Procurement Contracts

dc.contributor.author Ndolo, Melchizedeck
dc.contributor.author Wanjoya, Anthony
dc.contributor.author Kasyoka, Philemon
dc.date.accessioned 2026-01-08T13:25:39Z
dc.date.available 2026-01-08T13:25:39Z
dc.date.issued 2025-10-10
dc.identifier.uri https://doi.org/10.11648/j.mlr.20251002.14
dc.identifier.uri https://repository.cuk.ac.ke/handle/123456789/1859
dc.description A research article published by Science Publishing Group. en_US
dc.description.abstract Corruption in public procurement undermines fiscal sustainability, distorts competition, and reduces service quality. Conventional anti-corruption controls (manual audits, rule-based checks, and ex-post reviews) struggle to flag sophisticated, evolving fraud patterns in real time. This study proposes and empirically evaluates a hybrid machine-learning (ML) framework that integrates an interpretable supervised model (logistic regression) with a high-accuracy ensemble method (random forest) and unsupervised learning (k-means clustering and anomaly detection) to identify corruption-prone contracts within Kenya’s public procurement ecosystem. Using secondary procurement data (contract values, procurement methods, bidder histories, and award timelines) and text-derived indicators from public audit narratives, we construct features representing red flags such as single-bid tenders, repeated awards, and significant deviations from estimated costs. Logistic regression provides transparent coefficient-level evidence, while random forest captures non-linear interactions; clustering approximates high-risk groupings where labels are incomplete. Results indicate that single-bid tenders, prior supplier allegations, and execution irregularities (e.g., substandard deliveries, unusual extensions) are the strongest predictors of corruption labels. The ensemble achieved strong classification performance (AUC ≈ 0.98 on cross-validation), while the baseline logistic model offered high precision and policy-friendly interpretability. We outline a deployment roadmap for integrating the model into e-procurement workflows (IFMIS/PPRA) with explainable-AI (XAI) dashboards for risk-based audits. The contribution is twofold: a context-aware, reproducible pipeline for low- and middle-income settings, and governance guidance for embedding ML in accountability processes to prevent rather than merely detect procurement corruption. en_US
dc.language.iso en en_US
dc.publisher Science Publishing Group en_US
dc.relation.ispartofseries 2025, Vol. 10, No. 2; pp. 131-136
dc.subject Public Procurement en_US
dc.subject Corruption Detection en_US
dc.subject Machine Learning en_US
dc.subject Cybersecurity en_US
dc.subject Logistic Regression en_US
dc.subject Anomaly Detection en_US
dc.subject Explainable AI en_US
dc.subject Kenya en_US
dc.subject Random Forest en_US
dc.title A Hybrid Machine Learning Model for Detecting and Preventing Corruption in Kenya’s Public Procurement Contracts en_US
dc.type Article en_US
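
Illustrative note on the abstract: the record does not include the study's code or data, so the following is only a minimal sketch of how the red-flag features named in the abstract (single-bid tenders, repeated awards, deviations from estimated costs) might be derived from contract records. The `Contract` fields, the function name `red_flag_features`, the two thresholds, and the sample rows are all hypothetical assumptions, not details from the article.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Contract:
    # All field names are hypothetical, chosen for illustration only.
    contract_id: str
    supplier: str
    bidders: int            # number of bids received
    estimated_cost: float   # pre-tender cost estimate
    awarded_value: float    # final awarded contract value


def red_flag_features(contracts, deviation_threshold=0.25, repeat_threshold=3):
    """Return one dict of binary red-flag indicators per contract.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    # Count how many contracts in this batch went to each supplier.
    awards_per_supplier = Counter(c.supplier for c in contracts)
    rows = []
    for c in contracts:
        # Relative gap between awarded value and estimate (0 if no estimate).
        if c.estimated_cost > 0:
            deviation = abs(c.awarded_value - c.estimated_cost) / c.estimated_cost
        else:
            deviation = 0.0
        rows.append({
            "contract_id": c.contract_id,
            "single_bid": int(c.bidders == 1),
            "repeated_awards": int(awards_per_supplier[c.supplier] >= repeat_threshold),
            "cost_deviation": int(deviation > deviation_threshold),
        })
    return rows


# Made-up sample records, purely for demonstration.
contracts = [
    Contract("C1", "Alpha Ltd", 1, 100.0, 150.0),
    Contract("C2", "Alpha Ltd", 4, 200.0, 210.0),
    Contract("C3", "Alpha Ltd", 1, 50.0, 52.0),
    Contract("C4", "Beta Ltd", 3, 80.0, 79.0),
]
features = red_flag_features(contracts)
for row in features:
    print(row)
```

Indicator rows of this kind could then serve as inputs to the supervised and unsupervised models the abstract mentions; how the authors actually encoded their features is not stated in this record.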

