Ensemble machine learning model for predicting used car prices in Kenya.

Moses, Onserio James

dc.contributor.author	Moses, Onserio James
dc.date.accessioned	2026-05-28T12:27:02Z
dc.date.available	2026-05-28T12:27:02Z
dc.date.issued	2025
dc.identifier.uri	https://repository.cuk.ac.ke/handle/123456789/1945
dc.description	A project submitted to the Department of Computer Science and Information Technology in the School of Computing and Mathematics in partial fulfillment of the requirements for the award of the degree of Master of Science in Data Science of the Co-operative University of Kenya.	en_US
dc.description.abstract	Most Kenyan car owners prefer used vehicles because they are affordable, which has produced a booming used car market. However, the absence of an objective pricing mechanism has led to inconsistent and subjective pricing, with prices varying significantly from seller to seller. This research aimed to provide a data-driven pricing solution based on key vehicle attributes. Using the Design Science Research (DSR) methodology, the research trained five base models, Random Forest (RF), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Gradient Boosting (GB), and Linear Regression, and used permutation importance for feature explanation to enhance accuracy and interpretability. The individual models were trained and evaluated using 5-fold cross-validation. Random Forest performed best, with a mean absolute error (MAE) of 0.1174, and Linear Regression performed worst, with an MAE of 0.2635. To optimize performance, the four best base models (RF, SVM, KNN, and GB) were combined using a Stacking Regressor, which achieved an R-squared score of 0.9725, an MAE of 0.1137, and a mean squared error (MSE) of 0.2171, an improvement in predictive performance over the individual models. Feature importance analysis identified mileage, car age, annual insurance, engine size, and usage type (Kenyan/Foreign) as the most influential features. These findings are significant because they demonstrate that machine learning can provide an objective, reliable, and data-driven pricing mechanism for the Kenyan used car market. The results offer practical value to car buyers, sellers, dealerships, and insurance companies by reducing pricing disparities, improving transparency, and supporting more informed decision-making. The developed ensemble model can therefore be applied as a practical tool for accurate price estimation, contributing to fairness, efficiency, and better market regulation in Kenya’s used car industry.	en_US
dc.language.iso	en	en_US
dc.title	Ensemble machine learning model for predicting used car prices in Kenya.	en_US
dc.type	Thesis	en_US
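
The pipeline summarized in the abstract (four base regressors combined by a stacking regressor, 5-fold cross-validation, and permutation importance) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the thesis code: the synthetic data, the three feature names, the hyperparameters, the feature scaling for SVM and KNN, and the meta-learner (scikit-learn's default, since the abstract does not name one) are all assumptions.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): the thesis dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 300
mileage = rng.uniform(10, 250, n)    # thousands of km (hypothetical unit)
car_age = rng.uniform(1, 20, n)      # years
engine_size = rng.uniform(1.0, 4.0, n)  # litres
feature_names = ["mileage", "car_age", "engine_size"]
X = np.column_stack([mileage, car_age, engine_size])
# Hypothetical scaled price target with a little noise.
y = 3.0 - 0.004 * mileage - 0.05 * car_age + 0.3 * engine_size
y = y + rng.normal(0, 0.05, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The four base models named in the abstract; SVM and KNN are wrapped
# with a scaler because both are distance-based (an assumption here).
base_models = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVR())),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))),
    ("gb", GradientBoostingRegressor(random_state=0)),
]

# cv=5 builds the meta-learner's training inputs from 5-fold
# out-of-fold predictions. final_estimator is left at the
# scikit-learn default (RidgeCV).
stack = StackingRegressor(estimators=base_models, cv=5)
stack.fit(X_train, y_train)

pred = stack.predict(X_test)
mae = mean_absolute_error(y_test, pred)
r2 = r2_score(y_test, pred)

# Permutation importance: drop in score when one feature is shuffled.
perm = permutation_importance(
    stack, X_test, y_test, n_repeats=5, random_state=0
)
ranking = sorted(
    zip(feature_names, perm.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"MAE={mae:.4f}  R2={r2:.4f}")
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

The scores printed by this sketch come from synthetic data and are not comparable to the figures reported in the abstract.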