| dc.description.abstract |
Most Kenyan car owners prefer used vehicles due to their affordability, leading to a booming used car market. However, the absence of an objective pricing mechanism has led to inconsistent and subjective pricing, with prices varying significantly from seller to seller. This research aimed to provide a data-driven solution by incorporating key vehicle attributes. Using Design Science Research (DSR) methodology, the research implemented machine learning techniques: Random Forest (RF), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Gradient Boosting, Linear regression as base models, and Permutation for feature explanation to enhance accuracy and interpretability. The individual models were trained and evaluated using 5 cross-validation. Random Forest emerged as the best with a Mean Absolute Error of 0.1174, and Linear regression was the last with a Mean Absolute Error of 0.2635. For performance optimization, the four best baseline models (RF, SVM, KNN, and GB) were combined using a Stacking Regressor, which achieved an R-squared score of 0.9725, a mean absolute error (MAE) of 0.1137, and a mean squared error (MSE) of 0.2171, showing an improved predictive performance compared to individual models. Feature importance analysis identified mileage, car age, annual insurance, engine size, and usage type (Kenyan/Foreign) as the most influential features. These findings are significant because they demonstrate that machine learning can provide an objective, reliable, and data-driven pricing mechanism for the Kenyan used car market. The results offer practical value to car buyers, sellers, dealerships, and insurance companies by reducing pricing disparities, improving transparency, and supporting more informed decision-making. The developed ensemble model can therefore be applied as a practical tool for accurate price estimation, contributing to fairness, efficiency, and better market regulation in Kenya’s used car industry. |
en_US |