MACHINE LEARNING BASED CERVICAL CANCER DETECTION MODEL IN WESTERN KENYA

MURERE, JOHN (2025)
Type
Thesis

Cervical cancer is the leading cause of cancer-related deaths among Kenyan women, with approximately 3,200 deaths reported annually, driven mainly by low screening uptake (16%) and late diagnosis. The aim of this study was to develop a machine learning-based model to enhance the detection of cervical cancer in Western Kenya, a region with limited healthcare resources. The study used a cross-sectional design in which data on demographic, reproductive, and clinical characteristics were collected from 968 women at health facilities. Of these women, 93.7% (n = 907) had no biopsy-confirmed abnormalities, while 6.3% (n = 61) had abnormalities. Five machine learning models (Logistic Regression, Random Forest, Decision Tree, Support Vector Machine, and Artificial Neural Network) were trained on 70% of the data (training set) and tested on the remaining 30% (testing set). The Random Forest model achieved the highest accuracy (94.33%) and specificity (98.37%), outperforming the other models and traditional methods such as human papillomavirus (HPV) testing (70-80% specificity) and the Pap smear (>90% specificity) for confirming negative cancer cases. The Logistic Regression model had the highest sensitivity among the five models (70%), which makes it the more suitable model for initial cervical cancer screening; this sensitivity was comparable to that of the Pap smear (60-95%) but lower than that of HPV testing (greater than 90%). Pap smear results and use of hormonal contraceptives emerged as the key significant predictors of cervical cancer, which supports targeted screening strategies. The findings confirmed a significant difference in performance between the models, partial superiority over existing methods, and the influence of key cervical cancer risk factors.
Combining a Random Forest model for confirmation with a Logistic Regression model for screening could further optimize cervical cancer screening in the resource-constrained Western Kenya setting. This study underscores the potential of machine learning in addressing cervical cancer disparities in Western Kenya, with implications for both public and private health interventions and for future research.
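To make the reported metrics concrete, the following minimal Python sketch shows how accuracy, sensitivity, and specificity are computed from binary labels. It is an illustration only and not the thesis code: the function names `confusion_counts` and `screening_metrics` are hypothetical, and the "always negative" predictor is an assumed reference point. Only the class counts (907 negative, 61 positive) come from the abstract.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = abnormal and 0 = normal."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Class counts taken from the abstract: 907 women without and 61 with
# biopsy-confirmed abnormalities.
y_true = [0] * 907 + [1] * 61

# Hypothetical reference predictor (an assumption for illustration, not a
# model from the study): it labels every woman as negative.
y_pred_all_negative = [0] * len(y_true)

baseline = screening_metrics(y_true, y_pred_all_negative)
print(baseline)
```

On a cohort this imbalanced, the three metrics can diverge sharply for the same predictor, which is presumably why the abstract reports sensitivity and specificity alongside accuracy and proposes different models for screening and for confirmation.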

Publisher
University of Eldoret

The following license files are associated with this item:

Attribution-NonCommercial-NoDerivs 3.0 United States