Diagnostic Accuracy of Different Machine Learning Algorithms for Breast Cancer Risk Calculation: a Meta-Analysis

Document Type: Systematic Review and Meta-analysis

Authors

1 Doctoral Program, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta City, Indonesia.

2 Department of Public Health, Faculty of Medicine, Universitas Andalas, Padang City, Indonesia.

3 Department of Surgery, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta City, Indonesia.

4 Department of Health Policy and Management, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta City, Indonesia.

5 Department of Pharmacology and Therapy, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta City, Indonesia.

Abstract

Objective: The aim of this study was to determine the diagnostic accuracy of different machine learning algorithms
for breast cancer risk calculation. Methods: A meta-analysis was conducted of published research articles on diagnostic
test accuracy of different machine learning algorithms for breast cancer risk calculation published between January 2000
and May 2018 in the online article databases of PubMed, ProQuest and EBSCO. Paired forest plots were employed for
the analysis. Sensitivity and specificity were calculated from the false negative (FN), false positive (FP),
true negative (TN) and true positive (TP) counts and presented alongside graphical representations, with boxes marking the
point estimates and horizontal lines showing the confidence intervals (CIs). Summary receiver operating characteristic (SROC)
curves were applied to assess the performance of diagnostic tests. Data were processed using Review Manager 5.3
(RevMan 5.3). Results: A total of 1,879 articles were reviewed, of which 11 were selected for systematic review and
meta-analysis. Five machine learning algorithms able to predict breast cancer risk were identified: Support Vector
Machine (SVM); Artificial Neural Networks (ANN); Decision Tree (DT); Naive Bayes (NB); and K-Nearest Neighbor
(KNN). For the SVM, the area under the curve (AUC) of the SROC was > 90%, which places it in the excellent
category. Conclusion: The meta-analysis confirmed that the SVM algorithm is able to calculate breast cancer risk with
higher accuracy than the other machine learning algorithms.
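
As an illustration of how sensitivity and specificity follow from the TP, FP, FN and TN counts of a single study, the calculation can be sketched in Python as below. This is a minimal sketch, not the procedure implemented in RevMan 5.3: the function names, the example counts and the choice of the Wilson score interval for the 95% CIs are all assumptions made for illustration.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method, z=1.96 for 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity, each with a CI, from one study's 2x2 table."""
    sensitivity = tp / (tp + fn)  # proportion of true cases detected
    specificity = tn / (tn + fp)  # proportion of non-cases correctly ruled out
    return {
        "sensitivity": sensitivity,
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": specificity,
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for illustration only; not taken from any included study.
example = diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80)
print(example)
```

Each study in a paired forest plot contributes one such pair of estimates, with the CI drawn as the horizontal line around the point estimate.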
