Comparison of Classification Success Rates of Different Machine Learning Algorithms in the Diagnosis of Breast Cancer

Ozcan, Irem; Aydin, Hakan; Cetinkaya, Ali

doi:10.31557/APJCP.2022.23.10.3287

Document Type: Research Article

Authors

¹ Department of Computer Engineering, Faculty of Engineering and Architecture, Istanbul Gelisim University, Istanbul, Turkey.

² Department of Computer Engineering, Faculty of Engineering, Istanbul Topkapı University, Istanbul, Turkey.

³ Department of Electronics Technology, Istanbul Gelisim Vocational School, Istanbul Gelisim University, Istanbul, Turkey.

Abstract

Objective: To identify which machine learning (ML) algorithms most accurately predict and diagnose breast cancer. Methods: The “College of Wisconsin Breast Cancer Dataset”, which consists of 569 samples and 30 features, was classified using the Support Vector Machine (SVM), Naive Bayes (NB), Random Forest (RF), Decision Tree (DT), K-Nearest Neighbor (KNN), Logistic Regression (LR), Multilayer Perceptron (MLP), Linear Discriminant Analysis (LDA), XGBoost (XGB), AdaBoost (ABC) and Gradient Boosting (GBC) ML algorithms. The dataset was preprocessed before classification. Sensitivity, accuracy, and definiteness metrics were used to measure the success of the methods. Results: Among the ML algorithms used in the study, GBC was the most successful in classifying tumors, with an accuracy of 99.12%. XGB was the least accurate, with an accuracy of 88.10%. In addition, the general accuracy rates of the 11 ML algorithms used in the study varied between 88% and 95%. Conclusion: The results obtained from the ML classifiers used in the study demonstrate the efficiency of the GBC algorithm in classifying tumors. The success rates obtained from the 11 ML algorithms may also be valuable for predicting other cancer types.
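
The evaluation metrics named in the Methods can be illustrated with a minimal sketch. It assumes a binary task in which malignant is the positive class (label 1) and derives each metric from confusion-matrix counts. The names `ConfusionCounts`, `count_outcomes` and `metrics`, as well as the label lists, are hypothetical and are not taken from the study. Because the abstract does not define "definiteness", both specificity and precision are computed here as possible readings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; malignant is assumed to be the positive class."""
    tp: int  # malignant cases predicted as malignant
    tn: int  # benign cases predicted as benign
    fp: int  # benign cases predicted as malignant
    fn: int  # malignant cases predicted as benign


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells from true and predicted labels."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, tn=tn, fp=fp, fn=fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics(c):
    """Compute the evaluation metrics from confusion-matrix counts."""
    total = c.tp + c.tn + c.fp + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),      # share of all cases classified correctly
        "sensitivity": _ratio(c.tp, c.tp + c.fn),    # share of malignant cases detected
        "specificity": _ratio(c.tn, c.tn + c.fp),    # share of benign cases detected
        "precision": _ratio(c.tp, c.tp + c.fp),      # share of malignant predictions that are correct
    }


# Hypothetical labels for illustration only (1 = malignant, 0 = benign).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = count_outcomes(y_true, y_pred)
scores = metrics(counts)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In a comparison such as the one described, the same function would be applied to the test-set predictions of each classifier, so that all 11 algorithms are scored on identical cases.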

Keywords