Evaluation of risk factors and survival rates of patients with early-stage breast cancer with machine learning and traditional methods

Özgür, EG, Ulgen, A (ORCID: https://orcid.org/0000-0002-0872-667X), Uzun, S and Bekiroğlu, GN, 2024. Evaluation of risk factors and survival rates of patients with early-stage breast cancer with machine learning and traditional methods. International Journal of Medical Informatics, 190: 105548. ISSN 1386-5056

1915172_a2784_Ulgen.pdf - Post-print
Full-text access embargoed until 11 July 2025.

Abstract

Background

This article aims to identify prognostic factors and to compare prediction methods, namely Cox proportional hazards (CPH) regression, the Accelerated Failure Time (AFT) model and several machine learning techniques, for estimating post-treatment survival probabilities from the clinical presentation and pathological information of early-stage breast cancer patients.

Material and methods

The study was carried out in three stages. In the first stage, the CPH method was applied; in the second, the AFT model; and in the last, machine learning methods. The data set consists of 697 breast cancer patients who presented to the Marmara University Hospital oncology clinic between 01.01.1994 and 31.12.2009. Models built from various patient parameters were compared on the C index, 5-year survival rate and 10-year survival rate.
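The C index used for this comparison can be illustrated with a minimal sketch of Harrell's concordance index. This is not the authors' code: the function name `concordance_index`, its arguments and the toy data are assumptions made for illustration, and pairs with tied observed times are simply skipped to keep the example short. The function returns a value between 0 and 1; the abstract reports the same quantity on a 0 to 100 scale.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    times       -- observed follow-up times
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # simplification: tied times are skipped
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so each model's risk predictions can be scored this way and compared on a common scale.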

Results and conclusion

According to the fitted models, MetLN and age were identified as significant risk factors by the CPH and AFT methods, while MetLN, age, tumor size, LV1 and extracapsular involvement were identified as risk factors by the machine learning methods. The c-index values of the resulting models were 69.8 for the CPH model, 70.36 for the AFT model, 72.1 for the random survival forest and 72.8 for the gradient boosting machine. In conclusion, the study highlights the potential of comparing conventional statistical methods and machine-learning algorithms to improve the precision of risk factor determination in early-stage breast cancer prognosis. Additionally, efforts should be made to enhance the interpretability of machine-learning models, so that the results can be effectively communicated to and used by clinical practitioners. This would enable more informed decision-making and personalized care in the treatment and follow-up of early-stage breast cancer patients.

Item Type: Journal article
Publication Title: International Journal of Medical Informatics
Creators: Özgür, E.G., Ulgen, A., Uzun, S. and Bekiroğlu, G.N.
Publisher: Elsevier
Date: October 2024
Volume: 190
ISSN: 1386-5056
Identifiers:
DOI: 10.1016/j.ijmedinf.2024.105548
Publisher Item Identifier: S1386505624002119
Other: 1915172
Rights: © 2024 Published by Elsevier B.V. This accepted manuscript is shared under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License.
Divisions: Schools > School of Science and Technology
Record created by: Melissa Cornwell
Date Added: 26 Jul 2024 07:44
Last Modified: 31 Jul 2024 08:43
URI: https://irep.ntu.ac.uk/id/eprint/51828
