2026/2/8
Hossein Sadeghi

Academic rank: Professor
ORCID: https://orcid.org/0000-0002-8772-951X
Education: PhD.
H-Index:
Faculty: Science
ScholarId:
E-mail: h-sadeghi [at] araku.ac.ir
ScopusId: View
Phone:
ResearchGate:

Research

Title
Machine Learning for Early Prediction of Secondary Cancer After Radiotherapy
Type
JournalPaper
Keywords
Secondary cancer, Early prediction, Machine learning, Random forest regressor
Year
2025
Journal
Scientific Reports
DOI
Researchers
Hossein Sadeghi, Fatemeh Seif

Abstract

Secondary cancers (SCs) following radiotherapy (RT) represent a significant long-term risk for cancer survivors, necessitating accurate predictive models for early intervention. This study developed a machine learning (ML) model integrating clinical, pathological, and genomic data to predict SC incidence. The model leverages a dataset of 1,240 patients from population-based registries and clinical cohorts, incorporating features such as radiation dose, age at exposure, histology, and mutations (e.g., TP53, BRCA1/2). A Random Forest (RF) regressor achieved near-perfect performance metrics (MSE = 0.002, R-squared = 0.98), with radiation dose (Gini importance = 0.42) and age at exposure (Gini importance = 0.38) identified as the most critical predictors. Predicted incidence rates for new patients, such as 15.2 per 10,000 for breast-to-lung SCs, are consistent with epidemiological trends. The model's strong performance suggests clinical utility for early detection and for estimating SC risk in new patients. This study highlights the potential of ML in personalized oncology while emphasizing caution in interpreting overly optimistic metrics.
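
The workflow summarized in the abstract (fit a Random Forest regressor, report MSE and R-squared, rank features by importance) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' code or dataset: the feature names, the coefficients used to generate the target, and the hyperparameters are all assumptions made for the example. Note also that for regression trees scikit-learn's `feature_importances_` is an impurity-based (variance-reduction) measure, which is what is often loosely called "Gini importance".

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1240  # cohort size quoted in the abstract; the rows below are synthetic

# Hypothetical feature names, chosen only for illustration.
feature_names = [
    "radiation_dose_gy",
    "age_at_exposure",
    "histology_code",
    "tp53_mutation",
]
X = np.column_stack([
    rng.uniform(0, 60, n),   # radiation dose
    rng.uniform(20, 80, n),  # age at exposure
    rng.integers(0, 4, n),   # histology category (label-encoded)
    rng.integers(0, 2, n),   # mutation present (0/1)
])

# Assumed generating process: the target depends mostly on dose, then age.
y = (
    0.25 * X[:, 0]
    + 0.10 * (80 - X[:, 1])
    + 1.0 * X[:, 3]
    + rng.normal(0, 0.5, n)
)

# Hold out a test set so the metrics reflect unseen patients.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Impurity-based importances; they sum to 1 across features.
importances = dict(zip(feature_names, model.feature_importances_))

print(f"MSE = {mse:.3f}, R-squared = {r2:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.2f}")
```

Evaluating on a held-out split, as above, is one simple guard against the overly optimistic metrics the abstract cautions about. Scores computed on the training rows themselves would typically look better than they should.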