We have now developed a novel framework for the prediction of the prognosis of breast cancer patients (Fig. 1). We first examined all protein-coding genes for their relation to OS in breast cancer patients. We then validated 184 prognosis-related genes by meta-analysis of one of the largest breast cancer cohorts ever assembled. We next applied artificial intelligence (AI)–based methods to develop mPS, a universal molecular prognostic score that is able to precisely predict OS and disease-free survival (DFS) of breast cancer patients on the basis of the binary expression status of only 23 genes. Unlike existing tools, mPS was found to be applicable to almost all breast cancer subtypes. We also show that mPS can stratify patients even at the same clinical stage, emphasizing the importance of the combination of mPS with conventional staging systems.
Please cite this article as: H. Shimizu and K.I. Nakayama, A 23 gene–based molecular prognostic score precisely predicts overall survival of breast cancer pati..., EBioMedicine, https://doi.org/10.1016/j.ebiom.2019.07.046
Research in context
Evidence before this study
Cancer is the leading cause of death in developed countries, with methods to better stratify susceptible individuals being actively pursued. Recent technological advances have allowed us to develop various molecular prognostic indicators for cancer. However, such indicators for breast cancer are not universal, given that they are restricted to specific platforms and subsets of patients based on criteria such as hormone receptor, menopause and nodal status.
Added value of this study
We integrated statistical and artificial intelligence (AI)–based methods to develop mPS, a universal molecular prognostic score that is able to precisely predict overall survival (OS) and disease-free survival of breast cancer patients on the basis of the binary expression status of only 23 genes.
Implications of all the available evidence
We have revealed all OS-related genes for breast cancer, with these genes being potential drug targets. We also developed an AI-based prognosis-prediction score that is applicable to almost all subsets of breast cancer patients. We anticipate that this unbiased approach will not only facilitate appropriate treatment selection for breast cancer patients but also provide molecular insight into the complex nature of this disease.
2. Materials and methods
2.1. Study design and cohorts
We performed a retrospective integrated analysis of 40 independent breast cancer cohorts, all published previously. The initial analysis was conducted with The Cancer Genome Atlas (TCGA) breast cancer cohort (discovery cohort), given that this is the best-characterized cohort available. We then performed a meta-analysis (random effects model) to validate the identified prognosis-related genes in a large combined multicenter validation cohort consisting of 36 international breast cancer data sets (Supplementary Table S1) that include 5696 patients with early-stage (IA, IIA, IIB) breast cancer (Fig. 1, Step 1), as previously described.
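As an illustration of how a random effects meta-analysis pools per-cohort results for a single gene, the following minimal sketch combines log hazard ratios across cohorts. It assumes the DerSimonian-Laird estimator of between-cohort variance, which is one common choice; the text above specifies only a random effects model, so the estimator, the function name `random_effects_meta`, and the input format are assumptions made for illustration and are not taken from the study.

```python
import math


def random_effects_meta(log_hrs, std_errs):
    """Pool per-cohort log hazard ratios (hypothetical helper).

    Uses the DerSimonian-Laird estimator, an assumed choice; the source
    states only that a random effects model was applied.
    """
    if len(log_hrs) != len(std_errs) or len(log_hrs) < 2:
        raise ValueError("need matching inputs from at least two cohorts")

    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errs]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    df = len(log_hrs) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Between-cohort variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to each within-cohort variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errs]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)

    return {
        "log_hr": pooled,
        "se": se_pooled,
        "tau2": tau2,
        "hr": math.exp(pooled),
        "ci95": (
            math.exp(pooled - 1.96 * se_pooled),
            math.exp(pooled + 1.96 * se_pooled),
        ),
    }
```

In such a scheme, each validation data set would contribute one log hazard ratio and its standard error for the gene in question, and a gene would be retained if the pooled confidence interval excludes a hazard ratio of 1. The retention rule stated here is likewise an illustrative assumption.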
For establishment of the molecular prognostic score (mPS), we adopted another breast cancer data set, the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) cohort [19,20]. We used half of the METABRIC cohort as the source of a training set (METABRIC training cohort) for AI-based machine learning (Fig. 1, Step 2) and neural network methods (Fig. 1, Step 3) to develop mPS.
We then validated mPS with the other half of the METABRIC cohort (METABRIC test cohort). We also used two independent breast cancer cohorts (the microarray-based public data set GSE86166 and the RNA-sequencing–based ongoing data set GSE96058) for further validation of mPS.
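The division of a single cohort into training and test halves, as described above for METABRIC, can be sketched as follows. The text does not state how patients were assigned to each half, so the seeded random shuffle, the function name `split_cohort_in_half`, and the patient identifiers are all assumptions made for illustration.

```python
import random


def split_cohort_in_half(patient_ids, seed=0):
    """Split patient IDs into two disjoint halves (hypothetical helper).

    Random assignment with a fixed seed is an assumption; the source
    says only that half of the cohort was used for training.
    """
    # Sort the unique IDs first so the result depends only on the seed,
    # not on the order in which the IDs were supplied.
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    mid = len(ids) // 2
    return ids[:mid], ids[mid:]


# Example with made-up identifiers (not real METABRIC IDs).
example_ids = [f"PATIENT-{i:04d}" for i in range(10)]
training_ids, test_ids = split_cohort_in_half(example_ids, seed=42)
```

Keeping the two halves disjoint and fixing the seed lets the test half serve as a reproducible held-out set, separate from the fully independent external cohorts.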