In the present study, we conducted a systematic analysis of cancer-associated changes in secretome expression to predict candidate biomarkers that could be significantly elevated in the biofluids of individuals with cancer and are therefore more likely to be detectable. We then investigated the patterns and biological functions associated with shifts in secretome expression among different cancer types, focusing on shared "core" secretome behaviors as well as cancer-specific features. The cancer secretome was explored in the context of tissue-specific genes, revealing a general pattern whereby tumor cells reduce their secretory pathway burden in an effort to relieve endoplasmic reticulum (ER) stress and the associated unfolded protein response (UPR). We expect the resulting ranked lists of biomarkers for each of the 32 different cancer types, together with the insight gained from the functional analysis of the cancer secretome and the associated modulation of the secretory pathway in cancer cells, to expedite the development of effective diagnostic biomarkers and illuminate potential strategies for improved anti-cancer therapies.
Evaluation of Secretome Biomarker Candidates
To focus on proteins that are intentionally and actively secreted from the cell, we defined the secretome as all proteins possessing an N-terminal signal peptide and annotated as having a subcellular location of "secreted" (UniProt; Bateman et al., 2017). This yielded a set of 1,816 secretome genes for evaluation.
In our investigation of cancer-specific secretome changes, we first sought to identify secretome genes whose encoded proteins were most likely to exhibit detectable changes in a biofluid as a result of their altered expression in a tumor. Our analysis pipeline therefore involved the comparison of primary tumor transcriptomes with those of (1) paired-normal tissue, (2) healthy tissue corresponding to the cancer tissue of origin, and (3) all healthy tissues in the human body (Figure 1A). Primary tumor and paired-normal RNA-seq profiles were retrieved for 32 cancer types from The Cancer Genome Atlas (TCGA), whereas healthy tissue profiles were obtained from the Genotype-Tissue Expression (GTEx) database (STAR Methods; Table S1).
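The secretome filter and the three tumor-versus-reference comparisons described above can be illustrated with a minimal sketch. The record fields, gene names, and TPM values below are hypothetical, and the median-based log2 fold change with a pseudocount is a simplification assumed here for illustration; it is not the differential expression procedure of the study, which is described in the STAR Methods.

```python
import math
from statistics import median

# Hypothetical annotation records; the field names are assumptions for this
# sketch and do not reflect the UniProt data schema.
proteins = [
    {"gene": "GENE_A", "signal_peptide": True, "location": "secreted"},
    {"gene": "GENE_B", "signal_peptide": True, "location": "membrane"},
    {"gene": "GENE_C", "signal_peptide": False, "location": "secreted"},
]


def secretome_genes(records):
    """Keep genes with an N-terminal signal peptide AND a 'secreted' location."""
    return {
        r["gene"]
        for r in records
        if r["signal_peptide"] and r["location"] == "secreted"
    }


def log2_fold_change(tumor, reference, pseudocount=1.0):
    """Simplified log2 fold change between group medians (assumed measure)."""
    return math.log2((median(tumor) + pseudocount) / (median(reference) + pseudocount))


# Hypothetical TPM values for one secretome gene in one cancer type.
tumor_tpm = [31, 63, 127]
reference_tpm = {
    "paired_normal": [7, 15, 3],         # comparison (1)
    "healthy_origin": [15, 15, 31],      # comparison (2)
    "all_healthy": [31, 31, 31],         # comparison (3)
}

secretome = secretome_genes(proteins)
comparisons = {
    name: log2_fold_change(tumor_tpm, ref) for name, ref in reference_tpm.items()
}
print(secretome)
print(comparisons)
```

In this toy example only GENE_A passes both secretome criteria, and the gene shown is elevated in tumor relative to all three reference groups.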
Generation of a Consensus Score
To integrate information from the three comparisons performed, the results were combined to generate a consensus score for each gene in each cancer type. Top-ranked (high-scoring) genes for each cancer type were those with elevated expression in tumor samples compared to paired-normal tissue, healthy tissue of origin, and all healthy tissues. The complete set of consensus scores for all cancer types, as well as the fold changes (log2FC) and significance values (p values) used to determine the scores, are presented in Table S2.
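The scoring formula itself is not given in this section, so the following is only one possible way to combine three comparisons into a single rank, shown to make the idea concrete. The function name, the significance threshold, the "minimum supported log2FC" rule, and all numbers are assumptions of this sketch and should not be read as the scoring scheme used in the study.

```python
def consensus_score(comparisons, alpha=0.05):
    """Illustrative consensus: the smallest log2FC across comparisons.

    comparisons: list of (log2fc, p_value) tuples, one per comparison.
    A comparison that is not significant at `alpha` contributes 0, so a
    gene scores highly only if it is significantly elevated in ALL of them.
    """
    supported = [fc if p < alpha else 0.0 for fc, p in comparisons]
    return min(supported)


# Hypothetical (log2FC, p value) results per gene, ordered as:
# tumor vs paired-normal, tumor vs healthy tissue of origin, tumor vs all healthy.
results = {
    "GENE_A": [(3.0, 1e-6), (2.0, 1e-4), (1.0, 0.01)],
    "GENE_B": [(4.0, 1e-8), (3.5, 1e-5), (-0.5, 0.2)],
    "GENE_C": [(2.5, 1e-3), (2.2, 0.3), (1.8, 0.02)],
}

scores = {gene: consensus_score(c) for gene, c in results.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranked)
```

Taking the minimum is a deliberately strict choice: GENE_B has the largest fold change against paired-normal tissue but drops in rank because it is not elevated relative to all healthy tissues, which mirrors the requirement that top-ranked genes be elevated in all three comparisons.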
Transcriptomic data of top-ranked genes were examined to confirm their distinct and elevated expression in tumor versus non-tumor samples. T-distributed stochastic neighbor embedding (t-SNE) was performed on tumor, paired-normal tissue, and healthy tissue transcripts per million (TPM) values of the top 10 consensus-ranked genes for each cancer type (Figures 1B and S1). The majority of tumor samples exhibited clear clustering and separation from non-tumor samples, confirming distinct expression profiles between these groups among the highly ranked genes. The t-SNE plots also demonstrate differences between paired-normal tissue and healthy tissue samples, highlighting the importance of including both tumor versus paired-normal tissue and tumor versus healthy tissue comparisons in the consensus rank. Although a difference in data sources (TCGA versus GTEx) could contribute to the observed separation between paired-normal and healthy tissue, a previous analysis of the same two datasets found robust differences even after normalizing for potential batch effects (Aran et al., 2017), thus supporting a biological component.