Six identified immune subtypes span cancer tissue types and molecular subtypes
Immune subtypes differ by somatic aberrations, microenvironment, and survival
Multiple control modalities of molecular networks affect tumor-immune interactions
These analyses serve as a resource for exploring immunogenicity across cancer types
We performed an extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA. Across cancer types, we identified six immune subtypes—wound healing, IFN-γ dominant, inflammatory, lymphocyte depleted, immunologically quiet, and TGF-β dominant—characterized by differences in macrophage or lymphocyte signatures, Th1:Th2 cell ratio, extent of intratumoral heterogeneity, aneuploidy, extent of neoantigen load, overall cell proliferation, expression of immunomodulatory genes, and prognosis. Specific driver mutations correlated with lower (CTNNB1, NRAS, or IDH1) or higher (BRAF, TP53, or CASP8) leukocyte levels across all cancers. Multiple control modalities of the intracellular and extracellular networks (transcription, microRNAs, copy number, and epigenetic processes) were involved in tumor-immune cell interactions, both across and within immune subtypes. Our immunogenomics pipeline to characterize these heterogeneous tumors and the resulting data are intended to serve as a resource for future targeted studies to further advance the field.
The Cancer Genome Atlas (TCGA) has profoundly illuminated the genomic landscape of human malignancy. Genomic and transcriptomic data derived from bulk tumor samples have been used to study the tumor microenvironment (TME), and measures of immune infiltration define molecular subtypes of ovarian, melanoma, and pancreatic cancer (Bailey et al., 2016, The Cancer Genome Atlas Network, 2015, The Cancer Genome Atlas Research Network, 2011) and immune gene expression in other tumors varies by molecular subtype (Iglesia et al., 2016). Characterization of the immune microenvironment using gene expression signatures, T cell receptor (TCR), and B cell receptor (BCR) repertoire, and analyses to identify neo-antigenic immune targets provide a wealth of information in many cancer types and have prognostic value (Bindea et al., 2013, Brown et al., 2014, Brown et al., 2015, Charoentong et al., 2017, Gentles et al., 2015, Iglesia et al., 2016, Li et al., 2016, Porta-Pardo and Godzik, 2016, Rooney et al., 2015).
Contemporaneous with the work of TCGA, cancer immunotherapy has revolutionized cancer care. Antibodies against CTLA-4, PD-1, and PD-L1 are effective in treating a variety of malignancies. However, the biology of the immune microenvironment driving these responses is incompletely understood (Hugo et al., 2016, McGranahan et al., 2016) but is critical to the design of immunotherapy treatment strategies.
We integrated major immunogenomics methods to characterize the immune tumor microenvironment (TME) across 33 cancers analyzed by TCGA, applying methods for the assessment of total lymphocytic infiltrate (from genomic and H&E image data), immune cell fractions from deconvolution analysis of mRNA-seq data, immune gene expression signatures, neoantigen prediction, TCR and BCR repertoire inference, viral RNA expression, and somatic DNA alterations (Table S1). Transcriptional regulatory networks and extracellular communication networks that may govern the TME were found, as were possible germline determinants of TME features, and prognostic models were developed.
Through this approach, we identified and characterized six immune subtypes spanning multiple tumor types, with potential therapeutic and prognostic implications for cancer management. All data and results are provided in Supplemental Tables, at the NCI Genomic Data Commons (GDC, ), and though the Cancer Research Institute iAtlas portal for interactive exploration and visualization (), and are intended to serve as a resource for future studies in the field of immunogenomics.
To characterize the immune response to cancer in all TCGA tumor samples, identify common immune subtypes, and evaluate whether tumor-extrinsic features can predict outcomes, we analyzed the TME across the landscape of all TCGA tumor samples. First, source datasets from all 33 TCGA cancer types and six molecular platforms (mRNA, microRNA, and exome sequencing; DNA methylation-, copy number-, and reverse-phase protein arrays) were harmonized by the PanCanAtlas consortium for uniform quality control, batch effect correction, normalization, mutation calling, and curation of survival data (Ellrott et al., 2018, Liu et al., 2018). We then performed a series of analyses, which we summarize here and describe in detail in the ensuing manuscript sections as noted within parentheses. We first compiled published tumor immune expression signatures and scored these across all non-hematologic TCGA cancer types. Meta-analysis of subsequent cluster analysis identified characteristic immunooncologic gene signatures, which were then used to cluster TCGA tumor types into six groups, or subtypes (described in Immune Subtypes in Cancer). Leukocyte proportion and cell type were then defined from DNA methylation, mRNA, and image analysis (see Composition of the Tumor Immune Infiltrate). Survival modeling was performed to assess how immune subtypes associate with patient prognosis (see Prognostic Associations of Tumor Immune Response Measures). Neoantigen prediction and viral RNA expression (see Survey of Immunogenicity), TCR and BCR repertoire inference (see The Adaptive Immune Receptor Repertoire in Cancer), and immunomodulator (IM) expression and regulation (see Regulation of Immunomodulators) were characterized in the context of TCGA tumor types, TCGA-defined molecular subtypes, and these six immune subtypes, so as to assess the relationship between factors affecting immunogenicity and immune infiltrate. In order to assess the degree to which specific underlying somatic alterations (pathways, copy-number alterations, and driver mutations) may drive the composition of the TME, we identified which alterations correlate with modified immune infiltrate (see Immune Response Correlates of Somatic Variation). We likewise asked whether gender and ancestry predispose individuals to particular tumor immune responses (see Immune Response Correlates of Demographic and Germline Variation). Finally, we sought to identify the underlying intracellular regulatory networks governing the immune response to tumors, as well as the extracellular communication networks involved in establishing the particular immune milieu of the TME (see Networks Modulating Tumoral Immune Response).
Immune Subtypes in Cancer
To characterize intratumoral immune states, we scored 160 immune expression signatures and used cluster analysis to identify modules of immune signature sets (Figure 1A, top). Five immune expression signatures—macrophages/monocytes (Beck et al., 2009), overall lymphocyte infiltration (dominated by T and B cells) (Calabro et al., 2009), TGF-β response (Teschendorff et al., 2010), IFN-γ response (Wolf et al., 2014), and wound healing (Chang et al., 2004)—which robustly reproduced co-clustering of these immune signature sets, were selected to perform cluster analysis of all 30 non-hematologic cancer types (Figures 1A middle, and S1A). The six resulting clusters “Immune Subtypes,” C1–C6 (with 2,416, 2,591, 2,397, 1,157, 385, and 180 cases, respectively) were characterized by a distinct distribution of scores over the five representative signatures (Figure 1A, bottom) and showed distinct immune signatures based on the dominant sample characteristics of their tumor samples (Figures 1B and 1C). Immune subtypes spanned anatomical location and tumor type, while individual tumor types and TCGA subtypes (Figures 1D and S1B–S1D) varied substantially in their proportion of immune subtypes.
C1 (wound healing) had elevated expression of angiogenic genes, a high proliferation rate (Figure 1C), and a Th2 cell bias to the adaptive immune infiltrate. Colorectal cancer (COAD [colon adenocarcinoma], READ [rectum adenocarcinoma]) and lung squamous cell carcinoma (LUSC) were rich in C1, as were breast invasive carcinoma (BRCA) luminal A (Figures S1C and S1D), head and neck squamous cell carcinoma (HNSC) classical, and the chromosomally unstable (CIN) gastrointestinal subtype.
C2 (IFN-γ dominant) had the highest M1/M2 macrophage polarization (Figure S2A, mean ratio = 0.52, p < 10−149, Wilcoxon test relative to next-highest), a strong CD8 signal and, together with C6, the greatest TCR diversity. C2 also showed a high proliferation rate, which may override an evolving type I immune response, and was comprised of highly mutated BRCA, gastric, ovarian (OV), HNSC, and cervical tumors (CESC).
C3 (inflammatory) was defined by elevated Th17 and Th1 genes (Figure 1C, both p < 10−23), low to moderate tumor cell proliferation, and, along with C5, lower levels of aneuploidy and overall somatic copy number alterations than the other subtypes. C3 was enriched in most kidney, prostate adenocarcinoma (PRAD), pancreatic adenocarcinoma (PAAD), and papillary thyroid carcinomas (THCA).
C4 (lymphocyte depleted) was enriched in particular subtypes of adrenocortical carcinoma (ACC), pheochromocytoma and paraganglioma (PCPG), liver hepatocellular carcinoma (LIHC), and gliomas, and displayed a more prominent macrophage signature (Figure 2A), with Th1 suppressed and a high M2 response (Figure S2A).
C5 (immunologically quiet), consisted mostly of brain lower-grade gliomas (LGG) (Figures 1D and S1B), exhibited the lowest lymphocyte (p < 10−17) and highest macrophage (p < 10−7) responses (Figure 2A), dominated by M2 macrophages (Figure S2A). Glioma subtypes (Ceccarelli et al., 2016) CpG island methylator phenotype-high (CIMP-H), the 1p/19q codeletion subtype and pilocytic astrocytoma-like (PA-like) were prevalent in C5, with remaining subtypes enriched in C4. IDH mutations were enriched in C5 over C4 (80% of IDH mutations, p < 2 × 10−16, Fisher’s exact test), suggesting an association of IDH mutations with favorable immune composition. Indeed, IDH mutations associate with TME composition (Venteicher et al., 2017) and decrease leukocyte chemotaxis, leading to fewer tumor-associated immune cells and better outcome (Amankulor et al., 2017).
Finally, C6 (TGF-β dominant), which was a small group of mixed tumors not dominant in any one TCGA subtype, displayed the highest TGF-β signature (p < 10−34) and a high lymphocytic infiltrate with an even distribution of type I and type II T cells.
These six categories represent features of the TME that largely cut across traditional cancer classifications to create groupings and suggest certain treatment approaches may be independent of histologic type. For a complete list of the TCGA cancer type abbreviations, please see .
Composition of the Tumor Immune Infiltrate
Leukocyte fraction (LF) varied substantially across immune subtypes (Figure 1C) and tumor types (Figure 2B). Tumors within the top third LF included cancers most responsive to immune checkpoint inhibitors, such as lung adenocarcinoma (LUAD), LUSC, cutaneous melanoma (SKCM), HNSC, and kidney renal clear cell carcinoma (KIRC), and in particular, the LUSC.secretory, LUAD.6, bladder urothelial carcinoma (BLCA.4), kidney renal papillary cell carcinoma (KIRP.C2a), and HNSCC mesenchymal subtypes. Uveal melanoma (UVM) and ACC had very low LF. Glioma subtypes displayed a greater range in LF than other tumors, which may reflect the presence or absence of microglia.
The leukocyte proportion of tumor stromal fraction, ρ, varied across tumor types and immune subtypes (Figures 2C and S2B), ranging from >90% in SKCM to <10% in stroma-rich tumors such as PAAD, PRAD, and LGG. Some tumors, e.g., BRCA, showed variation within annotated or immune subtypes. In BRCA, C1 has the lowest ρ (ρC1 = 0.44) while ρC2 = 0.61 was 37% higher (p < 0.001) (Figure S2B), and there were likewise differences between luminal A and basal BRCA (ρLumA = 0.45 and ρBasal = 0.67 [p < 0.001]). For LGG, ρC5 = 0.28 (p < 0.001), whereas ρC3 = 0.48 and ρC4 = 0.50 (p < 0.001) (Figure S2B), and in READ, ρCIN = 0.40 and ρMSI = 0.78 (p < 0.001).
The spatial fraction of tumor regions with tumor-infiltrating lymphocytes (TILs), estimated by analysis of digitized TCGA H&E-stained slides (Saltz et al., 2018), varied by immune subtype, with C2 the highest (p < 10−16, Figure 2D). Image estimates correlated modestly with molecular estimates of lymphocyte proportion (Figures S2C and S2D), in part because the molecular estimate is more similar to cell count, while spatial TIL is a fraction of the area. The relative similarity of the estimates of lymphocytic content between two radically different methodologies reinforces the robustness of individual methods.
Prognostic Associations of Tumor Immune Response Measures
Immune subtypes associated with overall survival (OS) and progression-free interval (PFI) (Figures 3A and S3A). C3 had the best prognosis (OS HR 0.628, p = 2.34 × 10−8 relative to C1, adjusted for tumor type), while C2 and C1 had less favorable outcomes despite having a substantial immune component. The more mixed-signature subtypes, C4 and C6, had the least favorable outcome. Functional orientation of the TME for tumor and immune subtypes was measured using the concordance index (CI) (Pencina and D’Agostino, 2004) and found to have context-dependent prognostic impact (Figures 3B, 3C and S3B). Higher lymphocyte signature associated with improved outcome in C1 and C2. An increased value of any of the five signatures led to worse outcome in C3 (Figure 3B), perhaps reflecting a balanced immune response. While increased Th17 cells generally led to improved OS, Th1 associated with worse OS across most immune subtypes, and Th2 orientation had mixed effects (Figure 3C). Tumor types displayed two behaviors relative to immune orientation (Figures 3B, OS; S3B, PFI). In the first group including SKCM and CESC, activation of immune pathways was generally associated with better outcome, while in the other, the opposite was seen. The relative abundance of individual immune cell types had complex associations that differed between tumor types (Figures S3C and S3D). These analyses extend beyond mere determination of lymphocyte presence to suggest testable properties that correlate with patient outcome in different tumor types and immune contexts.
We obtained and validated a survival model using elastic-net Cox proportional hazards (CoxPH) modeling with cross-validation. Low- and high-score tumors displayed significant survival differences in the validation set (Figure 3D), with good prediction accuracy (Figure 3E). Incorporating immune features into Cox models fit with tumor type, stage, and tumor type + stage (Figure 3F) improved predictive accuracy, highlighting the importance of the immune TME in determining survival. Lymphocyte expression signature, high number of unique TCR clonotypes, cytokines made by activated Th1 and Th17 cells, and M1 macrophages most strongly associated with improved OS (Figure S3E), while wound healing, macrophage regulation, and TGF-β associated with worse OS, recapitulating survival associations in immune subtypes. Within tumor types, the prognostic implications of immune subtypes seen in univariate analyses were largely maintained, with C3 correlating with better OS in six tumor types and C4 with poor OS in three cancer types (Figure S3F).
Immune Response Correlates of Somatic Variation
The immune infiltrate was related to measures of DNA damage, including copy number variation (CNV) burden (both in terms of number of segments and fraction of genome alterations), aneuploidy, loss of heterozygosity (LOH), homologous recombination deficiency (HRD), and intratumor heterogeneity (ITH) (Figure 4A). LF correlated negatively with CNV segment burden, with strongest correlation in C6 and C2, and positively with aneuploidy, LOH, HRD, and mutation load, particularly in C3. These results suggest a differential effect of multiple smaller, focal copy number events versus larger events on immune infiltration in certain immune subtypes.
Specific SCNAs affected LF and immune composition (Figures 4B and S4A). Chromosome 1p (including TNFRS9 and VTCN1) amplification associated with higher LF, while its deletion did the opposite. 19q deletion (including TGFB1) also correlated with lower LF, consistent with the role of TGF-β in immune cell recruitment (Bierie and Moses, 2010). Amplification of chr2, 20q, and 22q (including CTLA4, CD40, and ADORA2, respectively), and deletions of 5q, 9p, and chr19 (including IL13 and IL4, IFNA1 and IFNA2, and ICAM1, respectively) associated with changes in macrophage polarity (Figure S4A). IL-13 influences macrophage polarization (Mantovani et al., 2005), implying a possible basis for our observation that IL-13 deletions associated with altered M0 macrophage fractions.
Increased ITH associates with worse clinical outcomes or lower efficacy of IM therapy in a number of cancer types (McGranahan et al., 2016, Morris et al., 2016). ITH correlated (Spearman, Benjamini-Hochberg [BH]-adjusted p < 0.05) with total LF in nine tumor types (LUAD, BRCA, KIRC, HNSC, GBM [glioblastoma multiforme], OV, BLCA, SKCM, and READ; data not shown) and with individual relative immune cell fractions in many tumor types (Figure S4B). ITH was highest in C1 and C2 (p < 10−229 relative to all others) and lowest in C3 (p = 3 × 10−5, Figure 1C), possibly supporting the link between lower ITH and improved survival.
We correlated mutations in 299 cancer driver genes with immune subtypes and found 33 significant associations (q < 0.1) (Figure 4C, Table S2). C1 was enriched in mutations in driver genes, such as TP53, PIK3CA, PTEN, or KRAS. C2 was enriched in many of these genes, as well as HLA-A and B and CASP8, which could be immune-evading mechanisms (Rooney et al., 2015). C3 was enriched in BRAF, CDH1, and PBRM1 mutations, a finding of note since patients with PBRM1 mutations respond particularly well to IM therapy (Miao et al., 2018). C4 was enriched in CTNNB1, EGFR, and IDH1 mutations. C5 was enriched in IDH1, ATRX, and CIC, consistent with its predominance of LGG samples. C6 was only enriched in KRAS G12 mutations. Mutations in 23 driver genes associated with increased LF either in specific tumor types or across them, including TP53, HLA-B, BRAF, PTEN, NF1, APC, and CASP8. Twelve other events were associated with lower LF, including the IDH1 R132H mutation, GATA3, KRAS, NRAS, CTNNB1, and NOTCH1 (Figure 4D).
Since driver mutations in the same pathway had opposing correlations with LF (e.g., BRAF, KRAS, NRAS), we considered the overall effect of somatic alterations (mutations and SCNAs) on eight oncogenic signaling pathways. PI3K, NOTCH, and RTK/RAS pathway disruptions showed variable, tumor type-specific effects on immune factors, while TGF-β pathway disruptions more consistently associated with lower LF (most prominently in C2 and C6; Figure S4C), higher eosinophils (C2), and increased macrophages. However, in C3, TGF-β pathway disruption associated with higher LF and M1 macrophages and lower memory B cells, helper T cells, and M0 macrophages. Thus, TGF-β pathway disruption has context-dependent effects on LF but may promote increased macrophages, particularly M1. Higher M1/M2 ratio, in turn, may reiterate the local pro-inflammatory state in these patients.
Immune Response Correlates of Demographic and Germline Variation
Immune cell content and expression of PD-L1 varied by gender and genetic ancestry (Figures 4E and S4D). PD-L1 expression was greater (p < 0.05, Kruskal-Wallis test, unadjusted) in women than in men in HNSC, KIRC, LUAD, THCA, and KIRP (Figure S4E), while mesothelioma (MESO) showed an opposite trend. PD-L1 expression was lower in individuals of predicted African ancestry (overall p = 5 × 10−6). This association was consistent across most cancer types and was significant (p < 0.05, unadjusted) in BRCA, COAD, HNSC (Figure S4F), and THCA. No single cis-eQTL significantly correlated with PD-L1 expression, although the SNP rs822337, approximately 1 kb upstream of CD274 transcription start, correlated weakly (p = 0.074; 1.3 × 10−4 unadjusted; Figure S4G). Lymphocyte fractions tended to be lower in people of Asian ancestry, particularly in UCEC (uterine corpus endometrial carcinoma) and BLCA (Figure S4H). The significance of these demographic associations remains unclear but provides hypotheses for the efficacy of checkpoint inhibitor therapy based on genetic ancestry.
Survey of Immunogenicity
Peptides predicted to bind with MHC proteins (pMHCs) and induce antitumor adaptive immunity were identified from SNV and indel mutations. The number of pMHCs (neoantigen load) varied between immune subtypes (Figure 1C), correlated positively with LF in most immune subtypes (Figure S4I), and trended positive in most TCGA tumor subtypes, with some negative correlation seen among GI subtypes, and differential trending seen among individual LUAD, LUSC, OV, and KIRP subtypes (Figure S4J). Neoantigen load also associated with higher content of CD8 T cells, M1 macrophages, and CD4 memory T cells, and lower Treg, mast, dendritic, and memory B cells in multiple tumor types (Figure S4K).
Most SNV-derived peptides which bind to MHC were each found in the context of a single MHC allele (89.9%). Single mutations generate 99.8% of unique pMHCs while 0.2% result from distinct mutations in different genetic loci yielding identical peptides (Figure 5A). The most frequently observed pMHCs were from recurrently mutated genes (BRAF, IDH1, KRAS, and PIKC3A for SNVs, TP53 and RNF43 for indels) (Figure 5B, Tables S3 and S4). In BRCA and LIHC, worse PFI was associated with higher neoantigen load, while BLCA and UCEC showed the opposite effect (Figure S5A). For most tumors, however, there were no clear associations between predicted pMHC count and survival. Within immune subtypes (Figure S5B), higher neoantigen load was associated with improved PFI in C1 and C2 and worse PFI in C3, C4, and C5. These results suggest that neoantigen load provides more prognostic information within immune subtypes than based on tissue of origin, emphasizing the importance of overall immune signaling in responding to tumor neoantigens.
Cancer testis antigens (CTA) overall expression, and that of individual CTAs, varied by immune subtype with C5 having the highest (p < 10−13) and C3 the lowest (p = 10−4) expression values (Figure 1C). CEP55, TTK, and PBK were broadly expressed across immune subtypes, with enrichment in C1 and C2. C5 demonstrated high expression of multiple CTAs, illustrating that CTA expression alone is insufficient to elicit an intratumoral immune response.
We found human papilloma virus (HPV) in 6.2% of cases, mainly in CESC, GBM, HNSC, and KIRC, whereas hepatitis B virus (HBV) and Epstein-Barr virus (EBV) were mainly found in LIHC and STAD (stomach adenocarcinoma), respectively. In a regression model of all tumors, high load of each virus type associated with immune features (Figure S5C, cancer-type adjusted). High EBV content associated strongly with high CTLA4 and CD274 expression and low B cell signatures. High HPV levels associated with increased proliferation and Th2 cells but low macrophage content. In contrast, high HBV levels associated with Th17 signal and γδ T cell content. These findings highlight the diverse effect of different viruses on the immune response in different cancer types.
Our findings suggest that pMHC burden and viral content impact immune cell composition, while CTAs have inconsistent effects on the immune response. Moreover, the effect of pMHC load on prognosis is disease specific and influenced by immune subtype.
The Adaptive Immune Receptor Repertoire in Cancer
Antigen-specific TCR and BCR repertoires are critical for recognition of pathogens and malignant cells and may reflect a robust anti-tumor response comprising a large number of antigen-specific adaptive immune cells that have undergone clonal expansion and effector differentiation.
We evaluated TCR α and β and immunoglobulin heavy and light chain repertoires from RNA-seq. Mean TCR diversity values differed by immune subtype, with the highest diversity in C6 and C2 (p < 10−183, Wilcoxon, relative to all other subtypes; Figure 5C) and by tumor type (Figure S5D, lower panel). We saw recurrent TCR sequences across multiple samples (Figure S5E, Table S5), suggesting a common, but not necessarily cancer-related, antigen (the top recurrent TCRs include known mucosal associated invariant T cell sequences). We assessed co-occurrence of complementarity determining region 3 (CDR3) α and β chains, in order to determine the frequency of patients with identical TCRs (a surrogate marker for shared T cell responses). We identified 2,812 α-β pairs present in at least 2 tumors (p ≤ 0.05, Fisher’s exact test with Bonferroni correction; Figure 5D and Table S5). Likewise, testing for co-occurrence of specific SNV pMHC-CDR3 pairs across all patients identified 206 pMHC-CDR3 α pairs and 196 pMHC-CDR3 β pairs (Figure 5E, Table S5). Thus, a minority of these patients appear to share T cell responses, possibly mediated by public antigens. That said, there is relatively little pMHC and TCR sharing among tumors, highlighting the large degree of diversity in TILs.
Higher TCR diversity only correlated with improved PFI in a few tumor types (BLCA, COAD, LIHC, and UCEC) (Figure S5F). Therefore, it may be more important for the immune system to mount a robust response against only a few antigens, than a diverse response against many different antigens.
The pattern of immunoglobulin heavy chain diversity was similar to that of TCR diversity (Figures 5C and S5D), with tumors showing significant variance in IgH repertoire diversity, suggesting differential B cell recruitment and/or clonal expansion within the tumor types.
Regulation of Immunomodulators
IMs are critical for cancer immunotherapy with numerous IM agonists and antagonists being evaluated in clinical oncology (Tang et al., 2018). To advance this research, understanding of their expression and modes of control in different states of the TME is needed. We examined IM gene expression, SCNAs, and expression control via epigenetic and miRNA mechanisms.
Gene expression of IMs (Table S6, Figure 6A) varied across immune subtypes, and IM expression largely segregated tumors by immune subtypes (Figure S6A), perhaps indicative of their role in shaping the TME. Genes with the greatest differences between subtypes (Figures 6B and S6B) included CXCL10 (BH-adjusted p < 10−5), most highly expressed in C2 (consistent with its known interferon inducibility) and EDNRB (BH-adjusted p < 10−5), most highly expressed in the immunologically quiet C5. DNA methylation of many IM genes, e.g., CD40 (Figure 6C), IL10, and IDO1, inversely correlated with gene expression, suggesting epigenetic silencing. 294 miRNAs were implicated as possible regulators of IM gene expression; among these, several associated with IMs in multiple subtypes (Figure S6C) including immune inhibitors (EDNRB, PD-L1, and VEGFA) and activators (CD28 and TNFRSF9). The immune activator BTN3A1 was one of the most commonly co-regulated IMs from the SYGNAL-PanImmune network (below). Negative correlations between miR-17 and BTN3A1, PDCD1LG2, and CD274 may relate to the role of this miRNA in maturation and activation of cells into effector or memory subsets (Liang et al., 2015).
Copy-number alterations affected multiple IMs and varied across immune subtypes. C1 and C2 showed both frequent amplification and deletion of IM genes, consistent with their greater genomic instability, while subtypes C3 and C5 generally showed fewer alterations in IM genes. In particular, IMs SLAMF7, SELP, TNFSF4 (OX40L), IL10, and CD40 were amplified less frequently in C5 relative to all samples, while TGFB1, KIR2DL1, and KIR2DL3 deletions were enriched in C5 (Figure 6D), consistent with our observation of lower immune infiltration with TGFB1 deletion (Figure S4A). CD40 was most frequently amplified in C1 (Figure 6D) (Fisher’s exact p < 10−10 for all comparisons mentioned). Overall, these marked differences in IM copy number may be reflective of more direct modulation of the TME by cancer cells.
Among IMs under investigation for cancer therapy, expression of VISTA is relatively high in all tumor types and highest in MESO; BTLA expression is high in C4 and C5; HAVCR2 (TIM-3) shows evidence of differential silencing among immune subtypes; and IDO1 is amplified, mostly in C1. The observed differences in regulation of IMs might have implications for therapeutic development and combination immune therapies, and the multiple mechanisms at play in evoking them further highlights their biological importance.
Networks Modulating Tumoral Immune Response
The immune response is determined by the collective states of intracellular molecular networks in tumor, immune, and other stromal cells and the extracellular network encompassing direct interaction among cells and communication via soluble proteins such as cytokines to mediate interactions among those cells.
Beginning with a large network of extracellular interactions known from other sources, we identified which of those met a specified precondition for interaction, namely that both interaction partners are consistently present within samples in an immune subtype, according to our TME estimates. We focused the network on IMs. Networks in C2 and C3 had abundant CD8 T cells, while C3, C4, and C6 were enriched in CD4 T cells.
A small sub-network (Figure 7A), focused around IFN-γ, illustrates some subtype-specific associations. In both C2 and C3, CD4 T cells, CD8 T cells, and NK cells correlated with expression of IFNG and CCL5, a potent chemoattractant. A second sub-network (Figure 7B), centered on TGF-β, was found in the C2, C3, and C6 networks. Across subtypes, different cell types were associated with abundant expression of TGFB1: CD4 T cells and mast cells in C3 and C6, macrophages in C6, neutrophils and eosinophils in C2 and C6, and B cells, NK cells, and CD8 T cells in C2 and C3. The receptors known to bind TGF-β likewise were subtype specific and may help mediate the TGF-β-driven infiltrates, with TGFBR1, 2, and 3 found only in the C3 and C6 networks. These results largely echo findings seen in our TGF-β pathway analysis (Figure S4C), which examined the effects of intracellular, rather than extracellular, signaling disruption on immune TME composition across immune subtypes. Finally, a third cytokine subnetwork illustrates variation in T cell ligands and receptors across immune subtypes (Figure 7C). CD4 and CD8 receptors fell into two groups, those found in C2, C3, and C6 networks, such as PDCD1, and those absent in C3, such as IL2RA and LAG3. Some T cell-associated ligands were subtype specific, such as CD276 (C2, C6), IL1B (C6), and VEGFB (C4).
The derived extracellular networks reflect the properties of immune subtypes in terms of cellular propensities and immune pathway activation noted earlier (Figures 1B, 1C, 2A, and S2A), but also place those properties in the context of possible interactions in the TME that may play a role in sculpting those same properties. The particular associations observed among IMs within distinct subtypes may be important for identifying directions for therapy.
We next used two complementary approaches, master regulators (MRs) and SYGNAL, to synthesize a pan-cancer transcriptional regulatory network describing the interactions linking genomic events to transcriptional regulators to downstream target genes, and finally to immune infiltration and patient survival. In both approaches, somatic alterations were used as anchors to infer regulatory relationships, in that they can act as a root cause of the “downstream” transcriptional changes mediated through transcription factors (TFs) and miRNAs.
This resulted in two transcriptional networks. The first one, MR-PanImmune, consisted of 26 MRs that acted as hubs associated with observed gene expression and LF, connected with 15 putative upstream driver events (Figure 7D). The second one, SYGNAL-Panimmune, comprised 171 biclusters enriched in IMs and associated with LF.
Seven TFs were shared between the MR- and SYGNAL-PanImmune networks, a significant overlap (p = 4.8 × 10−10, Fisher’s exact test): PRDM1, SPI1, FLI1, IRF4, IRF8, STAT4, and STAT5A. Additional MRs included the hematopoietic lineage specific factor IKZF1, which may reflect variation in immune cell content, and known IMs, such as IFNG, IL16, CD86, and TNFRSF4. The regulators in SYGNAL-PanImmune were inferred to regulate a total of 27 IM genes (Figure S7C). The top two most commonly co-regulated IMs from SYGNAL-PanImmune, BTN3A1 and BTN3A2, are of particular interest as they modulate the activation of T cells (Cubillos-Ruiz et al., 2010) and have antibody-based immunotherapies (Benyamine et al., 2016, Legut et al., 2015).
Somatic alterations in AKAP9, HRAS, KRAS, and PREX2 were inferred to modulate the activity of IMs according to both the MR- and SYGNAL-PanImmune, a significant overlap (p = 1.6 × 10−7, Fisher’s exact test). In MR-PanImmune, MAML1 and HRAS had the highest number of statistical interactions with 26 MRs. This analysis identified complex roles for the RAS-signaling pathway (Figure 7D) specifically through connections to lineage factor VAV1 (implicated in multiple human cancers), potentially mediated by MAP2K1. Similarly, MAML1, hypothesized to mediate cross-talk across pathways in cancer (McElhinny et al., 2008), was associated (p ≤ 0.05) with multiple MRs, including STAT1, STAT4, CIITA, SPI1, TNFRSF4, CD86, VAV1, IKZF1, and IL16.
In SYGNAL-PanImmune, some regulators of IMs, but not upstream somatic mutations, were shared between tumor types, including STAT4, which regulated BTN3A1 and BTN3A2 in both LUSC and UCEC, secondary to implied causal mutations TP53 and ARHGAP35, respectively. Conversely, causal mutations shared across tumor types may associate with different tumor-specific downstream regulators. TP53 was a causal mutation in UCEC acting through IRF7 to regulate many of the same IMs as was seen in LUSC. These differences in causal relationships arise because the different cell types giving rise to each tumor type affect oncogenic paths.
We identified the putative regulators of immune gene expression within immune subtypes (Figure 7E). In these predictions, C1-associated biclusters were regulated by ERG, KLF8, MAFB, STAT5A, and TEAD2. C1 and C2 shared regulation by BCL5B, ETV7, IRF1, IRF2, IRF4, PRDM1, and SPIB, consistent with IFN-γ signaling predominance in these subtypes. C3 was regulated by KLF15 and miR-141-3p. C6-associated biclusters were regulated by NFKB2. C1, C2, and C6 shared regulation by STAT2 and STAT4, implying shared regulation by important immune TF families, such as STAT and IRF, but also differential employment of subunits and family members by the immune milieu.
In SYGNAL-PanImmune, the increased expression of biclusters enriched with IMs from KIRC, LGG, LUSC, and READ was associated with worse patient survival (CoxPH BH adjusted p value ≤ 0.05). Conversely, the increased expression of biclusters enriched with IMs from SKCM, containing CCL5, CXCL9, CXCL10, HAVCR2, PRF1, and MHC class II genes, were associated with improved patient survival (BH-adjusted p ≤ 0.05).
We report an extensive evaluation of immunogenomic features in more than 10,000 tumors from 33 cancer types. Data and results are available as Supplemental Tables, at NCI GDC, and interactively at the CRI iAtlas portal, which is being configured to accept new immunogenomics datasets and feature calculations as they come available, including those derived from immunotherapy clinical trials, to develop as a “living resource” for the immunogenomics community. Meta-analysis of consensus expression clustering revealed immune subtypes spanning multiple tumor types and characterized by a dominance of either macrophage or lymphocyte signatures, T-helper phenotype, extent of intratumoral heterogeneity, and proliferative activity. All tumor samples were assessed for immune content by multiple methods. These include the estimation of immune cell fractions from deconvolution of gene expression and DNA methylation data, prediction of neoantigen-MHC pairs from mutations and HLA-typing, and evaluation of BCR and TCR repertoire from RNA-sequencing data. Immune content was compared among immune and cancer subtypes, and somatic alterations were identified that correlate with changes in the TME. Finally, predictions were made of regulatory networks that could influence the TME, and intracellular communication networks in the TME, based on integrating known interactions and observed associations. Immunogenomic features were predictive of outcome, with OS and PFI differing between immune subtypes both within and across cancer types.
C4 and C6 subtypes conferred the worst prognosis on their constituent tumors and displayed composite signatures reflecting a macrophage dominated, low lymphocytic infiltrate, with high M2 macrophage content, consistent with an immunosuppressed TME for which a poor outcome would be expected. In contrast, tumors included in the two subtypes displaying a type I immune response, C2 and C3, had the most favorable prognosis, consistent with studies suggesting a dominant type I immune response is needed for cancer control (Galon et al., 2013). In addition, C3 demonstrated the most pronounced Th17 signature, in agreement with a recent systematic review suggesting that Th17 expression is generally associated with improved cancer survival (Punt et al., 2015). C2 was IFN-γ dominant and showed a less favorable survival despite having the highest lymphocytic infiltrate, a CD8 T cell-associated signature, and highest M1 content, suggesting a robust anti-tumor immune response. One explanation for this discrepancy is the aggressiveness of both the tumor types and specific cases within C2 relative to C3. C2 showed the highest proliferation signature and ITH while C3 was the lowest in both those categories. It may be that the immune response simply could not control the rapid growth of tumors comprising C2. A second hypothesis is that tumors in C2 are those that have already been remodeled by the existing robust type I infiltrate and have escaped immune recognition. While signatures biased toward interferon-mediated viral sensing and antigen presentation genes were often associated with higher survival, interferon signatures without increased antigen presentation showed an opposite association. Loss of genes associated with antigen processing and presentation is often found in tumors that have been immune edited. In contrast to the potential immune editing of C2, C3 may represent immunologic control of disease, that is, immune equilibrium.
Possible impact of somatic alterations on immune response was seen. For example, KRAS mutations were enriched in C1 and but infrequent in C5, suggesting that mutations in driver oncogenes alter pathways that affect immune cells. Driver mutations such as TP53, by inducing genomic instability, may alter the immune landscape via the generation of neoantigens. Our findings confirmed previous work showing that mutations in BRAF (Ilieva et al., 2014) enhance the immune infiltrate while those in IDH1 diminish it (Amankulor et al., 2017). Further work is needed to determine the functional aspects of these associations.
Tumor-specific neoantigens are thought to be key targets of anti-tumor immunity and are associated with improved OS and response to immune checkpoint inhibition in multiple tumor types (Brown et al., 2014). We found OS correlated with pMHC number in only a limited number of tumors, with no clear association in most tumors, including several responsive to immune checkpoint inhibitor therapy. There are some caveats to this finding. The current predictors are highly sensitive but poorly specific for neoantigen identification, and our approach did not include neoantigens from introns or spliced variants. Moreover, it is not possible to fully determine the ability to process and present an epitope or the specific T cell repertoire in each tumor, which impacts the ability to generate a neoantigen response. It is also possible that the role of neoantigens may vary with tumor type, as supported by our per-tumor results.
Integrative methods predicted tumor-intrinsic and tumor-extrinsic regulation in, of, and by the TME and yielded information on specific modes of intracellular and extracellular control, the latter reflecting the network of cellular communication among immune cells in the TME. The resulting network was rich in structure, with mast cells, neutrophils, CD4 T cells, NK cells, B cells, eosinophils, macrophages, and CD8 T cells figuring prominently. The cellular communication network highlighted the role of key receptor and ligands such as TGFB1, CXCL10, and CXCR3 and receptor-ligand pairs, such as the CCL5-CCR5 axis, and illustrated how immune cell interactions may differ depending on the immune system context, manifested in the immune subtype.
Predicted intracellular networks implied that seven immune-related TFs (including interferon and STAT-family transcription factors) may play an active role in transcriptional events related to leukocyte infiltration, and that mutations in six genes (including Ras-family proteins) may influence immune infiltration. Across tumor types, the TFs and miRNAs regulating the expression of IMs tended to be shared, while somatic mutations modulating those regulatory factors tended to differ. This suggests that therapies targeting regulatory factors upstream of IMs should be considered and that they may have a broader impact across tumor types than therapies focusing on somatic mutations. Of note, in these approaches, it is not always possible to fully ascertain whether some particular interaction acts in the tumor, immune, or stromal cell compartments, but this could be improved on by incorporating additional cell-type-specific knowledge. Shared elements of intra- and extracellular network models should also be explored, with particular regard to the IMs and cytokines in both.
There are important caveats to using TCGA data. First, survival event rates and follow-up durations differ across the tumor types. Second, for most tumor types, samples with less than 60% tumor cell nuclei by pathologist review were excluded from study, thus potentially removing the most immune-infiltrated tumors from analysis. The degree to which this biases the results, relative to the general population of cancer patients, is difficult to ascertain. Our analyses were also limited by restriction to data from genome-wide molecular assays, in the absence of targeted classical cellular immunology assays for confirming cell phenotype distribution, as those types of data have not been collected from TCGA patients.
In summary, six stable and reproducible immune subtypes were found to encompass nearly all human malignancies. These subtypes were associated with prognosis, genetic, and immune modulatory alterations that may shape the specific types of immune environments we have observed. With our increasing understanding that the tumor immune environment plays an important role in prognosis as well as response to therapy, the definition of the immune subtype of a tumor may play a critical role in the predicting disease outcome as opposed to relying solely on features specific to individual cancer types.
We are grateful to all the patients and families who contributed to this study. We also thank the Office of Cancer Genomics at the NCI for organizational and logistical support of this study. The high-throughput analyses in this study were performed on the Institute for Systems Biology-Cancer Genomics Cloud (ISB-CGC) under contract number HHSN261201400007C and on the Seven Bridges Cancer Genomics Cloud under contract HHSN261201400008C, with federal funds from the National Cancer Institute, NIH, Department of Health and Human Services. Funding from the Cancer Research Institute is gratefully acknowledged, as is support from NCI through U54 HG003273, U54 HG003067, U54 HG003079, U24 CA143799, U24 CA143835, U24 CA143840, U24 CA143843, U24 CA143845, U24 CA143848, U24 CA143858, U24 CA143866, U24 CA143867, U24 CA143882, U24 CA143883, U24 CA144025, and P30 CA016672. The study was supported by W81XWH-12-2-0050, HU0001-16-2-0004 from the US Department of Defense through the Henry M. Jackson Foundation for the Advancement of Military Medicine. We thank Peter Hammerman and Yasin Şenbabaoğlu for contributions in early phases of this work.
Analysis, Computation, and Software: V.T., D.L.G., S.D.B., D.W., D.S.B., T.O.Y., E.P.-P., G.F.G., C.L.P., J.A.E., E.Z., A.C.C., E.O.P., I.K.A.S., A.J.G., R.M., F.F., A. Colaprico, J.S.P., L.E.M., N.S.V., J.L., Y.L., V.D., S.M.R., R.B., A.D.C., D.B., A.R., A.K., H.H., T.M.M., H.N., C.S.P., S.B., A.I.O., A.L., W.Z., J.G., J.S., B.G.V. Supervision: J.R., A. Califano, D.A., K.C., H.S., T.K.C., J.N.W., J.G., R.A.H., B.G.V., I.S. Writing: V.T., D.L.G., S.D.B., D.W., D.S.B., T.O.Y., E.P.-P., G.F.G., C.L.P., E.Z., A.C.C., E.O.P., I.K.A.S., A.J.G., R.M., F.F., A. Colaprico, N.S.V., H.H., T.M.M., H.N., J.S., C.E.R., A.J.L., J.S.S., E.G.D., M.L.D., B.G.V., I.S.
Declaration of Interests
Michael Seiler, Peter G. Smith, Ping Zhu, Silvia Buonamici, and Lihua Yu are employees of H3 Biomedicine, Inc. Parts of this work are the subject of a patent application: WO2017040526 titled “Splice variants associated with neomorphic sf3b1 mutants.” Shouyoung Peng, Anant A. Agrawal, James Palacino, and Teng Teng are employees of H3 Biomedicine, Inc. Andrew D. Cherniack, Ashton C. Berger, and Galen F. Gao receive research support from Bayer Pharmaceuticals. Gordon B. Mills serves on the External Scientific Review Board of Astrazeneca. Anil Sood is on the Scientific Advisory Board for Kiyatec and is a shareholder in BioPath. Jonathan S. Serody receives funding from Merck, Inc. Kyle R. Covington is an employee of Castle Biosciences, Inc. Preethi H. Gunaratne is founder, CSO, and shareholder of NextmiRNA Therapeutics. Christina Yau is a part-time employee/consultant at NantOmics. Franz X. Schaub is an employee and shareholder of SEngine Precision Medicine, Inc. Carla Grandori is an employee, founder, and shareholder of SEngine Precision Medicine, Inc. Robert N. Eisenman is a member of the Scientific Advisory Boards and shareholder of Shenogen Pharma and Kronos Bio. Daniel J. Weisenberger is a consultant for Zymo Research Corporation. Joshua M. Stuart is the founder of Five3 Genomics and shareholder of NantOmics. Marc T. Goodman receives research support from Merck, Inc. Andrew J. Gentles is a consultant for Cibermed. Charles M. Perou is an equity stock holder, consultant, and Board of Directors member of BioClassifier and GeneCentric Diagnostics and is also listed as an inventor on patent applications on the Breast PAM50 and Lung Cancer Subtyping assays. Matthew Meyerson receives research support from Bayer Pharmaceuticals; is an equity holder in, consultant for, and Scientific Advisory Board chair for OrigiMed; and is an inventor of a patent for EGFR mutation diagnosis in lung cancer, licensed to LabCorp. Eduard Porta-Pardo is an inventor of a patent for domainXplorer. Han Liang is a shareholder and scientific advisor of Precision Scientific and Eagle Nebula. Da Yang is an inventor on a pending patent application describing the use of antisense oligonucleotides against specific lncRNA sequence as diagnostic and therapeutic tools. Yonghong Xiao was an employee and shareholder of TESARO, Inc. Bin Feng is an employee and shareholder of TESARO, Inc. Carter Van Waes received research funding for the study of IAP inhibitor ASTX660 through a Cooperative Agreement between NIDCD, NIH, and Astex Pharmaceuticals. Raunaq Malhotra is an employee and shareholder of Seven Bridges, Inc. Peter W. Laird serves on the Scientific Advisory Board for AnchorDx. Joel Tepper is a consultant at EMD Serono. Kenneth Wang serves on the Advisory Board for Boston Scientific, Microtech, and Olympus. Andrea Califano is a founder, shareholder, and advisory board member of DarwinHealth, Inc. and a shareholder and advisory board member of Tempus, Inc. Toni K. Choueiri serves as needed on advisory boards for Bristol-Myers Squibb, Merck, and Roche. Lawrence Kwong receives research support from Array BioPharma. Sharon E. Plon is a member of the Scientific Advisory Board for Baylor Genetics Laboratory. Beth Y. Karlan serves on the Advisory Board of Invitae.