In Silico analysis and linking of metabolism-related genes with the immune landscape in head and neck squamous cell carcinoma

Metabolic reprogramming and immunologic suppression are two critical characteristics promoting the progression of head and neck squamous cell carcinoma (HNSCC). The integrative analysis of all the metabolismrelated genes (MRGs) in HNSCC is lacking and the interaction between the metabolism and the immune characteristics also requires more exploration to uncover the potential mechanisms. Therefore, this study was designed to establish a prognostic signature based on all the MRGs in HNSCC. Genes of HNSCC samples were available from the TCGA and GEO databases while the MRGs were retrieved from a previous study. Ultimately 4 prognostic MRGs were selected to construct a model possessing robust prognostic value and accuracy in TCGA cohorts. The favorable reproducibility of this model was confirmed in validation cohorts from GEO databases. The risk score calculated by this model was an independent prognostic factor that further classified these HNSCC patients into high-/low-risk groups. GSEA analyses and somatic mutations indicated the low-risk group could activate several anti-tumor pathways and possessed lower TP53 mutation. The results of ESTIMATE, single-sample GSEA, CIBERSORT, and some immune-related molecules analyses suggested the low-risk group exhibited lower metabolic activities and higher immune characteristics. The Spearman correlation test implied most metabolic pathways with tumor-promoting function were negatively correlated with the immune activity, indicating a plausible approach of combining the anti-metabolism and the immunotherapy drugs in the high-risk group to enhance therapeutic effects than applied separately. In conclusion, this prognostic signature linking MRGs with the immune landscape could promote the individualized treatment for HNSCC patients.


Introduction
Head and neck squamous cell carcinomas (HNSCC) have a global annual incidence of approximately 600,000 cases, making it the sixth leading cause of cancer-related death (Ferlay et al., 2015). Despite the comprehensive alternatives for diagnosing and treating HNSCC, its prognosis remains far from optimal with a five-year survival rate of less than 50% (Bose et al., 2013). Given that HNSCC is a heterogeneous disease, patients within the same TNM stage might exhibit different molecular features and could benefit from individualized therapies (Leemans et al., 2018;Mroz et al., 2015). Therefore, to improve the clinical outcomes of HNSCC patients, it is of great value to explore more reliable prognostic biomarkers for distinguishing different risk patients from other perspectives serving as a supplement to the conventional TNM cancer staging.
Compelling evidence has suggested that sequencing technology combined with data mining are handy tools for building powerful prognostic models and classifying different subgroups in various cancers (Hu et al., 2020;Zhao and Cui, 2019;Chen et al., 2019). Considering the high energy and molecular demands of cancer cells, metabolic reprogramming plays a vital role in cancer progression and could contain numerous clinical prognostic indicators (Hsieh et al., 2019). Therefore, intensify the understanding of metabolic mechanisms in HNSCC possesses enormous benefits for exploring novel anti-cancer treatments. Some previous studies used glycolysis-related genes or glutaminolysis-related genes to identify prognostic index in HNSCC (Liu and Yin, 2020;Okazaki et al., 2019). However, the integrative analysis of all the metabolism-related genes (MRGs) in HNSCC has not been conducted yet. Consequently, it is worth exploring all the MRGs with prognostic value in HNSCC to identify novel therapeutic targets. Biswas (2015) suggested that metabolic reprogramming could also impact the immune functions of active immune cells. In light of the crucial role of immune cells in cancer progression, it is important to further study the correlation between the metabolism and the immune characteristics in HNSCC. Therefore, in this study, we constructed a metabolism-related prognostic signature containing 4 MRGs selected from all the MRGs in HNSCC and further classified the patients into the high-/low-risk groups accordingly. We also performed a multi-omics analysis on this gene classifier to explore the difference between the high-/low-risk groups. Finally, we investigated the correlation between the metabolism and the immune characteristics in HNSCC. These findings might advance the personalized diagnosis and treatment in HNSCC.

Patient data extraction
Gene expression data and corresponding patient clinical data for HNSCC were downloaded from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/). Altogether, 502 tumor tissues and 44 adjacent normal tissues were embodied in the HNSCC cohort. Some patients were excluded for not complying with the inclusion criteria of having and surviving a primary tumor for more than 30 days. Eventually, 490 tumor tissues and 44 adjacent normal tissues were included in this study. An external validation series containing the GSE41613 and GSE65858 data set was downloaded from the Gene Expression Omnibus (GEO) database. The microarray data of GSE41613 were based on the Affymetrix Human Genome U133 Plus 2.0 Array platform and that of GSE65858 stemmed from Illumina HumanHT-12 V4.0 expression beadchip. Tab. 1 exhibited specific baseline information.
Development and validation of the metabolism-related prognostic signature The MRGs were obtained from the previous study (Possemato et al., 2011). These MRGs were further filtered using genes from the selected TCGA dataset and GEO datasets to pick out those appearing simultaneously in all datasets. Then these selected genes were analyzed by DESeq2 R package with filter criteria of adjusted P < 0.05 and absolute log2-fold change >1.5 to identify the differentially expressed metabolism-related genes (DEMRGs) between HNSCC and normal controls in TCGA datasets. The correlation between the expression level of DEMRGs and the patient's overall survival (OS) was evaluated by the univariate Cox analysis with P < 0.01 being considered as statistically significant. Then these prognostic DEMRGs were initially filtered using the least absolute shrinkage and selection operator (LASSO) method. Subsequently, we used bootstrapping (N = 1000) to further select the prognostic DEMRGs with stable results during the 1000 repetitions by the Cox regression analysis. The parameter for the Cox regression analysis was the default parameters in the 'Survival' R package. These specific DEMRGs were finally applied to construct a metabolism-related prognostic model based on the Akaike information criterion (AIC) using the stepwise multivariate Cox regression analysis. AIC is a wellrecognized means for measuring the goodness of fit of the statistical models. AIC encourages the goodness of data fitting and helps avoid overfitting in the meanwhile. The priority model should be the one with the lowest AIC value. Therefore, we constructed the final model based on AIC to ensure the optimal goodness of fit of the established model.
The optimal cut-off point of the risk score calculated by the above model was determined using the Survminer R package to divide the patients into the high-/low-risk groups. The survival rates of patients in the high-/low-risk groups were measured utilizing the Kaplan-Meier survival curve. The time-dependent receiver operating characteristic (ROC) curves were drawn to assess the sensitivity and specificity of the model. Dataset GSE41613 with 96 oral squamous cell carcinoma patients and GSE65858 with 251 HNSCC patients were selected as the validation cohort. The same procedures were operated in these two external datasets to investigate whether the established model could effectively predict survival in HNSCC.
The metabolism-related signature as an independent prognostic factor Univariate and multivariate Cox proportional hazards models were applied to estimate the hazard ratios (HRs) and 95% confidence intervals (CIs) for the risk of HNSCC mortality. A stratified multivariate Cox regression analysis based on the risk status was also performed. Besides, given that the human papillomavirus (HPV) status could influence the gene expression, mutational profile, metabolic regulation, and immune modulation of HNSCC, we further compared the HPV status in the high-/low-risk groups to evade the bias resulting from this potential confounding factor.

Somatic mutations and GSEA analysis
In light of gene mutation could drive tumor metabolic reprogramming, we compared the somatic mutations between different groups stratified by this metabolism-related model to uncover potential mechanisms leading to distinct prognosis. The somatic mutations data of HNSCC patients were downloaded via the TCGAbiolinks and the data was visualized by the "maftools" R package. In addition, we also explored the activated signaling pathways between high-/lowrisk groups. To be specific, we applied the DEseq2 R package to obtain an ordered list of genes and conducted the subsequent Gene Set Enrichment Analysis (GSEA) by the ClusterProfiler R package with adjusted P < 0.05.

Estimation of metabolism and immune characteristics as well as their correlation
Scores of 26 metabolic pathways were retrieved from https:// choih.shinyapps.io/metabolicsignatures (Choi and Na, 2018). Immune scores and stromal scores were calculated by applying the ESTIMATE algorithm which can report the enrichment of immune and stromal cell gene signatures (Yoshihara et al., 2013). Moreover, the single-sample gene-set enrichment analysis (ssGSEA) algorithm was adopted to quantify the relative abundance of each cell infiltration in the HNSCC tumor microenvironment (TME). The gene set for marking each TME infiltration immune cell type was obtained from the study of Charoentong et al. (2017) which demonstrated diverse human immune cell subtypes such as CD8 T cell, activated dendritic cell, macrophage, natural killer (NK) T cell, regulatory T cell, and so on. Subsequently, the CIBERSORT analysis was performed to estimate the fraction of the immune cell infiltration. Additionally, the explorations of some immunerelated molecules including immune checkpoint genes, cytotoxic effectors, and an "interferon-gamma (IFNG) signature" were conducted in the high-/low-risk groups. Finally, the association between metabolic pathways and immune cell infiltrations was assessed by the spearman correlation test.

DEMRGs in HNSCC
There were 2113 common MRGs in these selected datasets (Fig. 1A). Further analysis in the HNSCC TCGA dataset revealed 991 DEMRGs with 383 up-regulated and 608 down-regulated. Fig. 1B visualized the distribution of these DEMRGs between HNSCC and normal controls. Fig. 1C was the expression heatmap of these DEMRGs, in which significantly up-regulated genes and down-regulated genes were each represented by red and blue.
Construction and validation of a metabolism-related signature A total of 55 possible prognostic DEMRGs were preliminarily identified by univariate Cox analysis while 11 of them remained after being filtered by LASSO ( Fig. 2A). Ultimately, 4 DEMRGs (GRIA3, PYGL, HPRT1, and SLC23A1) were selected after bootstrapping and they were applied to the final prediction model by stepwise multivariate Cox regression analysis. The imputed risk score from the metabolism-related signature was calculated using the following formula: (−0.38167 × expression level of GRIA3) + (0.31221 × expression level of PYGL) + (0.42102 × expression level of HPRT1) + (−0.50125 × expression level of SLC23A1). Patients were classified into the high-/low-risk groups according to the optimal cut-off of their risk scores (Fig. 2B).
To be specific, patients with high risk tended to die earlier than those with low risk (Fig. 2C). Fig. 2D presented the survival heatmap, demonstrating GRIA3 and SLC23A1 to be the protective MRGs that were highly expressed in the lowrisk group. On the contrary, PYGL and HPRT1 were highly expressed in the high-risk group, standing for the riskassociated MRGs. Kaplan-Meier curves for the high-/low-risk groups were exhibited in Fig. 2E, implying that the high-risk group had a poorer prognosis than the low-risk group. The area under the curve (AUC) of the 3-year ROC curve was 0.702, suggesting good predictive performance for the 3-year OS (Fig. 2F). Similar analyses were conducted in other independent HNSCC series obtained from GEO (GSE41613 and GSE65858) to further validate the predictive value of this metabolism-related signature and confirm its reproducibility. The risk scores for 96 patients in the GSE41613 dataset and 251 patients in the GSE65858 dataset were imputed using the same formula as above. Consistently, patients with higher risk showed poorer prognosis in both the GSE41613 dataset (Fig. 3A) and the GSE65858 dataset (Fig. 3B). What's more, the sensitivity and specificity evaluation of our metabolism-related signature for the 3-year OS in GSE41613 and GSE65858 was 0.642 and 0.573, respectively (Figs. 3C and 3D). These validation results of GEO datasets led to an identical conclusion with the initial analysis of the TCGA dataset, suggesting favorable reproducibility of this metabolismrelated signature in HNSCC.
The risk score of the metabolism-related signature was independently associated with HNSCC mortality As seen in Tab. 2, Cox proportional hazards models adjusted for different variables indicated that patients with high risk had significantly higher mortality (2.85 [2.07, 3.93], P < 0.0001 in Model I; 2.81 [2.03, 3.88], P < 0.0001 in Model II) in comparison with those in the low-risk group. A stratified multivariate Cox analysis was carried out to further verify the robust relevance between the risk score of the metabolism-related signature and the mortality of these HNSCC patients. The results in Tab. 3 showed that the high-risk score was associated with high mortality of HNSCC in each subgroup (of age, sex, grade, and stage) except for the subgroup of Stages I + II. No interactions between risk score and other factors were found. Besides, there was no significant difference in the relative distribution of HPV (+) and HPV (−) between the high-/low-risk groups (Suppl. Fig. S1). In other words, HPV (+) tumors did not show enrichment in either population.
The high-/low-risk groups with distinct somatic mutations and signaling pathways Somatic mutations were analyzed using TCGA database. The TP53 mutation was significantly enriched in the high-risk group. Mutations in CDKN2A, PLEC, CASP8, FAM135B, AHNAK, NSD1, PKDH1L1, S1, and FAT3 also exhibited different statuses in different risk groups (Fig. 4).
GSEA analysis discovered five significant KEGG pathways and two hallmark pathways activated in the low-risk group, including linoleic acid metabolism, drug metabolism cytochrome P450, cell adhesion molecules (CAMs), primary immunodeficiency, neuroactive ligandreceptor interaction, allograft rejection, and KRAS signaling DN (Fig. 5). Consequently, it is reasonable to assume that the activation of these pathways could elucidate the underlying mechanisms of the better prognosis in the low-risk group from diverse aspects including metabolism.
The high-/low-risk groups with distinct metabolic characteristics Given that the prognostic model was constructed based on metabolism-relevant genes, a feasible next step in delineating the underlying mechanisms is to explore the metabolic characteristics corresponding to different risk scores. As a result, scores of 26 metabolic pathways were acquired from the previous study while the subsequent analysis revealed a significant correlation between high risk and high metabolic characteristics. Glycolysis, glycogenolysis, glucose metabolism, gluconeogenesis, pyruvate metabolism, carbohydrate metabolism, steroid synthesis, amino acid (AA) metabolism, protein metabolism, purine metabolism, pyrimidine metabolism, ribonucleotide synthesis, mRNA metabolism, RNA metabolism, and nitric oxide metabolism were all higher in the high-risk group (Fig. 6). Similar results were obtained in the GSE41613 dataset (Suppl. Fig. S2A) and the GSE65858 dataset (Suppl. Fig. S3A).
The high-/low-risk groups with distinct immune characteristics It was well accepted that metabolic reprogramming was critically associated with the immune microenvironment, so we further studied the immune characteristics of the high-/low-risk groups. Results demonstrated that the immune score and the stromal score of the high-risk group were both lower than those of the low-risk group (Fig. 7A).
Since the immune scores between the high-/low-risk groups were significantly different, the relative abundance of each TME infiltrating cell was further investigated with ssGSEA analysis. Accordingly, groups with different risks exhibited distinct immune infiltration. Specifically, 16 immune cell populations (activated B cell, activated CD8 T cell, activated dendritic cell, CD56 dim NK cell, eosinophil, immature B cell, macrophage, mast cell, MDSC, monocyte, NK cell, plasmacytoid dendritic cell, regulatory T cell, T follicular helper cell, Type 1 T helper cell, and Type 17 T helper cell) showed higher abundance in the low-risk group while CD56 bright natural killer cell showed higher abundance in the high-risk group (Fig. 7B). What's more, the fraction of active immune cells like CD8+ T cell and follicular helper T cell was higher in the low-risk group, whereas the proportion of inactive immune cells like resting mast cell and naive CD4+ T cell was higher in the high-risk group (Fig. 7C).
At last, the comparisons of some immune-related molecules between the high-/low-risk groups were achieved. Altogether, 32 immune checkpoint genes described in previous studies Yu et al., 2020) were analyzed in this study and the results revealed that the expression levels of 22 immune checkpoint genes were higher in the low-risk group such as CTLA4, CD40, ICOS, TIGIT, and so on (Fig. 7D). Besides, cytotoxic effectors (GZMA, GZMB, GZMH, GZMK, and PRF1) and four out of six genes in the IFNG signature (CXCL9, HLA-DRA, IDO1, and IFNG) were higher in the low-risk group (Fig. 7E). Similar results were obtained in the GSE41613 dataset (Suppl. Figs. S2B-S2E) and the GSE65858 dataset (Suppl. Figs. S3B-S3E).

The correlation between the metabolism and the immune characteristics
Given that the high-risk group demonstrated higher metabolic activity and lower immune activity when analyzed separately, we further explore the correlation between metabolic pathways and immune cell infiltrations by the spearman correlation test. Consistent with the above unilateral results, most major metabolic processes promoting tumor progression had significantly negative correlations with the immune cells exerting antitumor function. For instance, the high-risk group possessed higher activity of glycolysis as well as lower infiltration of activated B cell, activated CD8 T cell, and activated dendritic cell. Correspondingly, the correlations between glycolysis and the above immune cell infiltrations were significantly negative as expected (Fig. 8).

Discussion
With the continuously updating understanding of molecular heterogeneity inherent within HNSCC, emerging biomarkers with prognostic value have been identified by genomic expression analyses from diverse perspectives such as transcription factors, lymph node metastases, radiomics, and so on (Zhang et al., 2019;Huang et al., 2019b;Huang et al., 2019a;Tonella et al., 2017;Chung et al., 2004). Considering metabolic reprogramming as a hallmark of cancer, some previous studies also constructed prognostic models based on genes related to specific metabolic pathways (Liu and Yin, 2020;Okazaki et al., 2019). To fully explore the metabolism reprogramming in HNSCC, we expanded the study scope to all the MRGs in HNSCC. Finally, a model containing 4 MRGs with favorable prognostic value was established using TCGA dataset and was further validated in two independent GEO datasets. This prognostic model could also serve as a gene classifier dividing the patients into different risk groups with altered metabolism and immune characteristics.

FIGURE 7. (continued)
FIGURE 7. Immune characteristics in the high-/low-risk groups (ns, non-significant, *P < 0.05, **P < 0.01, ***P < 0.0001, ****P < 0.0001). proliferation and migration of pancreatic cancer (Ripka et al., 2010), which seems to be contradicted with our results in HNSCC. Therefore, more functional investigations of GRIA3 in various cancers are necessary. The rest 2 DEMRGs act as risk-associated MRGs for HNSCC in this study, namely hypoxanthine-guanine phosphoribosyltransferase1 (HPRT1) and glycogen phosphorylase L (PYGL). Consistently, HPRT1 is not only widely used to assess cancer risk and determine possible carcinogens but also has the potential to act as a cancer biomarker and neoantigen with its overexpression leading to the dysregulation of nucleotide synthesis and protein production in the malignant environment (Townsend et al., 2018). Meanwhile, as a key role in glycogen metabolism, PYGL is up-regulated in most tumors to sustain cancer cell proliferation and is also identified as a novel p53 candidate target gene in various cancers as well as a metastasisrelated metabolic gene in prostate cancer (Chen et al., 2018;Garritano et al., 2013;Favaro et al., 2012). In this study, TP53 mutation was significantly higher in the high-risk group, which is consistent with the tumor suppressor role of TP53. Mutations of TP53 are the most frequent among all the somatic genomic alterations in HNSCC and have a significant correlation with poor outcomes in HNSCC patients indicated by short survival time and tumor resistance to radiotherapy and chemotherapy (Zhou et al., 2016). Therefore, TP53 mutation might also serve as an effective prognostic biomarker. GSEA analysis indicates that several pathways with anti-tumor activities are activated in the low-risk group, linoleic acid metabolism, CAMs, and KRAS signaling DN for instance. Linoleic acid has been identified as an anti-carcinogenic fatty acid with anti-obesogenic effects and anti-atherosclerotic properties (den Hartigh, 2019). CAMs are proteins that mediate cell-tocell adhesion and cell-to-extracellular matrix interactions, which are essential to maintain homeostasis in healthy tissues (Janiszewska et al., 2020). The downregulation of KRAS signaling could also ubiquitously inhibit the activity of most tumors (Sexton et al., 2019). These might partly explain why low-risk patients exhibit a longer survival time.
The analysis results of 26 metabolic pathways reveal that the high-risk group tends to exhibit higher activity in some major metabolic processes such as glycometabolism, lipid metabolism, and AA metabolism. Increased glucose uptake and enhanced glycolysis are well-recognized hallmarks in HNSCC to meet the energy demands of highly proliferating cells (Yamamoto et al., 2017). Essential roles exerted by specific lipids in promoting growth and metastasis of cancer cells have been gradually unveiled in a myriad of research (Beloribi-Djefaflia et al., 2016), and our team also defined a 3-lipid metabolism-related genes signature as a biomarker for prognostic prediction in oral squamous cell carcinoma previously (Hu et al., 2019). Besides, the serum levels of specific amino acids such as valine, tyrosine, serine, and methionine were higher in HNSCC patients (Yonezawa et al., 2013). Therefore, tremendous efforts have been undertaken to develop anti-cancer schemes by targeting these hyperactive metabolic pathways in HNSCC with the hope of improving therapeutic efficiency (Boroughs and DeBerardinis, 2015).
Accumulating evidence has revealed the close relationship between metabolic characteristics and immune cells (Biswas, 2015). Considering the vital roles of immune landscapes in cancer progression and the immunosuppressive state of HNSCC, it is necessary to explore the immune microenvironment of different risk groups with different metabolism characteristics (Kang et al., 2015). As we expected, patients with low risk possess higher immune activity. The evaluation of immune infiltration shows a higher abundance of 16 immune cell populations in the low-risk group, further accompanied by higher fractions of active immune cells like activated B cell, activated CD8 T cell, and so on. On the contrary, patients from the high-risk group had higher proportions of inactive immune cells, confirming the promotion role of immunosuppressive status in HNSCC. Generally, the presence of abundant tumor-infiltrating lymphocytes is related to the improved prognosis due to immune activation (Canning et al., 2019). In other words, the dysfunction of the T cell, impairment of the NK cell activity, and a whole declination of the lymphocyte counts could deteriorate the prognosis in HNSCC patients (Solomon et al., 2018).
Immunotherapy provides a new option for the treatment of HNSCC, while currently, only a small percentage of patients respond well to immunotherapy (Oliva et al., 2019). Targeting checkpoint pathways achieves anti-tumor immunity in TME and encourages the development of potential targets in immunotherapy such as CTLA-4, TIGIT, ICOS, and CD40 (Marin-Acevedo et al., 2018). Moreover, as a critical role for host defense and tumor surveillance, IFNG can enhance the activity of some cytotoxic effectors and is proposed to be an important driver of programmed death ligand-1 (PD-L1) expression in HNSCC (Ayers et al., 2017). What's exciting is that most immune checkpoint genes, cytotoxic effectors, and genes in the IFNG signature within our study have a higher expression in the low-risk group, which might predict promising results of immunotherapy in the low-risk group.
The last but not the least, our study suggested that many metabolic pathways with tumor-promoting function were negatively correlated with the immune activity, leading to the state of metabolic activation and immunologic suppression in the high-risk group. This immunosuppressive state to a higher degree might contribute to immune evasion in the high-risk group. The negative correlation between metabolism and immune here could be ascribed to the possibility that active metabolism would hinder the nutritional supply of immune cells. Therefore, it might be plausible to adopt a combination of anti-metabolism and immunotherapy drugs in the high-risk group to achieve better therapeutic effects than applied separately.
It is interesting that within the scope of our research, HPV (+) was not enriched in either the high-risk or the low-risk group, but the low-risk group tended to exhibit similar characteristics with HPV (+) HNSCC such as fewer p53 mutations and altered metabolism as well as immune landscape. Nevertheless, changes towards specific metabolic pathways or expression of immune genes were not exactly the same in the low-risk group and the HPV (+) population. Considering the samples of the HPV analysis are not big enough in this study, larger samples are necessary to further determine the association between the HPV (+) status and the low-risk group stratified by this model.
In summary, this study generates a robust metabolismrelated prognostic model that classifies the HNSCC patients into the high-/low-risk groups each with different somatic mutations, signaling pathways, metabolic characteristics, and immune landscapes. The correlations between major metabolic pathways and the immune infiltrations also verify that the metabolic changes could impact the immune functions during cancer progression. These results provide a compelling rationale for MRGs to serve as candidate prognostic biomarkers and gene classifiers in HNSCC. Furthermore, this study offers a preclinical proof of concept that biomarker-driven cancer therapy and individualized treatment for HNSCC patients have promising prospects.