Immune-related DNA methylation signature associated with APLN expression predicts prognosis of hepatocellular carcinoma
1School of Life Science and Engineering, Southwest Jiaotong University, Chengdu, 610000, China
2Department of Engineering Structure and Mechanics, School of Science, Wuhan University of Technology, Wuhan, 430070, China
3State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Army Medical University, Chongqing, 400042, China
4Renmin Hospital of Wuhan University Nursing Department, Wuhan, 430070, China
*Address correspondence to: Wenli Zeng, Zengwenli600@163.com
Received: 18 November 2021; Accepted: 11 February 2022
Abstract: This study used transcriptome and epigenetic data to evaluate the prognostic value of the immune-related gene (IRG) Apelin (APLN) in patients with hepatocellular carcinoma (HCC). Gene expression and clinical data for HCC were obtained from the TCGA database, and DNA methylation 450K data for HCC were downloaded from the University of California Santa Cruz (UCSC) Xena browser. Clinical and prognostic analysis of APLN expression showed that APLN is highly expressed in tumor samples and tends to increase with advancing clinical stage and T stage. To explore the prognostic role of APLN, the immune-related DNA methylation (DNAm) sites associated with APLN were analyzed by bioinformatics. Univariate Cox regression screened the methylation sites related to both APLN and survival. A risk score based on the methylation site signature was determined from the least absolute shrinkage and selection operator (LASSO) coefficients, and patients were then divided into high-risk and low-risk groups. Significant differences in overall survival (OS) were found in the training cohort. The nomogram shows that APLN or the methylation signature can effectively predict the prognosis of HCC patients. In summary, APLN may be a diagnostic and prognostic marker for HCC.
Keywords: TIME; Immunity; Prognosis; Bioinformatics
LASSO: least absolute shrinkage and selection operator
TCGA: The Cancer Genome Atlas
WGCNA: weighted gene co-expression network analysis
UCSC: University of California, Santa Cruz
Hepatocellular carcinoma (HCC) is one of the most common primary tumors, causing more than 700,000 deaths each year. Tobacco use, heredity, and epigenetic factors affect its development. Although detection and treatment methods such as computed tomography (CT) and emerging carbon ion radiation have made good progress (Malouff et al., 2020), HCC patients are prone to recurrence and metastasis of hemangioma after surgery or ablation, which leads to poor prognosis (Korean Liver Cancer Association and National Cancer Center, 2019; Brown et al., 2019; Muto et al., 2014). Currently, the available diagnostic markers are limited and apply only to some populations (Rawat et al., 2018). Therefore, it is of great significance to study specific biomarkers for the diagnosis and survival prediction of HCC.
Epigenetic regulation does not change the DNA sequence but alters transcription and affects tumor progression (Sharma et al., 2020; Dzobo et al., 2020; Baylin and Jones, 2011). Among epigenetic marks, DNA methylation (5-methylcytosine DNAm, m5C) and RNA methylation (N6-methyladenosine RNAm, m6A) are two crucial nucleic acid modifications and important epigenetic regulatory mechanisms (Liu and He, 2020). DNAm modification affects gene transcription (alternative splicing or translation), leading to abnormal gene expression and causing disease. DNA methylation sites, e.g., cg10673833 and cg04448376, have served as prognostic markers with high sensitivity in colorectal cancer and kidney cancer (Luo et al., 2020; Wang and Zhang, 2020; Ibrahim et al., 2011; Ashokan et al., 2021). Compared with exploring HCC prognosis through a single-omics biomarker, a strategy combining DNA methylation data and transcriptomic data is believed to be more promising.
The APLN gene is located on chromosome Xq25-q26.1 and encodes the 77-amino-acid pro-peptide Apelin (Xu et al., 2017). APLN may be a potential marker in ovarian cancer, colorectal cancer, kidney cancer and other cancers (Chen et al., 2021; Ran et al., 2021; Bai et al., 2020; Zuurbier et al., 2017). The APLN gene not only activates the APLN/APLNR pathway to promote HCC development, but also activates other signaling pathways or exerts anti-apoptotic and anti-inflammatory functions by promoting tumor angiogenesis (Cabiati et al., 2021; Wang et al., 2020b). Many factors can affect the development of HCC. Immature and unstable tumor vasculature is associated with an abnormal tumor microenvironment, which is liable to cause drug resistance during anti-cancer treatment. The new peptide apelin and its receptor can enhance immune efficacy by inducing the morphological and functional maturation of tumor blood vessels (Kidoya et al., 2012). APLN/APLNR is up-regulated during tumor angiogenesis, and APLN stimulates the proliferation, migration and invasion of colon adenocarcinomas (Picault et al., 2014; Podgorska et al., 2018). In addition, increased expression of APLN has been observed in muscle-invasive bladder cancer and is associated with poor clinical outcomes (Yang et al., 2019). Studies have found that epigenetic modification and expression of APLN can affect the progression of lung cancer (Miller et al., 2018). The degree of CpG island DNAm in APLN is related to the infiltration of pneumonia cells, and DNA methylation sites in important regions can regulate gene function (Mishra et al., 2015). APLN can inhibit tumor progression by regulating the expression of Tregs and VCAM-1 (Amoozgar et al., 2019). We studied the expression changes of APLN during the development of HCC and its ability to predict prognosis. Methylation site signatures closely related to APLN were used to predict survival in HCC.
Our team has previously determined the molecular interaction mechanisms and biomarkers of certain diseases through bioinformatics methods (Tian et al., 2015). This study uses the transcriptome and genetic data of the TCGA and UCSC datasets to study the prognosis of HCC. First, we used GEPIA and TIMER to screen out suitable genes. Then we selected the APLN gene through a PubMed literature search. CIBERSORT deconvolution was used to estimate the abundance of 22 kinds of immune cells and to identify differential immune cells. Immune cells and checkpoints were used as clinical traits for WGCNA. Univariate Cox and LASSO Cox regression were used to construct the risk score. The nomogram was used to assess the prognostic ability of the DNAm signature. The flow chart of this research is shown in Fig. 1.
Materials and Methods
HCC data and preprocess
Gene expression data (N = 422) and clinical data (N = 377) of HCC were downloaded from the TCGA database. The Illumina Infinium HumanMethylation450 data in the UCSC Xena database (http://xena.ucsc.edu/) were used for the screening of DNA methylation sites. The beta-value data form a genomic matrix (rows: site identifiers; columns: samples). Immune genes were downloaded from the ImmPort database. We deleted samples with a survival time less than or equal to 1 day and those with more than 70% of DNAm beta values missing. GEPIA and ENCORI were used to screen genes based on the survival and expression data of HCC.
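The sample filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: either condition alone is taken to exclude a sample, missingness is computed per sample with missing beta values represented as None, and the function name keep_sample is hypothetical rather than part of the original pipeline.

```python
from typing import List, Optional


def keep_sample(survival_days: float,
                betas: List[Optional[float]],
                min_days: float = 1.0,
                max_missing: float = 0.70) -> bool:
    """Return True if a sample passes both preprocessing filters.

    Assumptions (not stated explicitly in the text): a sample is dropped
    if EITHER condition fails, and missing beta values are given as None.
    """
    # Drop samples with survival time less than or equal to 1 day.
    if survival_days <= min_days:
        return False
    # A sample without any beta values cannot be used.
    if not betas:
        return False
    # Drop samples with more than 70% of beta values missing.
    missing_fraction = sum(1 for b in betas if b is None) / len(betas)
    return missing_fraction <= max_missing
```

For example, a sample with 30 days of follow-up and three of four beta values missing (75%) would be removed, while the same sample with one of four missing would be kept.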
APLN and its expression and survival
In this study, the 365 genes shared by the TCGA database and the ImmPort database were used for subsequent analysis. The GEPIA (http://gepia2.cancer-pku.cn/) and ENCORI (http://starbase.sysu.edu.cn/panGeneSurvivalExp.php) websites were used to analyze the overlapping genes with the “limma” method. We retained differentially expressed genes (DEGs) between the normal group and the cancer group and genes with a significant difference in survival in the K-M curve (P < 0.05). Then we conducted literature research through PubMed and deleted genes not related to immunity or HCC.
Analysis of tumor immune microenvironment
The proportions of immune cells in cancer samples were analyzed for immune infiltration. According to the median expression of APLN, tumor samples were divided into a high expression group and a low expression group. Tumor microenvironment immune cell infiltration was analyzed with CIBERSORT (https://cibersort.stanford.edu/), which is based on deconvolution of gene expression by linear support vector regression. The correlation between cell-type abundance and survival rate was evaluated across 22 immune cell subtypes. Then we screened out the types of immune cells with significant differences.
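The median split on APLN expression can be illustrated as follows. This is a minimal sketch: the handling of samples exactly at the median (assigned here to the low group) is an assumption, and the function name and sample identifiers are hypothetical.

```python
from statistics import median
from typing import Dict


def split_by_median(expression: Dict[str, float]) -> Dict[str, str]:
    """Label each sample 'high' or 'low' relative to the median expression.

    Assumption: samples exactly at the median are placed in the 'low' group.
    """
    cutoff = median(expression.values())
    return {sample: ("high" if value > cutoff else "low")
            for sample, value in expression.items()}


# Hypothetical APLN expression values for four samples.
groups = split_by_median({"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0})
```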
Key immune methylation sites
(1) To obtain the methylation sites related to the APLN gene, a univariate Cox regression model was used. (2) Overall survival (OS) time and survival status were jointly used as dependent variables, and the univariate Cox regression model was fitted again for the methylation sites to obtain the sites significantly related to prognosis. (3) Venn analysis was used to identify the methylation sites related to both APLN and survival. (4) Weighted gene co-expression network analysis (WGCNA) was used to obtain the essential modules and the immune-related methylation sites. In this step, we used WGCNA to determine whether the three immune cell types and the immune checkpoints programmed cell death protein 1 (PD-1), programmed death-ligand 1 (PD-L1), cytotoxic T-lymphocyte-associated protein 4 (CTLA-4), and the costimulatory molecule B7-1 were significantly correlated with the key immune methylation sites. Finally, the module with the highest positive or negative correlation was selected as the module most relevant to immunity.
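Step (3), the Venn analysis, amounts to intersecting two sets of significant sites. The sketch below assumes that each univariate analysis yields one P value per site and that P < 0.05 is the cutoff; the site labels and function names are hypothetical placeholders, not real probe identifiers.

```python
from typing import Dict, Set


def significant_sites(p_values: Dict[str, float], alpha: float = 0.05) -> Set[str]:
    """Return the sites whose P value is below the significance cutoff."""
    return {site for site, p in p_values.items() if p < alpha}


def overlapping_sites(apln_p: Dict[str, float],
                      survival_p: Dict[str, float],
                      alpha: float = 0.05) -> Set[str]:
    """Sites significant in BOTH the APLN-related and survival-related analyses."""
    return significant_sites(apln_p, alpha) & significant_sites(survival_p, alpha)


# Hypothetical P values for three placeholder sites.
apln_p = {"siteA": 0.01, "siteB": 0.20, "siteC": 0.03}
survival_p = {"siteA": 0.04, "siteB": 0.01, "siteC": 0.50}
shared = overlapping_sites(apln_p, survival_p)
```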
Methylation sites signature and risk score
We used LASSO Cox regression to analyze the key immune methylation sites and obtained the methylation site signature and the LASSO risk coefficients. The Kaplan-Meier (K-M) curve and the ROC curve were used to verify the accuracy and predictive power of the risk scoring model. The AUC values at 1, 3, and 5 years were used to judge whether the methylation site signature has diagnostic value. Through L1 regularization, LASSO Cox shrinks the parameters in the loss function and sets some of them to zero, which selects a sparse set of feature variables and helps prevent overfitting. As shown in formula (1):
In formula (1), the penalty term is the L1 regularization term; m is the number of samples; k is the number of parameters; γ is the regularization parameter that balances fitting the training data against keeping the parameter values small.
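One generic form of an L1-penalized objective that is consistent with these symbol definitions is given below. It is a sketch, not the authors' exact expression: the coefficient symbol θ and the loss term L(θ) are assumptions, with L(θ) computed over the m samples (for a Cox model this would typically be the negative log partial likelihood).

```latex
J(\theta) = L(\theta) + \gamma \sum_{j=1}^{k} \lvert \theta_j \rvert
```

A larger γ pushes more of the k coefficients to exactly zero, which is how the penalty selects a small set of methylation sites.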
Prognostic value of APLN and risk score
In the univariate regression analysis, the “forestplot” R package was used to analyze the impact of APLN and the risk score on survival. A nomogram was used to analyze the prognostic value of APLN and the methylation site signature. The predicted 1-, 3-, and 5-year survival was compared with the actual 1-, 3-, and 5-year survival to determine the prognostic value of APLN and the risk score.
All statistical analyses were performed in R version 3.6.1. Kaplan-Meier curves were generated with the “survival” R package. The R package “timeROC” was used for ROC analysis to assess the predictive performance of the APLN gene and the DNA methylation site signature. The R package “glmnet” was used for the LASSO analysis, with 10-fold cross-validation to obtain the best lambda value (minimum lambda value). The ward.D algorithm was used to obtain the relationship between the 22 immune cells. P < 0.05 was considered statistically significant.
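For readers unfamiliar with the Kaplan-Meier estimator, the following Python sketch shows the product-limit calculation that underlies such curves. It is an illustration only and not the “survival” package implementation; the function name and the toy data are hypothetical, and events are coded 1 for death and 0 for censoring.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival probability) at each event time.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= leaving
    return curve


# Toy data: deaths at 1, 3 and 4; one patient censored at 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```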
Identify the APLN gene
The screening process of the APLN gene is shown in Fig. 2. The TCGA database and the ImmPort database had 365 overlapping genes. The expression of 9 genes in HCC had a significant impact on survival. We deleted genes unrelated to immunity and genes unrelated to HCC.
The expression of APLN in HCC is related to clinical features and prognosis
The expression of APLN was significantly up-regulated in 371 tumor samples (Fig. 3A; P < 0.05). APLN expression tends to increase with the extent of the primary tumor and the grade of cancer tissue (Figs. 3B and 3C, P < 0.05). The Kaplan-Meier curve showed a lower survival rate in the high expression group than in the low expression group (Fig. 3D). The area under the ROC curve (AUC1 = 0.64, AUC3 = 0.61, AUC5 = 0.55) suggests that APLN has some ability to predict patient survival (Fig. 3E). Univariate Cox regression showed that 66,365 DNA methylation sites were related to the expression of APLN and 46,208 DNA methylation sites were related to survival. A total of 7,229 important DNA methylation sites overlapped in the above two results (Fig. 3F).
The relationship between APLN and infiltrating immune cells
Samples were divided into a high expression group and a low expression group by the median expression of APLN. Immune infiltration describes the pattern of immune cells in the immune microenvironment. The proportions of tumor immune cells differed significantly between the high and low expression groups. The highest proportions were T cell CD4 memory resting (23.37%), Macrophages M2 (18.57%), and Macrophages M0 (11.75%) (Fig. 4A). APLN expression was associated with the abundance of T cell CD4 memory resting (P < 0.01), T cell follicular helper (P = 0.001), and Tregs (P = 0.002); these three immune cell types differed significantly (Fig. 4D). The relative content of tumor immune cells varied across samples (Fig. 4B). Spearman analysis revealed the correlations among the 22 tumor immune cells (Fig. 4C).
Key methylation sites associated with immune infiltrating cells
According to univariate Cox regression, 19,754 methylation sites related to APLN and 56,106 methylation sites related to survival were obtained. WGCNA was applied to the 6,797 overlapping DNA methylation sites in 365 samples. Because HCC is closely related to inflammation, immune cells and immune checkpoints may have important therapeutic value. PD-L1 is the main checkpoint in the tumor immune microenvironment (Han et al., 2020), and the PD-1/PD-L1 axis is responsible for cancer immune escape and has a huge impact on cancer treatment (Gao et al., 2009). When the soft threshold (power) is 26, the network most closely approximates a scale-free distribution (Figs. 5A and 5B). There are 7 highly correlated modules, of which the turquoise and brown modules include 106 and 27 important DNA methylation sites, respectively (Fig. 5C).
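In WGCNA, the soft threshold raises the absolute pairwise correlation to a power so that weak correlations are suppressed and strong ones dominate the network. The sketch below illustrates this with the power of 26 reported above. It assumes an unsigned network and Pearson correlation, neither of which is stated in the text, and all function names are illustrative.

```python
import math
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def soft_threshold(correlation: float, power: int = 26) -> float:
    """Unsigned adjacency: |correlation| raised to the soft-threshold power."""
    return abs(correlation) ** power


def adjacency(x: List[float], y: List[float], power: int = 26) -> float:
    """Connection strength between two methylation-site profiles."""
    return soft_threshold(pearson(x, y), power)
```

With a power of 26, a correlation of 0.9 still gives an adjacency of roughly 0.065, whereas a correlation of 0.5 is reduced to about 1.5e-8, so only very strongly co-methylated sites remain connected.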
Construction and verification of risk score
LASSO Cox regression analysis of the above 133 methylation sites finally identified 10 important methylation site signatures (cg02316066, cg06194738, cg07311615, cg07570723, cg15629460, cg16987524, cg18437792, cg21860560, cg22580629) (Figs. 6A and 6B), together with the best LASSO coefficients (Table 1). The score is determined based on the signature of 10 methylation sites: risk score = cg02316066 × (−2.19) + cg06194738 × 0.07 + cg07311615 × 0.24 + cg07570723 × (0.99) + cg15629460 × 1.99 + cg16987524 × 0.10 + cg18437792 × (0.04) + cg21860560 × 0.98 + cg22580629 × (−0.09). The K-M curve shows a significant difference in survival between the high-risk and low-risk groups (P = 0.00014; Fig. 6C), and the risk score has a moderate predictive effect (AUC1 = 0.61, AUC3 = 0.64, AUC5 = 0.63) (Fig. 6D). The 365 samples were randomly divided into a verification set and a training set at a ratio of 1:2. The validation set results are consistent with those of the 365 samples, and the risk score predicts prognosis well (Figs. 7A and 7B).
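The risk score is a weighted sum of beta values, which can be sketched as follows. The coefficients are copied from the nine terms printed in the formula above, with the parenthesized values 0.99 and 0.04 taken as positive exactly as printed. The median cutoff used to form the high-risk and low-risk groups is an assumption, since the text does not state the cutoff, and the function names are hypothetical.

```python
from statistics import median
from typing import Dict

# Coefficients as printed in the risk score formula (nine listed sites).
COEFFICIENTS = {
    "cg02316066": -2.19,
    "cg06194738": 0.07,
    "cg07311615": 0.24,
    "cg07570723": 0.99,
    "cg15629460": 1.99,
    "cg16987524": 0.10,
    "cg18437792": 0.04,
    "cg21860560": 0.98,
    "cg22580629": -0.09,
}


def risk_score(betas: Dict[str, float]) -> float:
    """Weighted sum of a patient's beta values over the signature sites."""
    return sum(coef * betas[site] for site, coef in COEFFICIENTS.items())


def assign_risk_groups(scores: Dict[str, float]) -> Dict[str, str]:
    """Split patients into 'high' and 'low' risk.

    Assumption: the cutoff is the median score, with ties placed in 'low'.
    """
    cutoff = median(scores.values())
    return {patient: ("high" if s > cutoff else "low")
            for patient, s in scores.items()}
```

Because a beta value lies between 0 and 1, each site contributes at most the magnitude of its coefficient, so cg02316066 and cg15629460 dominate the score.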
DNA methylation sites signature and the prognostic value of APLN
Univariate analysis was used to screen out the variables related to patient survival. The results showed that APLN expression, risk score, and tumor stage T all had significant effects on patient survival (Figs. 8A and 8B). Factors with significant differences were used as nomogram predictors. The results showed that the survival prediction is in good agreement with the actual observations (Fig. 8C).
Conventional treatment for HCC is highly invasive, so early prediction and personalized therapy targeting immune checkpoints are of research significance (Yin et al., 2019; Wang et al., 2021). APLN expression and DNAm are abnormal in a variety of cancers, but the role of APLN in the development of the immune microenvironment of HCC is still unknown. APLN is up-regulated in HCC and in HCC38/KMUH cells and may be used as a therapeutic target (Lin et al., 2012; Chen et al., 2019). APLN is a highly expressed tumor-promoting gene, and its expression tends to increase with HCC development. APLN, as a transcriptional target of the Wnt/β-catenin pathway, promotes HCC by activating PI3K/Akt signaling. Pharmacological targeting of APLN strongly inhibits the proliferation of HCC cells and tumor growth, which means that APLN may be a potential drug target for HCC. APLN differs significantly in expression between tumor and normal tissues and across clinical characteristics, and it has potential prognostic value. The Cox regression model identified risk features significantly related to OS. Finally, the prognostic value of the risk prediction was verified by the nomogram.
The study determined that the proportion of total T cells in HCC tissue is greater than in healthy tissue. T cell subsets mainly include T cell CD4 memory resting, follicular helper T cells, etc., which affect the prognosis of HCC (Jia et al., 2015a; Rohr-Udilova et al., 2018; Garnelo et al., 2017). APLN affects three types of immune cells, including T cell CD4 memory resting, follicular helper T cells, and Tregs, highlighting the poor clinical correlation between T cell-mediated immunity and tumors (Li et al., 2021). Both PD-1 and CTLA-4 are related to some subsets of T cells and mainly affect the development of tumors through PD-1/PD-L1 acting on T cells (Kim and Chen, 2016). The interaction between PD-1 and PD-L1 can result in inactivation of the immune response (Poureau and Metges, 2021). PD-1 is the main checkpoint in the TME. Follicular helper T cells can inhibit the expression of Treg cells by affecting the expression of PD-1, and CTLA-4 can also enhance the immunosuppressive activity of Tregs (a key mediator of CTLA-4) by regulating follicular helper T cells (Jia et al., 2015b; Hervouet, 1970; Yang et al., 2014; Topalian et al., 2016). The expression of APLN and specific immune cells play an essential role in the tumor progression of HCC (Wang et al., 2020a). APLN can regulate or induce the morphological and functional maturation of tumor blood vessels, and the recruitment of dendritic immune cells helps to enhance the effect of immunotherapy (Kidoya et al., 2012). The immune microenvironment is closely related to HCC prognosis, and it is complex and usually heterogeneous, with several genomic alterations. An integrated investigation of the immune microenvironment together with the DNA methylation sites of APLN is therefore meaningful in studies of HCC prognosis.
Tumor-associated macrophages not only promote thrombosis and endocytosis, angiogenesis, invasion and metastasis, but also inhibit anti-tumor immune responses (Lin et al., 2019). Tumor-associated macrophages are mainly polarized into Macrophages M1 and Macrophages M2. The former have an anti-tumor effect, while the latter have a tumor-promoting effect. Macrophages M2 mainly induce anti-inflammatory responses by producing IL-10 and TGF-β, and they also participate in the polarization of Th2 responses to promote tumor growth (Wildes et al., 2020; Biswas and Mantovani, 2010; Wolf-Dennen et al., 2020). Non-polarized Macrophages M0, which account for a higher percentage in HBV, also affect HCC prognosis (Ding et al., 2021; Huo et al., 2020). APLN induces tumor vascular maturation and promotes the infiltration of natural killer T cells into the central region of the tumor. This process enhances the efficacy of dendritic cell-based cancer immunotherapies (Mastrella et al., 2019). APLN inhibits and reduces neutrophil recruitment, thereby reducing the inflammatory and fibrotic response of pancreatitis. It also inhibits macrophage and monocyte infiltration, thereby enhancing the stable phenotype of atherosclerotic plaques (Ioannidis et al., 2010).
Deriving the DNA methylation site signature with LASSO Cox helps prevent overfitting. Studies have found that the combination of selected central genes and DNAm conditions has an advantage in predicting the survival of HNSCC (Mou et al., 2020; Cao et al., 2020). Human m6A near the mRNA stop codon and the 3’ untranslated region affects the transcription and function of genes through promoter DNAm (Ondo et al., 2021). Usually, DNAm inhibits gene expression. APLN and the probe cg05688478 can affect gene expression in human pancreatic islets (Hall et al., 2014). The interaction of the genome (gene expression) and the epigenome (DNAm level) to control Apelin endothelium is one of the most critical pathways in non-triple-negative breast cancer (Wu et al., 2020). DNA methylation silences APLN expression, leading to increased insulin secretion in women (Hall et al., 2014). In APLN, rs3761581 G and rs2235312 T are related to the degree of CpG island methylation. Low levels of apelin-13 cause greater methylation of CpG islands in the 5’UTR, which contributes to the pathophysiology of HA lung water. Abnormal methylation may prevent apelin from binding to hypoxia response elements, thereby inhibiting APLN transcription (Mishra et al., 2015). Downstream of DNA damage, increased methylation of the APLN promoter may reduce the protective signaling of the apelinergic system, which then leads to the pulmonary edema observed after exposure to oxidative air pollution (Miller et al., 2018).
The interaction between the genome and the epigenome also changes the evolution and controls the development of different cancers. Epigenetically controlled immunotherapy has potential effects such as restoring anti-tumor signals (Gomez et al., 2020). Many sites in the DNAm region and epigenetic changes play an essential role in the early stage of liver cancer (Seal et al., 2020). The DNAm mechanism regulates the immune cell response to cancer, and different DNAm patterns also affect the characteristics of immune infiltration (Wu et al., 2020; Du et al., 2021). Epigenetics and gene expression influence each other (Mishra et al., 2015; Wang et al., 2020c). Studies have found that the CpG island of APLN has a more significant impact on lung tissue. In the combined analysis, APLN and the epigenetic DNA methylation sites are significantly related to survival. Research that uses epigenetic, transcriptome, and immune data to predict prognostic markers of HCC has great potential.
This study still has limitations. The samples we studied are entirely retrospective, and data bias may affect survival and the DNAm signatures. Although we identified the selected DNA methylation sites and APLN as predictive features, more experimental studies are still needed.
In summary, our research combines transcription, epigenetics, and immune data to predict HCC prognosis. The immune-related gene APLN may be a reliable novel prognostic marker for HCC.
Author Contribution: The authors confirm contribution to the paper as follows: Feifei Tian and Huan Hu wrote the manuscript and performed the statistical analyses; Di Wang and Huan Ding took part in data collection; Qingjia Chi and Huaping Liang prepared some figures and contributed to revising the manuscript; Wenli Zeng designed the work. All the authors read and approved the final manuscript.
Availability of Data and Materials: The datasets generated during the current study are available from the corresponding author on reasonable request.
Funding Statement: This work was supported by the Fund of Biosecurity Specialized Project of PLA (No. 19SWAQ18).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.