HAND2-AS1, PRKAA2 and VLDLR predict the risk of peritoneal metastasis in gastric cancer of different Lauren types based on STEPP analysis

The peritoneum is the most common site of recurrence of gastric cancer (GC). Early occult peritoneal metastasis is difficult to detect by imaging examination. Stratifying the risk of peritoneal metastasis in patients with different Lauren subtypes is of great clinical value. We performed univariate Cox regression to identify genes with prognostic value for overall survival (OS) and peritoneal-specified disease-free survival (psDFS) using data from the Gene Expression Omnibus database. The candidate genes were screened by the Subpopulation Treatment Effect Pattern Plot (STEPP) method. Propensity score matching (PSM) analysis was used to reduce the influence of confounders on the results. Based on the optimal cut-off values determined by the STEPP method, we found that overexpression of three genes (HAND2-AS1, PRKAA2, and VLDLR) was associated with significantly shorter 1-year psDFS in patients with diffuse-type GC than in patients with intestinal-type GC. Gene Set Enrichment Analysis (GSEA) suggested that the three genes may promote the early occurrence of peritoneal metastasis in patients with diffuse-type GC through glucose metabolism-related pathways. These three genes may be potential biomarkers for assessing the risk of peritoneal metastasis and for guiding treatment decisions and follow-up strategies.


Introduction
Gastric cancer (GC) is the fifth most common and lethal cancer in males and females globally, and metastasis is its main cause of death (Arnold et al., 2020;Zhang et al., 2020b). Around 40% of GC patients have distant metastases at the time of diagnosis (Imaoka et al., 2016;Riihimäki et al., 2016). The peritoneum is the most common site of metastases and recurrences in patients with GC (Nishina et al., 2016;Sawaki et al., 2020). Currently, peritoneal metastasis is usually detected using imaging techniques such as computed tomography (CT), ultrasonography, and positron emission tomography/computed tomography (PET-CT) (Dong et al., 2019;Honma et al., 2018;Li et al., 2020b;Sawhney and Wilson, 2017). However, peritoneal metastasis may be undetectable by imaging in the early stage, and the radiation exposure and economic cost of these techniques greatly limit their clinical application. Staging laparoscopy enables early diagnosis and tailored treatment of peritoneal metastasis, which represents considerable progress in the clinical diagnosis of early peritoneal metastasis (Rawicz-Pruszyński et al., 2019). Nevertheless, these examinations do not always provide reliable diagnoses or accurate prognostic predictions; therefore, accurate and less invasive predictive methods are urgently needed.
In GC patients, the patterns of recurrence vary significantly with the Lauren subtype (Lee et al., 2018). The two Lauren subtypes have distinct molecular mechanisms, clinical-pathological features, responses to adjuvant chemotherapy, and prognostic risk factors (Lauren, 1965;Rawicz-Pruszyński et al., 2019;Schirren et al., 2020;Wang et al., 2020). Previous studies indicated that diffuse carcinoma is correlated with mutation of RHA and E-cadherin (Lazăr et al., 2008;Liu et al., 2006;Machado et al., 2001;Stănculescu et al., 2011;Zhang et al., 2020a). These mutations affect cell-cell adhesion, allowing GC cells to invade adjacent structures without forming tubules or glands. Therefore, diffuse carcinoma is characterized by an increased risk of metastasis and worse survival. Abnormal expression of the caudal type homeobox-2 (CDX-2) gene plays an important role in the development of GC, especially the intestinal type (Almeida et al., 2005;Asano et al., 2016). The Her-2 gene is significantly overexpressed in intestinal gastric carcinoma (Liu et al., 2012). Different molecular mechanisms may contribute to the greater susceptibility of diffuse-type GC to peritoneal metastasis, but there is no consensus on the underlying mechanism (Bao et al., 2019;Yu et al., 2019). Previous reports mostly predicted the risk of peritoneal metastasis in patients with GC based on genes and some recognised tumour markers (Jeon et al., 2014;Zhao et al., 2020). These biomarkers were screened mainly by differential expression analysis. However, differential expression analysis cannot establish whether the high or low expression arose before or after peritoneal metastasis.
Subpopulation Treatment Effect Pattern Plot (STEPP) is a method used to ascertain treatment-covariate interactions in terms of survival for continuous, binomial, and count data arising from two classifications. In addition, STEPP can define subpopulations of patients based on gene expression and visualizes the classification effects estimated within each subpopulation (Baker and Bonetti, 2016;Zou et al., 2019). This enables STEPP to identify differences in efficacy between subpopulations, thereby guiding clinical diagnosis and treatment.
The purpose of this work was to identify the genes, and their potential mechanisms, that affect the early development of peritoneal metastasis in diffuse-type GC. We used STEPP to compare the risk of peritoneal metastasis between diffuse-type and intestinal-type GC. STEPP and Propensity Score Matching (PSM) analysis were used to improve the credibility of the results. In terms of clinical application, the risk of peritoneal metastasis could be predicted based on the expression status of a few genes.

Patients
Microarray dataset GSE62254 was downloaded from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/). Patients with Lauren diffuse- or intestinal-type GC and complete clinical characteristics and survival data were included (Cristescu et al., 2015). The inclusion criteria for this study were as follows: (1) age older than 18 years; (2) histopathological confirmation of gastric cancer; (3) available Lauren classification information; (4) primary gastric cancer tumor specimens obtained at the time of total or subtotal gastrectomy. The exclusion criteria were as follows: (1) samples with censored survival data; (2) patients with mixed Lauren classification. Two hundred and eighty samples were finally included in this study. The RMA algorithm was used for normalization in the R environment (v3.6.3) (Gautier et al., 2004).

Candidate gene identification and differentially expressed gene analysis
Univariate Cox regression analysis was applied to identify candidate genes with prognostic value. The hazard ratios and false discovery rate (FDR) of all genes in the GSE62254 dataset were calculated by univariate Cox regression. OS-related genes were selected using the criterion FDR < 0.01. Differentially expressed genes (DEGs) between patients whose first recurrence site was peritoneal seeding and other patients were screened with a threshold of P value < 0.05 using the "edgeR" package in R (Varet et al., 2016). Then, a Venn diagram was used to select the genes shared by the two sets, OS-related genes and DEGs, to obtain the candidate genes in common.
Univariate Cox regression was also performed on disease-free survival (DFS), and the resulting overlapping genes with FDR < 0.01 were defined as related to the occurrence of peritoneal seeding. OS was defined as the interval between the date of diagnosis and the date of death from any cause. Peritoneal-specified DFS (psDFS) was defined as the time between diagnosis and peritoneal recurrence.
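The filtering and overlap steps described above can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure for the FDR (the text does not state which FDR method was used), and the gene names and P values are hypothetical placeholders rather than values from GSE62254.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from univariate Cox models (illustrative only).
cox_p = {"GENE_A": 0.0001, "GENE_B": 0.0004, "GENE_C": 0.03, "GENE_D": 0.5}
# Hypothetical set of differentially expressed genes (illustrative only).
deg_genes = {"GENE_A", "GENE_C", "GENE_E"}

genes = list(cox_p)
fdr = dict(zip(genes, benjamini_hochberg([cox_p[g] for g in genes])))

# Keep OS-related genes at FDR < 0.01, then intersect with the DEGs
# (the Venn-diagram step).
os_related = {g for g in genes if fdr[g] < 0.01}
candidates = sorted(os_related & deg_genes)
print(candidates)
```

In this toy input only one gene passes both filters; with real data the same two steps would be applied to the full per-gene result table.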

STEPP analysis
To evaluate whether the difference in peritoneal recurrence risk (in terms of one-year psDFS) between diffuse- and intestinal-type GC varies according to gene expression, the screened candidate genes were analyzed by STEPP. As a graphical tool, the STEPP method was used to estimate the difference in peritoneal recurrence risk within overlapping patient subpopulations defined by the continuous values of gene expression, using a sliding window approach (Kensler et al., 2019;Yip et al., 2016;Zou et al., 2019). We used STEPP to determine the cutoff values and divided patients with diffuse- and/or intestinal-type gastric cancer into high- or low-risk groups for peritoneal recurrence, using R with the package "STEPP". A sensitivity analysis was performed to explore how the results changed as the STEPP smoothing parameters (r1 and r2) varied. The smoothing parameter r2, the minimum number of patients in a subpopulation, was set to 70, 120, or 170 patients out of a total of 280; r1, the largest number of patients in common between two consecutive subpopulations, was computed by setting the ratio r1/r2 to 10%, 30%, 50%, 70%, or 90%. The number of subpopulations created also changed as the values of r1 and r2 varied.
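To make the role of r1 and r2 concrete, the following is a simplified sketch of how a sliding window can generate overlapping subpopulations and how the sensitivity grid above can be enumerated. It is not the algorithm of the R package: here every window holds exactly r2 patients, consecutive windows share r1 patients, and the last window is anchored to the end of the ordering (so it may overlap its predecessor by more than r1). The expression values are hypothetical.

```python
def sliding_windows(values, r1, r2):
    """Split patients, ordered by a covariate, into overlapping subpopulations.

    Simplified assumption: each window holds r2 patients and consecutive
    windows share r1 patients; the final window is anchored to the end so
    that every patient is covered.
    """
    if not 0 <= r1 < r2 <= len(values):
        raise ValueError("require 0 <= r1 < r2 <= number of patients")
    order = sorted(range(len(values)), key=lambda i: values[i])
    step = r2 - r1
    windows = []
    start = 0
    while start + r2 < len(order):
        windows.append(order[start:start + r2])
        start += step
    windows.append(order[-r2:])  # last window, anchored to the end
    return windows


# Hypothetical expression values for 280 patients (illustrative only).
expression = [(i * 37) % 280 for i in range(280)]

# Sensitivity grid: r2 in {70, 120, 170}, r1/r2 in {10%, 30%, 50%, 70%, 90%}.
grid = {}
for r2 in (70, 120, 170):
    for pct in (10, 30, 50, 70, 90):
        r1 = r2 * pct // 100
        grid[(r1, r2)] = len(sliding_windows(expression, r1, r2))

for (r1, r2), n_windows in sorted(grid.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"r1={r1:3d} r2={r2:3d} subpopulations={n_windows}")
```

The printed table shows how the number of subpopulations changes across the 15 parameter combinations, which is the pattern the sensitivity analysis examines.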
The patients were divided into low-expression and high-expression groups according to the cutoff values of gene expression determined by STEPP. Kaplan-Meier analysis was performed to show the relationships between gene expression levels and one-year psDFS, and the log-rank test was used to compare the gene expression groups.
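As an illustration of the survival estimate underlying these curves, the sketch below computes the Kaplan-Meier estimate at a fixed horizon (12 months) for two groups. The follow-up times and event indicators are hypothetical, and the log-rank test is not implemented here.

```python
def kaplan_meier(times, events, horizon):
    """Kaplan-Meier estimate of the event-free probability at `horizon`.

    times  : follow-up time per patient (e.g. months)
    events : 1 if the event (e.g. peritoneal recurrence) occurred, 0 if censored
    """
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - failed / at_risk
    return survival


# Hypothetical follow-up data in months (illustrative only).
high_times, high_events = [3, 6, 6, 9, 12, 14, 20], [1, 0, 1, 1, 0, 1, 0]
low_times, low_events = [5, 12, 15, 18, 24], [0, 0, 1, 0, 0]

km_high = kaplan_meier(high_times, high_events, horizon=12)
km_low = kaplan_meier(low_times, low_events, horizon=12)
print(f"1-year event-free probability: high={km_high:.3f}, low={km_low:.3f}")
```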

Analysis of the correlation between genes
Correlation analysis was conducted to investigate potential relationships among the candidate genes, such as upstream and downstream connections or synergistic effects.
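A pairwise correlation analysis of this kind can be sketched as follows. The Pearson coefficient is an assumption (the text does not name the correlation method), and the gene labels and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for three placeholder genes (illustrative only).
expr = {
    "gene_a": [1, 2, 3, 4, 5],
    "gene_b": [2, 1, 4, 3, 5],
    "gene_c": [5, 3, 4, 1, 2],
}

# Correlate every unordered pair of genes.
pairs = {(g, h): pearson(expr[g], expr[h]) for g, h in combinations(expr, 2)}
for (g, h), r in pairs.items():
    print(f"{g} vs {h}: r = {r:.2f}")
```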

Propensity Score Matching (PSM) analysis
To reduce the influence of confounding factors, we performed a PSM analysis using the nearest-neighbor matching method. PSM analysis created two groups with similar numbers of diffuse-type and intestinal-type patients, matched on their baseline characteristics, to minimize differences in baseline clinicopathological factors that could confound the evaluation of the effect of Lauren classification (Casadaban et al., 2016;D'Agostino, 1998;Pilotto et al., 2018). The propensity score (PS) for each patient was the estimated probability that the patient belonged to a given Lauren classification, calculated from clinicopathological covariates. By using a 1:1 nearest-neighbor matching method, we paired each patient with the patient of the other Lauren type who had the nearest PS within specified limits, producing two well-matched patient datasets. We used the newly matched patient datasets to validate the effect of gene expression levels in patients with diffuse-type gastric cancer. Kaplan-Meier curves were also constructed for the newly matched dataset.
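The matching step can be sketched as a greedy 1:1 nearest-neighbor search on precomputed propensity scores. This is an illustration under stated assumptions: the scores, patient identifiers, and the caliper of 0.1 are hypothetical (the "specified limits" are not given in the text), and estimating the scores themselves, typically by logistic regression, is outside the sketch.

```python
def match_nearest(ps_a, ps_b, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    ps_a, ps_b : dicts mapping patient id -> propensity score for the two groups
    caliper    : maximum allowed absolute difference in propensity score
    """
    available = dict(ps_b)
    pairs = []
    for pid_a, score_a in sorted(ps_a.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining patient in the other group.
        pid_b = min(available, key=lambda k: abs(available[k] - score_a))
        if abs(available[pid_b] - score_a) <= caliper:
            pairs.append((pid_a, pid_b))
            del available[pid_b]  # each patient is used at most once
    return pairs


# Hypothetical propensity scores (illustrative only).
diffuse = {"d1": 0.30, "d2": 0.52, "d3": 0.90}
intestinal = {"i1": 0.32, "i2": 0.50, "i3": 0.60, "i4": 0.10}

matched = match_nearest(diffuse, intestinal, caliper=0.1)
print(matched)
```

Patients without a partner inside the caliper are dropped, which is why a matched dataset is smaller than the original cohort.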

GSEA
Gene Set Enrichment Analysis (GSEA) was conducted to explore the potential molecular mechanisms associated with the three candidate genes. The reference gene sets were those related to glucose metabolism and lipid metabolism in the Molecular Signatures Database (MSigDB). All steps were performed with the GSEA Java program (https://www.gsea-msigdb.org/gsea/index.jsp) using the Pearson method, with the number of permutations set to 1000. The normalized enrichment score (NES) was the primary statistic for examining gene set enrichment results. P < 0.05 and false discovery rate (FDR) < 0.25 were considered statistically significant (Li et al., 2016a;Zhu and Dong, 2018).

Results
Fig. 1 shows the main steps in the study. Using the GSE62254 dataset, 280 patients with Lauren diffuse- or intestinal-type GC and complete clinical characteristics and survival data were selected, including 135 patients with diffuse-type GC and 145 patients with intestinal-type GC. The clinical information pertaining to the dataset is summarised in Table 1. A total of 2732 OS-related genes were selected from all genes included in GSE62254 by univariate Cox regression analysis (FDR < 0.01). A total of 155 differentially expressed genes (DEGs) associated with peritoneal seeding as the first recurrence were screened under the criterion P < 0.05. The intersection of these two gene sets yielded a total of 142 genes (Suppl. Fig. S1). Univariate Cox regression revealed that these 142 genes were associated with recurrence in the peritoneum as the first metastatic site (FDR < 0.05).

STEPP analysis
To classify patients with diffuse-type GC into high- and low-risk groups, we used STEPP to determine the cut-off values for gene expression. The 142 genes (mentioned above) were analysed by STEPP with different parameters (r1, r2), and the appropriate parameters were determined according to previous reports and the actual analytical results (Yip et al., 2016). Based on the STEPP analysis results, the three genes with the most notable effects (HAND2-AS1, PRKAA2, and VLDLR) were selected (Suppl. Table S1, Fig. 2).
The STEPP analysis showed a trend towards a significant interaction between Lauren type (diffuse-type vs intestinal-type GC) and gene expression in terms of 1-year psDFS as gene expression increased. Indeed, at high gene expression values, patients with diffuse-type cancer had poorer 1-year psDFS than those with intestinal-type cancer. Therefore, based on the established cut-off value of each of the three genes, the 280 patients were divided into high-risk and low-risk groups for each gene separately. In the high-risk group, the 1-year psDFS for patients with diffuse-type cancer was significantly shorter than that of patients with intestinal-type cancer. In the low-risk group, there was no significant difference between patients with different Lauren types (Fig. 3).

Analysis of the correlation between genes
To reveal the associations among these three genes, we performed pairwise correlation analysis. The results showed that the correlation between any two of the three genes was statistically significant, but the low correlation coefficients indicated that the degree of association was not particularly strong (P < 0.001, r < 0.5, Suppl. Table S2). Therefore, we hypothesised that these three genes might not be linked by upstream and downstream effects but might jointly promote peritoneal metastasis in GC.

PSM analysis
After PSM analysis, the new dataset comprised 184 patients. There was no statistically significant difference in clinicopathological characteristics between patients with the two Lauren types (Fig. 4a, Table 2). In the new dataset, the 1-year psDFS of patients with diffuse-type GC in the high-risk group was still significantly shorter than that of patients with intestinal-type GC. In addition, in the low-risk group, there was no significant difference in 1-year psDFS between patients with the two Lauren types (Figs. 4b-4d).

GSEA
To identify signalling pathways differentially affected by the expression levels of the three genes, GSEA was conducted based on mRNA expression in the GSE62254 cohort. GSEA results revealed that high expression of HAND2-AS1, PRKAA2, and VLDLR was associated with cellular glucose homeostasis, glucose-6-phosphate metabolic process, glucose transmembrane transport, cellular glucose metabolic process, and reactive glucose metabolism (FDR < 0.05, P < 0.05; Fig. 5). We selected the most significantly enriched signalling pathways based on the normalised enrichment score (NES). These results suggest that the three genes may promote the early occurrence of peritoneal metastasis in patients with diffuse-type GC through glucose metabolism-related pathways.

Discussion
Lauren histologic type is a significant factor associated with peritoneal recurrence. Patients with diffuse-type GC are at higher risk of peritoneal metastasis (Dong et al., 2019;Lee et al., 2018;Perrot-Applanat et al., 2019;Stănculescu et al., 2011). We aimed to identify patients with a higher short-term risk of peritoneal metastasis using the STEPP method. We used STEPP to demonstrate the effects of gene expression levels on the early development of peritoneal metastasis in GC. In addition, through PSM, we attempted to reduce the interference of confounding factors and improve the reliability of the results. We found three genes that may promote the early development of peritoneal metastasis in diffuse-type GC and proposed a possible mechanism.
The exact molecular mechanisms contributing to the different susceptibility to peritoneal metastasis between Lauren subtypes remain unclear (Bao et al., 2019;Yu et al., 2019). LncRNA HAND2-AS1 was shown to play a tumour-suppressive role in many cancers, such as osteosarcoma, colorectal cancer, lung cancer, leukaemia, oesophageal cancer, endometrial cancer, high-grade serous ovarian carcinoma, and ovarian cancer (Gokulnath et al., 2020;Shi et al., 2020). HAND2-AS1 has been infrequently reported in GC, and its molecular function in the different Lauren types remains unclear (Li et al., 2020a). In gastric adenocarcinoma cells, HAND2-AS1 could act as a tumour-suppressive factor by inhibiting cell proliferation, migration, and invasion (Xu et al., 2020;Yu et al., 2020). Most scholars believe that the adenocarcinoma types in the WHO classification mostly correspond to the intestinal type in the Lauren classification. Kawamura et al. (2001) found a difference in the expression level of glucose transporter 1 (GLUT1) between diffuse-type GC and intestinal-type GC. Chen et al. (2019) found that HAND2-AS1 may inhibit the proliferation of osteosarcoma cells by regulating GLUT1 expression. The STK11-PRKAA2-ULK1 signalling pathway is also involved in increased migration and cell survival in gastric adenocarcinoma cells (Rao et al., 2017). PRKAA2 is the gene that encodes the α subunit of AMPK. AMPK was found to act as a tumour-suppressive factor in GC cells by suppressing glucose metabolism, enhancing apoptosis, and reducing cell proliferation (Chang et al., 2016;Li et al., 2013;Li et al., 2018;Li et al., 2016b). In addition, alterations in AMPK can affect VLDLR expression (Zenimaru et al., 2008). VLDLR increases epithelial proliferation and maintains angiogenesis (Oganesian et al., 2008;Rebustini et al., 2012).
According to the results of this study and previous reports, we speculated that the three genes, namely HAND2-AS1, PRKAA2, and VLDLR, may lead to early peritoneal metastasis in patients with diffuse-type GC by modulating glucose-lipid metabolism.
In clinical practice, after Lauren classification, the risk of peritoneal metastasis within 1 year could be predicted based on the immunohistochemical results for the three genes. Additionally, for diffuse-type patients with positive expression, surveillance for peritoneal metastasis should be intensified, and clinicians could thus develop personalised treatment and postoperative follow-up strategies for these patients.
The lack of validation of the expression of the three genes in pathological specimens is a key limitation of our study, in that it allows only the generation of a hypothesis concerning the role of the three genes in peritoneal metastasis of diffuse-type GC. Based on previous literature and the results presented here, we envisage that these three genes may mediate the early development of peritoneal metastasis in diffuse-type GC through glucose metabolism-related pathways. These results open further perspectives, and experimental validation in further studies is needed to confirm them and to explore the specific roles of these three genes. In addition, the low incidence of peritoneal metastasis and the small sample size based on online data are other limitations of this study. We look forward to further verifying our results with a larger sample size. Despite these limitations, we believe the results are robust and can be extended to a larger patient population.
For each of these three genes, prognosis differed significantly between the high- and low-expression groups. These three genes may serve as biomarkers to predict the risk of peritoneal metastasis and thereby guide the choice of chemotherapy regimen. This is of great clinical significance.

[Figure caption fragment] (d) between high (left side) and low (right side) expression levels and 1-year psDFS measured by STEPP in diffuse-type and intestinal-type GC patients after propensity score matching. P < 0.05 was considered statistically significant.