Prognostic tumor microenvironment gene and the relationship with immune infiltration characteristics in metastatic breast cancer
1Department of Medical Oncology, The First Affiliated Hospital of Hainan Medical University, Haikou, 570102, China
2Clinical Laboratory Center, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, 230031, China
3Department of Clinical Laboratory, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830054, China
*Address correspondence to: Changcheng Yang, firstname.lastname@example.org; Yanda Lu, email@example.com
#These authors contributed equally to this work
Received: 07 July 2021; Accepted: 15 September 2021
Abstract: The aim of this study was to reveal genes associated with breast cancer metastasis, to investigate their intrinsic relationship with immune cell infiltration in the tumor microenvironment, and to screen for prognostic biomarkers. Gene expression data of breast cancer patients and their metastases were downloaded from the GEO, TCGA database. R language package was used to screen for differentially expressed genes, enrichment analysis of genes, PPI network construction, and also to elucidate key genes for diagnostic and prognostic survival. Spearman’s r correlation was used to analyze the correlation between key genes and infiltrating immune cells. We screened 25 hub genes, FN1, CLEC5A, ATP8B4, TLR7, LY86, PTGER3 and other genes were differentially expressed in cancer and paraneoplastic tissues. However, patients with higher expression of CD1C, IL-18 breast cancer had a better prognosis in the 10 years survival period, while patients with high expression of FN1, EIF4EBP1 tumors had a worse prognosis. In addition, TP53 and HIF1 genes are closely related to the signaling pathway of breast cancer metastasis. In this study, gene expression of ATP8B4 and CD1C were correlated with cancer tissue infiltration of CD8+ T lymphocytes, while GSE43816, GSE62327 and TCGA databases showed that CD8+ T lymphocytes were closely associated with breast cancer progression. Functional enrichment analysis of genes based on expression differences yielded key genes of prognostic value in the breast cancer microenvironment.
Keywords: Tumor microenvironment; Immune infiltration; Prognostic biomarker; Metastatic breast cancer
Breast cancer is the most common malignancy in women worldwide and the most common cancer surpassed lung cancer (Sung et al., 2021). Currently, with the increasing life expectancy of breast cancer and the rapid development of tumor diagnostic technology, breast cancer can be detected early, which lead to its increasing incidence (Sung et al., 2021). In addition, great progress has been made in the comprehensive treatment of breast cancer, such as targeted anti-HER2 therapy and endocrine therapy, but the prognosis of patients is still not very optimistic (Qiu et al., 2021; Yuan et al., 2020). Since metastasis of breast cancer seriously affects the prognosis of patients, we still do not fully understand the underlying molecular mechanisms of its development. Therefore, it is of great importance to explore the underlying molecular mechanisms of breast cancer development by identifying new diagnostic and prognostic biomarkers and potential therapeutic targets.
Currently, numerous studies have confirmed that tumorigenesis, progression and prognosis are closely related to tumor tissue gene expression (Li et al., 2021b; Lin et al., 2017; Zhang et al., 2019). The genetic expression features of the tumor microenvironment (TME) play an important impact in predicting the clinical outcome of tumor treatment (Llanos et al., 2021; Yang et al., 2021). The TME is mainly composed of cells and extracellular matrix. The cells in TME include lymphocytes, tumor-associated macrophages, tumor-associated fibroblasts, mesenchymal cells, endothelial cells, and other cells in addition to tumor cells. Immune cells and stromal cells are the two main non-tumor cells that are important for tumor diagnosis and assessment of tumor prognosis. Moreover, studies had found that altered gene expression in tumor tissue might lead to changes in non-tumor cell infiltration. For example, SFRP4 positively correlates with infiltration of FOXP3+ Treg cells in hepatocellular carcinoma, and downregulation of SFRP4 in tumor cells affects T cell recruitment (Yang et al., 2019b). It has also been found that high expression of PRC1 was associated with low macrophage infiltration in HCC tissues, which in turn lead to poor prognosis (Zhou et al., 2020a). Studies have reported genes closely related to breast cancer such as BRCA1, BRCA2, ATM, CHEK2, etc. (Dorling et al., 2021). These genes can be used not only for cancer diagnosis, but also seriously affect the development, metastasis, and treatment outcome of breast cancer. Studies have confirmed that BRCA gene mutations may not directly drive immune infiltration, but the number of infiltrating lymphocytes and monocytes is significantly higher in breast cancer tissues with decreased BRCA expression (Kubli et al., 2019; Solinas et al., 2019). However, understanding of the key genes and their associated immune cell infiltration significantly associated with breast cancer prognosis is still very limited.
In this study, we first obtain differentially expressed gene signatures from the Gene Expression Omnibus (GSE) datasets and The Cancer Genome Atlas (TCGA) to evaluate genes associated with breast cancer metastasis. Then, functional analyses such as diagnosis and prediction of prognosis of disease were performed for important genes. In addition, we evaluate the relationship between relevant genes and infiltrating immune cells to provide an immunological basis for the tumor microenvironment and to reveal potential immunotherapeutic approaches.
Materials and Methods
Breast cancer datasets acquisition and processing
The workflow of analysis in our study is illustrated in Fig. S1. Gene expression datasets and corresponding clinical information of breast cancer patients were collected from the dataset of GEO (Barrett et al., 2013) and the TCGA dataset included 1124 patients, and those microarray data without clinical information were discarded. In our study, 2 GEO datasets contained GSE62327 (Triulzi et al., 2015) (7 cases in the pre-radiation group and 7 cases in the post-radiation group) and GSE43816 (Gruosso et al., 2016) (14 cases in the disease group and 10 cases in the control group) were downloaded (https://www.ncbi.nlm.nih.gov/geo/) (Tables S1–S3). The primary data for the GSE62327 generated by Affymetrix Human Gene 1.1 ST Array [transcript (gene) version]). And the primary data for the GSE43816 generated by Illumina HumanHT-12 WG-DASL V4.0 R2 expression beadchip. The Cancer Genome Atlas (TCGA) data referenced in this study were available in a public repository from TCGA website (https://portal.gdc.cancer.gov/).
Differential gene expression analysis
The differential expression mRNAs were analyzed according to breast cancer and normal groups using the limma package (Ritchie et al., 2015). Log fold change (logFC) > 1.0 and adjusted P-value < 0.05 were set as the threshold for differential genes. The results of differential analysis of mRNA will be presented using volcano plot (ggplot2) and heat map (pheatmap) (https://CRAN.R-project.org/package=pheatmap). Where the volcano plot is plotted with ggplot2 and the heatmap is plotted with the pheatmap package.
The metascape (http://metascape.org) (Zhou et al., 2019) database was used to analyze the gene ontology (GO) enrichment analysis of the list of differentially expressed mRNA target genes after intersection, mainly including biological processes (BP), cellular component (CC), molecular function (MF); the parameters were set P < 0.01, minimum count >3, enrichment factor >1.5. The GSEA method (Subramanian et al., 2005) enriched differential mRNAs for expression-related pathways, and the groups were aligned 10,000 times for each analysis. The KEGG pathways dataset from curated gene sets was selected as the reference set. The threshold for statistically significant GSEA analysis was set at a corrected P-value < 0.05 and FDR < 0.25. The results of the enrichment analysis will be characterized by the corrected P-value as well as the NES. GSEA enrichment analysis and visualization were performed using GSEA native software.
PPI network construction
The STRING database (http://string-db.org, version 11) (Szklarczyk et al., 2019) was used to construct the protein–protein interaction network. All the differential mRNA intersections after the associated differential molecules were put into the database for analysis, and the interaction threshold was set to 0.4. Cytoscape software (Shannon et al., 2003) was used to visualize the molecular interaction network. a plug-in for Cytoscape software: CytoHubba (Chin et al., 2014) was used to analyze the hub gene in the network.
Immune infiltration analysis
We extracted marker gene of 24 immune cells and analyzed the infiltration of 24 immune cells within the tumor using the EPIC method (Racle et al., 2017), and analyzed the degree of correlation of the HUB gene and the expression matrix with these 24 cells using the Spearman correlation method.
Most of the statistical analyses in this study were done by the bioinformatics tools mentioned above. Other statistical analyses such as ROC curve analysis, survival curve analysis, etc. were performed by the R language package with version 3.4.1. The different expression levels of the genes were analyzed by two-tailed Student’s t-test and the GO terms and KEGG signaling pathway were done by Fisher’s test. Spearman’s r correlation analysis was used to analyze the correlation of non-normally distributed data. P-values less than 0.05 were considered statistically significant.
Screening of differentially gene expression changes in breast cancer
According to appropriate criteria in GEO database, two genome-wide gene expression datasets (GSE43816 and GSE62327) on the chemotherapy of breast cancer were selected and analyzed using the R project limma package with a threshold of adjust P-value < 0.01 and |log2 FC| > 3.0. In GSE43816 datasets, a total of 5190 mRNAs were screened out, of which 3166 up-regulated mRNA and 2024 down-regulated mRNA were identified (Figs. 1A and 1B). In GSE62327 datasets, there were 2938 up-regulated mRNA and 703 down-regulated mRNA in the comparison of chemotherapy versus non-chemotherapy of breast cancer (Figs. 1C and 1D). To clarify the genetic changes between metastatic and non-metastatic breast cancer, we compared TCGA data to make new discoveries. As shown in the volcano plot, the data from TCGA were analyzed and 7566 differentially expressed mRNAs (including 3759 were upregulated and 3807 were downregulated) were identified (Figs. 1E and 1F). When we intersected the GEO chemotherapy dataset and the dataset of breast cancer survival status, there were 81 differentially expressed genes that were upregulated (Fig. 2A). When we take the dataset of TCGA chemotherapy and metastasis and the GSE dataset to intersect, there were totally 74 upregulated mRNAs that were commonly found in four datasets (Fig. 2B).
Functional of pathway enrichment analysis of differential genes
In order to explore the biological functions of different genes, we analyzed and visualized the enrichment of 79 differential genes in the GO pathway (Figs. 3A–3C, Table S4); the enrichment results were mainly presented in the three directions of BP, MF, and CC of the GO pathway; the enrichment results revealed that the differential genes were mainly involved in biological processes such as reproductive structure development, epithelial to mesenchymal transition, cellular aldehyde metabolic process and extracellular matrix disassembly. We also found that it is closely related to the hypodifferentiation process: epithelial cell differentiation, which occurs mainly in the cellular matrix: extracellular matrix and is closely related to molecular functions such as ion binding. In summary, differential genes are mainly involved in the metastatic process and can promote epithelial to mesenchymal transition of cells. The results of the KEGG pathway enrichment analysis of differential genes show that differential genes are mainly associated with calcium signaling pathway, circadian entrainment, inflammatory mediator regulation of trp channels, pathways in cancer, neuroactive ligand-receptor interaction, and other biological processes (Figs. 3D–3F, Table S5).
Enrichment analysis of grouped GSEA based on tumor metastasis
Based on the metastasis or not and grouping of TCGA samples, we obtained positive correlation pathway enrichment (Figs. 4A–4C, Table S6) and found that the grouping of high metastasis was more likely to be involved in biological processes of cell metastasis-related and chemotherapy-tolerant signaling pathways such as cell development, cell differentiation, regulation of cilium movement, cilium movement, response to iron ion, response to X-ray, myelin assembly. Similarly, we also obtained the negative pathway enrichment pathway (Figs. 4D–4F, Table S6) and found that the low metastasis subgroup was more likely to participate in natural killer cell mediated immunity, leukotriene biosynthetic process, positive regulation of BMP signaling pathway, response to oxygen radical leukotriene metabolic process, monocyte differentiation and other biological processes.
PPI network interactions and top HUB genes
In order to understand the interaction between different genes in the three database intersections, we obtained the interaction network map of tumor metastasis-related genes by PPI networks (Fig. 5A). We then performed the identification of the core molecules of the network graph (Fig. 5B) and finally obtained the top 25 HUB genes (Figs. 5B–5D, Table S7). These 25 HUB genes are IL-18, CD1C, PTGER3, TP53, TSC2, STK11, NPY, BCL1, TLR7, CD80, PRTOR, C3AR1, MAPK, FN1, LY86, CCL2, BAX, BID, TNFSF10, FADD, CLEC5A, PARP1, ATP8B4, TMEM30A, EIF4EBP1. Then we selected these hub different genes for following analysis.
Analysis of expression levels based on key genes and survival analysis in TCGA breast cancers
To further confirm the expression of hub genes in breast cancer, we examined the expression of 25 hub genes in TCGA database and subjected these genes to correlation analysis of TCGA expression in breast cancer, we found that in the screened differential genes TLR7, LY86, CLEC5A, CXCR4, PARP1, FN1 were expressed at lower levels in breast cancer cancer than in the paracancer (P < 0.05, Figs. 6I, 6K, 6O, 6Q, 6R, and 6S). On the other hand, PTGER3, ATP8B4 was expressed at higher levels in cancer than in the paracancer (P < 0.05, Figs. 6N and 6X), and then we subjected the related genes to survival analysis to determine the prognosis. Overall survival (OS) was performed to obtain risk coefficients and determine the prognosis for the relevant target genes. During the 10 years survival period, we found that lower expression of FN1, EIF4EBP1, TP53 had lower risk coefficients and were positively associated with good prognosis (P < 0.05, Figs. 7B, 7C and 7Q). However, high expression of CD1C, PTGER3, IL18 gene had lower risk factors and longer overall survival (P < 0.05, Figs. 7A, 7I, and 7L).
Genes with differential expression levels after ROC diagnostic prediction analysis
In order to further clarify the clinical diagnostic value of genes with significant differential expression, we subjected these genes to ROC predictive diagnostic analysis and found that FN1, LY86, IL-18, CD1C, C3AR1, TLR7, and TMEM30A genes were genetic indicators with significant predictive diagnoses, and their diagnostic ROC curves had areas under the curves (AUC) of 0.968, 0.966, 0.897, 0.874, 0.846, 0.812, and 0.781, respectively (Figs. 8A and 8B).
Screening of key transcription factors for key genes and analysis of interplay networks and functions
In order to further screen the target transcription factors of key genes associated with metastasis, in this study, we took two key genes, TP53 and HIF1A, as examples, and obtained the transcription factors and their interplay networks that had binding to multiple differential genes through network interactions (Fig. 9A). Based on the network interactions between TP53 and HIF1A and the functional analysis of the pathway (Figs. 9B and 9C), we found that TP53 and HIF1A are mainly involved in cell-cell adhesion, focal adhesion, cell adhesion molecules cams and cell cycle. In addition, the GSEA pathway enrichment analysis of the target gene TP53 and HIF1A interaction network revealed that these two classical genes were mainly enriched to cell cycle, cell adhesion molecules after low expression (Figs. 9D and 9E).
Risk prediction of key metastasis-associated genes
To further identify risk factors associated with disease progression, we performed univariate and multivariate COX regression analysis and created forest plots for prediction and analysis based on metastasis-related genes and clinicopathological data, and found that Event (non-metastatic breast cancer had a low risk factor = 0.096, P < 0.001) (Fig. 10A), CD1C expression for predicting breast cancer recurrence and metastasis and OS (risk factor = 1.08719, P = 0.01392) (Fig. 10A), BCL1 expression for breast cancer recurrence and metastasis and OS (risk factor 1.129, P = 0.018) (Fig. 10A). However, although the risk coefficients of genes such as TLR7, CCR1, HRH4, HTR1D, P2RY13, FN1 were high, these genes were not statistically significant for predicting breast cancer (P > 0.05, Figs. 10B–10E).
Analysis of immune infiltration assessment of key genes
Finally, we performed immune infiltration assessment analysis of the screened related genes in the breast cancer TCGA database to assess their correlation analysis with CD8+ T cells. In the immune infiltration assessment analysis of this study, we found that ATP8B4 showed a negative correlation with immune CD8+ T-cell infiltration with an R-value of −0.134 (P < 0.01, Fig. 11A). In addition, FN1 also showed a significant negative correlation with the proportion of immune CD8+ T-cell infiltration, with an R-value = −0.368 (P < 0.01, Fig. 11E). However, as shown in the figure, TLR7, LY86, IL18, and CD1C showed positive correlation with the infiltration proportion of CD8+ T cells with R-values of 0.347, 0.257, 0.451, and 0.253, respectively, and their P-values were all less than 0.01 (Figs. 11B–11D and 11F).
Evaluation of immune cell infiltration from TCGA, GEO sample data
To further clarify the immune status in breast cancer tissues, samples from TCGA and GEO were evaluated for immune cell infiltration and mesenchymal cell infiltration, resulting in the infiltration of immune cells CD4+, CD8+ T cells and other mesenchymal cells in the relevant datasets, as well as the proportion of each immune cell. As shown by the immune infiltration levels of metastatic and non-metastatic samples of breast cancer in the TCGA dataset, cancer-associated fibroblasts, vascular endothelial cells, and CD8+ T cells are strongly associated with breast cancer metastasis (Figs. 12A–12C); In addition, the levels of CD8+ T cells, B cells, and NK cells were higher in the group of chemotherapy without metastasis than in the group of chemotherapy with metastasis and the group of non-chemotherapy with metastasis, while the levels of CD8+ T cells, NK cells in the levels in the group of non-chemotherapy without metastasis were higher than those in the group of non-chemotherapy with metastasis (Fig. S2A). The immune infiltration levels of the GSE43816 dataset suggest that vascular endothelial cells, CD4+ T cells, CD8+ T cells, cancer-associated fibroblasts are closely related to breast cancer progression (Figs. 12D–12F). Moreover, the expression levels of CD8+ T cells, B cells, CD4+ T cells, and NK cells were higher in the chemotherapy group than in the non-chemotherapy group (Fig. S2B). The GSE62327 dataset on the level of immune infiltration in breast cancer tissues suggested that cancer-associated fibroblasts, vascular endothelial cells, and CD8+ T cells were strongly associated with breast cancer progression (Figs. 12G–12I). Furthermore, the expression levels of CD8+ T cells, B cells, and NK cells were higher in the chemotherapy-sensitive group of breast cancer than in the chemoresistant group (Fig. S2C).
With the rapid development of genetic testing technologies (e.g., microarray technology, sequencing technology, etc.) and the establishment of genetic databases, genes associated with cancer are increasingly analyzed and validated as new targets (Wang et al., 2016). However, most of the current studies do not correlate oncogenes with immune cells infiltrated in TME, which may affect the immunotherapeutic response to cancer. In this study, we sought to explore the gene expression associated with chemotherapy and metastasis of breast cancer, to obtain important genes with prognostic and diagnostic value, and to analyze their associated signaling pathways to understand the progression of breast cancer patients. By comparing the transcriptional expression profiles of breast cancer patients and metastasis groups, a total of 81 genes involved in chemotherapy-related breast cancer gene expression and 79 genes associated with breast cancer metastasis were identified. These genes were mainly involved in reproductive structure development, epithelial to mesenchymal transition, cellular aldehyde metabolic process, extracellular matrix disassembly, epithelial cell differentiation and other biological processes. Moreover, the main genes associated with metastasis were CXCR4, ERBB2, EIF4EBP1, TSC2, MAPK1, CCL2, RPTOR, BCL2, STK11, TP53, PARP1, FADD, TNFSF10, BID, BAX and so on. However, FN1, CLEC5A, ATP8B4, TLR7, LY86 PTGER3 expression significantly predicted the overall survival of breast cancer patients. Subsequently, the expression of genes including CD1C, TLR7, FN1, IL18, C3AR1, LY86 were included in the ROC curve analysis of breast cancer diagnosis and the associated risk prediction analysis. Importantly, ATP8B4 and FN1 expression were significantly associated with CD8+ T immune cell infiltration by Spearman’s correlation analysis.
Chemotherapy is still the main strategy for the treatment of cancer. However, studies have confirmed that chemotherapy may promote cancer metastasis and tumor metastasis is an important factor contributing to poor tumor prognosis (Karagiannis et al., 2017; Keklikoglou et al., 2019). In this study, by analyzing GSE43816, GSE62327, and TCGA datasets, we identified some differential genes associated with breast cancer metastasis, and the enrichment results showed that these genes are closely related to cell stromal remodeling, epithelial mesenchymal transition, intracellular metabolism, and hypofractionation process of cells. Several studies have confirmed that EMT not only confers the ability of tumor cells to invade and metastasize, but also is a major factor affecting the efficacy of chemotherapy (Li and Kang, 2016). Furthermore, epithelial mesenchymal transformation affect changes in internal cellular metabolism, involving cytoskeletal alterations and matrix remodeling, and further affects cell differentiation (Ijaz et al., 2014; Tsubakihara and Moustakas, 2018). Currently, the main signaling pathways associated with EMT are TGF-β signaling pathway, Wnt signaling pathway and Notch signaling pathway (Kyuno et al., 2021). In the present study, the main signaling pathways involved in the differential genes were Calcium signaling pathway, NOD-like receptor signaling pathway and mTOR signaling pathway, which were also found to affect EMT directly or indirectly in previous studies (Figiel et al., 2019; Hu et al., 2018; Luo et al., 2021). Our study also found that high metastasis of breast cancer is closely related to biological processes such as cell development, cell differentiation, and regulation of ciliary motility, while in low metastasis natural killer cell-mediated immunity, leukotriene biosynthesis process, positive regulation of BMP signaling pathway, leukotriene metabolic process in response to oxygen free radicals, and monocyte differentiation. Thus, by searching for key genes associated with altered metastatic ability after chemotherapy resistance in breast cancer, these genes can be studied not only as targets for future new drugs, but also as relevant biomarkers to reflect the effect of cancer chemotherapy. It is important to note that additional prospective studies are needed as well to further validate these findings due to sample size limitations in the dataset.
Next, we performed overall survival analysis and ROC curve analysis on some of the screened HUB genes. Firstly, we found that the expression levels of FN1, CLEC5A, ATP8B4, TLR7, and LY86 genes were higher in breast cancer tissues than in paracancerous tissues, while the expression levels of PTGER3 genes were lower in breast cancer tissues than in paracancerous tissues. Consistent with previous studies, in addition to ATP8B4 and LY86, FN1, CLEC5A and TLR7 were found to be highly expressed in breast cancer cells or tissues by clinical trial or bioinformatics studies, but PTGER3 expression was reduced in breast cancer, and the expression levels of these genes affects the prognosis of breast cancer (Bao et al., 2019; Semmlinger et al., 2018; Shi et al., 2020; Zhang et al., 2021; Zhang et al., 2020). Moreover, these genes play an important role in the progression of several other cancers, for example, FN1 and CLEC5A have been reported to promote proliferation, invasion and metastasis of gastric cancer and glioblastoma (Fan et al., 2019; Han et al., 2020; Wang et al., 2020; Yang et al., 2019a). In line with other findings (Inoue et al., 2019; Milioli et al., 2017; Saatci et al., 2020), we also found a significant positive correlation between low expression of IL18 genes and good prognosis of breast cancer through survival curves, while high expression of FN1 and C3AR1 genes implied poor disease prognosis. Finally, we verified that some genes (e.g., FN1, LY86, IL-18, CD1C, C3AR1, TLR7, and TMEM30A) have good diagnostic efficacy in predicting the development of breast cancer by ROC curves, and all of these genes may be used as potential biomarkers.
To further confirm the relationship between key genes and breast cancer metastasis, in the present study TP53 and HIF1A genes were used as examples, and we found that TP53 and HIF1A are mainly involved in the biology of cell-cell adhesion, cell adhesion molecules cams and cell cycle processes. Many studies have confirmed that TP53 and HIF1A genes are closely related to the metastasis of tumors (Li et al., 2021a; Tang et al., 2021; Tiwari et al., 2020). In our study, CD1C and ATP8B4 were closely associated with recurrence, metastasis, and prognosis of breast cancer. Curiously, we found that the ATP8B4 gene was negatively correlated with CD8+ T lymphocyte infiltration in breast cancer tissues, whereas CD1C was positively correlated with CD8+ T lymphocyte infiltration in cancer tissues. In addition, we also found that genes such as TLR7, LY86, and IL18 were positively correlated with the proportion of CD8+ T lymphocytes infiltrating in cancerous tissues. Recent studies have shown that IL-18 and its receptor can activate specific high expression of CD8+ T cells, suggesting a correlation between IL-18 not only with CD8+ T cells, but also its pathway to tumor-killing effector cell function (Zhou et al., 2020b). And the results by all three databases, GSE43816, GSE62327, and TCGA, showed a close relationship between CD8+ T cells and the progression of breast cancer. However, whether CD1C and ATP8B4 are abundantly expressed in many malignancies, are associated with migration of regulatory T cells, and their role in tumor-associated immunosuppression still need further experimental study.
In our study, we extracted genes of prognostic value from the GEO and TCGA databases based on functional enrichment analysis of breast cancer metastasis. These genes can be used to predict the occurrence and prognosis of breast cancer patients and show their potential to be biomarkers for breast cancer diagnosis as well as prognosis. In addition, further study of the relevant genes will help to understand the immune status of the tumor microenvironment and explore potential therapeutic targets. However, there are still some limitations of this study. First, in order to provide a comprehensive picture of the factors and effects affecting the microenvironmental phenotype of breast cancer, additional clinical features of breast cancer patients should be included in subgroup analysis. Second, the small sample size in the dataset may lead to biased results or high false positive rates, and the sample size will be appropriately increased in future studies on the immune microenvironment of breast cancer. Third, the data obtained such as altered genes and associated enriched signaling pathways were retrieved from public databases, and experiments are needed to validate these findings. Fourth, up- and down-regulation of genes may not be the real cause of the altered tumor microenvironment (e.g., lymphocyte infiltration) and breast cancer metastasis, so further mechanistic studies of these genes are needed.
Acknowledgement: We thank the TCGA and GEO databases for providing the platform and contributors for uploading their meaningful datasets.
Availability of Data and Materials: The datasets analyzed in the current study are available in the GEO (https://www.ncbi.nlm.nih.gov/geo/) and TCGA database (https://portal.gdc.cancer. gov/).
Ethics Approval: TCGA and GEO both are public databases. The patients involved in the database have obtained ethical approval in the original studies. Users can download relevant data for research and publish relevant articles. Our study is based on these open source data, so there are no ethical issues.
Authors’ Contribution: The authors confirm contribution to the paper as follows: study conception and design: Changcheng Yang and Yanda Lu; data collection: Lu Yang, Fen Huang, Yang Wen and Boke Zhang; analysis and interpretation of results: Lu Yang, Yun Liu, Boke Zhang, Jiangzheng Zeng and Mengsi Yu; draft manuscript preparation: Lu Yang, Boke Zhang and Changcheng Yang. All authors reviewed the results and approved the final version of the manuscript.
Funding Statement: This work was supported by Hainan Provincial Natural Science Foundation of China (No. 820RC765).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
|This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.|