Identification of immunotherapy-related subtypes, characterization of tumor microenvironment infiltration, and development of a prognostic signature in gastric carcinoma

Background: Recent advances in immunotherapy have elicited a considerable amount of attention as viable therapeutic options for several cancer types, the present study aimed to explore the immunotherapy-related genes (IRGs) and develop a prognostic risk signature in gastric carcinoma (GC) based on these genes. Methods: IRGs were identified by comparing immunotherapy responders and non-responders in GC. Then, GC patients were divided into distinct subtypes by unsupervised clustering method based on IRGs, and the differences in immune characteristics and prognostic stratification between these subtypes were analyzed. An immunotherapy-related risk score (IRRS) signature was developed and validated for risk classification and prognosis prediction based on The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) cohorts. Besides, the predictive ability of the IRRS in immunotherapy response was also determined. Results: A total of 63 IRGs were identified, and 371 GC patients were stratified into two molecular subgroups with significantly different prognosis and immune characteristics. Then, an IRRS signature comprised of three IRGs (CENP8, NRP1, and SERPINE1) was constructed to predict the prognosis of GC patients in TCGA cohort. Importantly, external validation in multiple GEO cohorts further confirmed the universal applicability of the IRRS in distinct populations. Furthermore, we found that the IRRS was significantly correlated with patient’s responsiveness to immunotherapy, GC patients with low IRRS are more likely to benefit from existing immunotherapy. Conclusions: The risk score could serve as a robust prognostic biomarker, provide therapeutic benefits for immunotherapy and may be helpful for clinical decision making in GC patients.


INTRODUCTION
Gastric carcinoma (GC) is one of the most frequently diagnosed digestive system malignant tumor that ranks fifth for morbidity and fourth for tumor-related mortality around the world [1].The prognosis of GC patients is closely correlated with the pathological stage of the tumor, the 5-year survival rate for patients diagnosed with advanced pathological stage or distant metastatic carcinoma declines to approximately 5% because of missed the opportunity to be subjected to surgical treatment, and the median survival of those who did not receive adjuvant therapy is less than 1 year [2,3].However, a proportion of GC patients are diagnosed with advanced stage when first visit, which is responsible for the poor prognosis [4].Besides, some patients will suffer relapse from micro-metastatic lesions that have disseminated at the time of surgery.More disconcertingly, because the heterogeneity in individuals is great, only a subset of GC patients is sensitive to non-surgical management, including chemotherapy, radiotherapy, and targeted therapy, and many who initially respond develop resistance over time [5].The traditional risk classification is mainly dependent on the tumor, lymph node, metastasis (TNM) staging system, which ignores the great heterogeneity of the primary tumor.Thus, exploring novel biomarkers to accurately forecast the progression and prognosis of GC is urgently needed.
Nowadays, immunotherapeutic techniques have achieved great success as anti-cancer therapy for many types of malignant tumors through reactivating host immunity against tumor cells [6].It is widely accepted that tumor microenvironment (TME) act as a decisive role in initiation and progression of malignant tumors.Immune dysfunction of the host significantly impaired the body's anti-tumor immunological surveillance, which lead to the development of cancer [7].Among the distinct developed immunotherapeutic strategies, immune checkpoint inhibitors (ICIs) targeting programmed death 1 (PD-1), programmed death ligand 1 (PD-L1) and cytotoxic T lymphocyte-associated antigen-4 (CTLA-4) have shown promising and durable clinical responses in several solid tumors [8].PD-1 is a common immunosuppressive member on the membrane of immune cell, especially the activated T cell.PD-L1 is the main ligand of PD-1, which physiologically inhibit excessive immune responses by activating PD-1 to prevent normal tissue from damage [9].However, this protective mechanism is disrupted in cancer-immunity cycle, where the overexpressed PD-L1 binds to PD-1, inhibits the activation of T cell including through induction of T cell apoptosis, thereby inhibiting the anti-cancer immunity [10].Immunotherapy blocking the PD-1/PD-L1 regulation axis can efficaciously inhibit its tumor-promoting activity, and the inhibitors of PD-1/PD-L1 have shown clinical efficacy in many tumors over the past decade.For example, pembrolizumab, a humanized IgG4 kappa anti-PD-1 antibody, was approved by the Food and Drug Administration (FDA) for treating metastatic melanoma and advanced urothelial carcinoma [11].Anti-PD-L1 antibodies, atezolizumab and durvalumab, have been approved for treating non-small-cell lung cancer as first-line/secondline treatment strategies [12].In terms of GC, recent advance in understanding the TME of cancer has significantly facilitated the development of immunotherapy for advanced GC.National Comprehensive Cancer Network (NCCN) guidelines recommend pembrolizumab plus trastuzumab for first-line treatment of HER-2 positive metastatic GC based on the results of KEYNOTE-811 phase III trial [13].In addition, immunotherapy has been included in the first-line/ second-line treatment of GC in 2021 by the Chinese Society of Clinical Oncology (CSCO) [14].On the contrary, a phase III trial (KEYNOTE 062) found that chemotherapy combined with ICIs cannot improve the overall survival (OS) of patients with advanced GC compared with chemotherapy alone [15].Currently, biomarkers used to identify anti-PD-1/PD-L1 therapy responders mainly include microsatellite instability and PD-L1 expression, with only a subset of patients showing clinical responses.Moreover, many who initially respond develop resistance over time.Therefore, the development of novel molecular biomarkers to forecast clinical responses and identify suitable patients who benefit from anti-cancer immunotherapy is of great clinical significance.
The classification of patients by applying bioinformatic analysis based on next-generation sequencing is a novel strategy that can identify the immune characteristics and predict the prognosis of tumors, and some studies have been launched to acquire a better understanding of the TME and immunotherapeutic responsiveness of GC.However, there has been no appropriate research that construct a prognostic model to predict the survival outcomes and immune characteristics of GC based on immunotherapy-related genes (IRGs).Besides, clinical application of previously identified signatures is limited due to lack of validation with external dataset.In this study, we first identified IRGs in GC through analyzing immunotherapy cohort.Subsequently, we used unsupervised clustering method to identify immunotherapyrelated subtypes and compared the differences in TME and immunotherapy response between the two clusters.Finally, a three-gene signature for GC was developed to predict prognosis and immunotherapy responsiveness.Importantly, its predictive abilities for prognosis and treatment efficiency of immunotherapy were evaluated in distinct external validation cohorts.

Data acquisition
The immunotherapy cohort of gastric carcinoma (PRJEB25780) was obtained from the Tumor Immune Dysfunction and Exclusion (TIDE, http://tide.dfci.harvard.edu/)database to identify immunotherapyrelated genes [16,17].The RNA-sequencing profiles and corresponding clinicopathological information of GC were obtained from The Cancer Genome Atlas (TCGA, https://portal.gdc.cancer.gov/)[18].Detailed information of somatic mutation and copy number variation (CNV) data files were retrieved for further analysis.The immunophenoscore (IPS) of GC samples was obtained from the Stomach Adenocarcinoma (STAD) project of The Cancer Immunome Atlas (TCIA, https://tcia.at/)[19].Four cohorts (GSE15459 [20], GSE84437 [21], GSE62254 [22], and GSE26253 AGING [23]) were collected from Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) to validate the predictive efficiency of the risk score signature in prognosis prediction [24].IMvigor210 cohort includes RNA-sequencing profiles and detailed clinical parameters of urothelial carcinoma patients who received anti-PD-L1 therapy [25].GSE78220 is an immunotherapy cohort where patients with melanoma were treated with anti-PD-1 agents [26].We collected IMvigor210 and GSE78220 cohorts for evaluating the predictive efficiency of the risk score signature in immunotherapy response.The raw count data obtained from GEO database were normalized by applying the "limma" package in R [27].

Identification of IRGs in GC
The IRGs were defined as genes that were differentially expressed between immunotherapeutic responder and non-responder groups.In our study, patients with GC in PRJEB25780 cohort were divided into responder and non-responder groups according to their responsiveness to anti-PD-1 immunotherapy.Then, the "limma" package in R was applied to screen out IRGs, the cutoff conditions were set to adjusted P-value < 0.05 and absolute value of log2 fold change (log2 FC) ≥ 0.585.Besides, we performed Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis to further clarify the potential functional annotation of the IRGs based on the Database for Annotation, Visualization and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov/)database [28].Somatic mutations and CNV are important events leading to genetic alterations.Thus, we first analyzed the overall genetic alteration atlas of the IRGs based on cBioPortal (https://www.cbioportal.org)platform [29].Then, the detailed mutation and CNV information of the IRGs were analyzed.

Consensus clustering analysis of IRGs
The "ConsensusClusterPlus" package with 1000 repetitions was used for performing unsupervised clustering analysis to identify distinct molecular subtypes in the TCGA cohort based on IRGs expression [30].Then, differences in prognosis and clinicopathological parameters between subgroups were analyzed.

Correlations between the subtypes and TME infiltration
The "CIBERSORT" algorithm was used to quantify the infiltrated abundances of 22 immunocytes for each GC patient with TCGA expression data [31].The "ESTIMATE" algorithm was used to evaluate the purity of tumors from different clusters [32].We compared the differences in immunocyte and immune score between different clusters to verify the immune characteristics of the IRG clusters.In addition, we applied a series of indexes which was usually used for predicting the response to immunotherapy, including tumor mutation burden (TMB), immune checkpoint biomarkers (ICBs), TIDE score, microsatellite instability and IPS, to assess the correlation between the subtypes and the effect of immunotherapy.
Creating and confirming the predictive risk score signature Differentially expressed genes (DEGs) between the immunotherapy-related subgroups were identified using the "limma" package in R, the threshold was set as adjusted P-value < 0.05 and absolute value of log2 FC ≥ 0.585.Similarly, gene ontology (GO) enrichment analysis and KEGG pathway enrichment analysis were conducted to clarify the pathways that were considerably enriched.Subsequently, univariate Cox and LASSO regression analyses were carried out to screen out the optimal prognostic biomarkers among these DEGs and included them in the immunotherapyrelated risk score (IRRS) signature.The following formula can calculate the risk score of all cases: , where Exp is the expression value of the gene, and Coef is the LASSO regression analysis coefficient of each gene in the signature.Kaplan-Meier survival analysis was utilized to explore the predictive ability of the IRRS in GC prognosis.Area under the receiver operating characteristic (ROC) curve was utilized to evaluate the diagnostic efficacies.Univariate and multivariate Cox proportional hazards regression analyses were used to assess whether the signature could be served as an independent prognostic factor.Besides, the correlations between the IRRS and the clinicopathological parameters, including age, gender, grade, stage, and microsatellite instability, were determined using chi-square tests.
Importantly, the risk score was also calculated in four external cohorts (GSE15459, GSE84437, GSE62254, and GSE26253) to validate the predictive ability of the IRRS signature in distinct populations.

Establishment of a nomogram
To enhance the clinical utility of the risk signature, a nomogram containing the IRRS and other independent prognostic factors was constructed using the "rms" package in R, and then was applied to predict the 1-, 3-, and 5-year survival rate of GC patients.ROC and calibration curves were used to evaluate the accuracy of the nomogram.

The role of IRRS in the prediction of immunotherapeutic benefits
We compared the differences in several immunotherapeutic response indexes (ICBs expression, TMB score, TIDE score, microsatellite instability and IPS) between the risk groups to assess the predictive ability of the risk signature in the prediction of immunotherapeutic benefits.Patients in IMvigor210 and GSE78220 cohorts were classified into responder (including partial response (PR) and complete response (CR)) and non-responder (including progressive disease (PD) and stable disease (SD)) groups according to the patients' response to immunotherapy.Then, we calculated the risk score of each patient in IMvigor210 and GSE78220 cohorts based on the formula generated in TCGA cohort and analyzed its impact on the prognosis and the efficacy of immunotherapy.

Statistical analysis
All visualization and statistical analyses were performed by using R software (version 4.2.1, https://www.rproject.org/)and the corresponding feature packages.Two-sided P-value < 0.05 was considered as significant thresholds for all statistical tests.

Identification of IRGs in GC
The RNA-sequencing profiles and corresponding clinical parameters of GC patients in PRJEB25780 dataset were obtained from the TIDE database.Then, a total of 63 genes that differentially expressed between the responder and non-responder groups were identified as IRGs (Supplementary Table 1).
The volcano plot of the IRGs was displayed in Figure 1A, 1B presented the expression heatmap of these IRGs.KEGG pathway annotation analysis revealed that the IRGs were primarily enriched in Pathways in cancer, Cytokine-cytokine receptor interaction, Rap1 signaling pathway, and PD-L1 expression and PD-1 checkpoint pathway in cancer (Figure 1C).

Genetic variation landscapes of IRGs in GC
The somatic mutation and CNV frequencies of the 63 IRGs in GC patients were analyzed based on TCGA cohort.As presented in Figure 2A, 2B, genetic alterations of the IRGs occurred in 302 of 434 (69.59%)GC patients.Among them, FAT4 (19%), DCHS1 (7%), NID1 (5%), and JAKMIP1 (5%) possess the highest mutation frequency, and their main mutation types are missense mutation, frame shift del, and nonsense mutation.The changes in IRGs with CNV features on chromosomes was showed in Figure 2C, 57 out of the 63 IRGs had frequent copy number alterations, and most of the IRGs were accumulated on copy number loss rather than copy number gain (Figure 2D).

Identification of IRG subtypes in GC
An unsupervised clustering strategy was used to classify 371 GC samples into cluster 1 (n=184) and cluster 2 (n=187) subgroups based on the expression patterns of 63 IRGs (Figure 3A).The principal component analysis (PCA) revealed that the discrimination between cluster 1 and cluster 2 is great (Figure 3B).Kaplan-Meier survival analysis revealed that patients in cluster 1 possess better OS and progression free survival (PFS) rate than those in cluster 2 (Figure 3C, 3D).The correlations between the clinical features and expression patterns of the IRGs was presented in the heatmap (Figure 3E).Patients of cluster 1 had a higher proportion of microsatellite instability-high (MSI-H) and alive outcome than cluster 2.

Analysis of TME infiltration in distinct subtypes
We carried out the "CIBERSORT" and "ESTIMATE" methods to determine the difference in TME infiltration, including 22 subtypes of immune cells and TME scores, between the two clusters in order to learn more about how IRGs work in the TME.As presented in Figure 4A, the difference in TME infiltration between these two clusters is great.Samples in cluster 1 seemed to exhibit remarkably lower stromal scores, lower immune scores, and higher tumor purity scores, compared with those of cluster 2 (Figure 4B).Kaplan-Meier survival analysis revealed that high stromal scores and low tumor purity scores were remarkably associated with poor prognosis in GC patients (Figure 4C).With regard to immune cell infiltration, the cluster 1 subgroup was characterized by the high infiltration of follicular helper T cells, activated memory CD4+ T cells, CD8+ T cells and M1 macrophages, whereas the cluster 2 was characterized by the high infiltration of resting memory CD4+ T cells, naive B cells, resting Dendritic cells, Monocytes, resting Mast cells, and Eosinophils (Figure 4D, 4E).Among them, elevated infiltration abundances of CD8+ T cells, follicular helper T cells and activated memory CD4+ T cells were significantly correlated with  favorable clinical outcome of GC, while elevated infiltration abundance of naive B cells was unfavorable (Figure 4F).
Subsequently, we determined whether the IRG subtypes had a significant correlation with the immunotherapy effect.Patients in cluster 1 possess higher TMB score than those in cluster 2, whereas the cluster 2 possess higher TIDE score (Figure 5A, 5B).Besides, the cluster 1 was characterized by the high proportion of MSI-H and low proportion of microsatellite stability (MSS), compared with cluster 2 (Figure 5C).In terms of ICBs, CD274 (also known as PD-L1) and LGALS9 expression in cluster 1 were significantly elevated, while the expression of CD276, HAVCR2, TIGIT, and PDCD1LG2 in cluster1 were downregulated (Figure 5D).Moreover, we calculated the IPS (CTLA4-/PD-1-, CTLA4+/PD-1-, CTLA4-/PD-1+ and CTLA4+/PD-1+) to reveal the immunogenicity of patients in each subgroup.As a result, the IPS were remarkably elevated in the cluster 1 subgroup, which appeared to have stronger immunogenicity.(Figure 5E).In sum, these findings implied that effective immune response was more activated in the cluster 1 group, which mean that the cluster 1 group exhibited a better response to immunotherapy.

Construction of the IRRS signature
As listed in Supplementary Table 2, 3065 DEGs (|log FC| ≥ 0.585, adjusted P < 0.05) between cluster 1 and cluster 2 subgroups were identified, and the expression heatmap of these DEGs was presented in Figure 6A.GO annotation analysis revealed that the DEGs are significantly annotated in immune-associated crosstalk, such as immune response, adaptive immune response, B cell receptor signaling pathway, and phagocytosis (Figure 6B and Supplementary Table 3).KEGG pathway annotation analysis found that the DEGs are primarily enriched in several areas of carcinogenesisassociated pathways, including ECM-receptor interaction, Focal adhesion, PI3K-Akt signaling pathway, and Pathways in cancer (Figure 6B and Supplementary Table 4).Subsequently, 625 genes presented significant correlations with the OS of GC patients were filtered out for further research using univariate Cox analysis (Supplementary Table 5), and three of which were eventually screened out for constructing the IRRS signature via utilizing LASSO regression analysis with minimized lambda (Figure 6C, 6D).The forest plot illustrated the correlations between the expression levels of the three genes in the IRRS signature and the prognosis of GC patients (Figure 6E).As presented in Figure 6F, we applied the following equation to calculate the risk score of each patient: IRRS = (ExpNRP1 × 0.0354) + (ExpSERPINE1 × 0.0478) + (ExpCPNE8 × 0.0749).According to the median value of the risk score, 371 samples in the TCGA GC cohort were divided into the low-(n = 186) and high-risk (n = 185) groups, and the association of IRRS, IRG subtypes, and vital status of these patients was showed in the Sankey plot (Figure 6G).

IRRS for the prognostic prediction of GC
We determined the clinical significance of the IRRS in GC based on TCGA cohort.The Kaplan-Meier survival analysis revealed that patients in the high-risk subgroup AGING possessed a significant poorer OS than patients in the low-risk subgroup (Figure 7A).The ROC curve for 5-year OS of the IRRS signature exhibited an area under ROC (AUC) value of 0.721, implying that the signature had moderate sensitivity and specificity (Figure 7B).Patients in the high-risk subgroup possessed a higher proportion of histologic G3, MSS, and deceased outcome than those in the low-risk subgroup, and the results of the chi-square test revealed that NRP1, CPNE8 and SERPINE1 were all overexpressed in the high-risk subgroup (Figure 7C).Besides, we found that patients' risk of death increased with increasing risk score (Figure 7D).Furthermore, we conducted a subgroup analysis on the basis of age, gender, histologic grade, and TNM stage.As expected, the OS of GC patients was better in the low-risk subgroup than in the high-risk subgroup in each subgroup (Figure 7E).

Validation of the IRRS in external cohorts
We performed validation analysis for evaluating whether the IRRS has clinical application value in  probability were observed between the low-and high-risk subgroups in each validation cohort, GC patients with higher risk score seemed to exhibit poorer prognosis.In addition, we further determined whether the IRRS has predictive ability in relapse-free survival (RFS) of GC patients from GSE26253 cohort.Consistent with the results in OS, patients with higher risk score possessed higher rates of tumor recurrence (Figure 9).To sum up, these findings indicated the reliable capability of the IRRS in distinguishing differences in clinical outcomes of GC.

The IRRS can be served as an independent risk predictor in GC
To investigate whether the IRRS can be served as a clinically independent risk predictor for GC patients, the IRRS, age, gender, histologic grade, and TNM stage were enrolled as covariates to perform the univariate and multivariate Cox analyses.The results indicated that age, TNM stage and IRRS are independent prognostic predictor that could be used for predicting the prognosis of GC patients (Figure 10A, 10B).Then, a nomogram was built to improve the clinical power of the IRRS by combining the prognostic factors above (Figure 10C).Multi-variables ROC analysis indicated that the nomogram had optimum predictive performance compared with other single factors (Figure 10D).Time-dependent ROC of 1-, 3-, and 5-year OS also revealed that the nomogram had high predictive power (Figure 10E).Besides, calibration charts revealed that the predictive capability of the nomogram in 1-, 3-, and 5-year periods was highly accurate, confirming its utility in forecasting the clinical outcome of GC patients (Figure 10F).

Risk score-based treatment strategy for GC
The expression of four ICBs (PD-L1, HAVCR2, TIGIT, and CTLA4) were compared between the low-and high-risk subgroups.As shown in Figure 11A, the results found that the presented immunomodulators were significantly upregulated in the high-risk subgroup.Subsequently, we determined the association between the IRRS and IPS (CTLA4-/PD-1-, CTLA4+/ PD-1-, CTLA4-/PD-1+ and CTLA4+/PD-1+), which are recognized indexes to predict patients' responses to ICIs by evaluating the immunogenicity.The results demonstrated that the IPS were remarkably elevated in the low-risk subgroup, implying that GC patients in the low-risk subgroup might be more sensitive to anti-cancer immunotherapy (Figure 11B).We also compared the distribution of TMB and TIDE scores between the two IRRS subgroups and revealed that the TMB score was lower in the high-risk subgroup, and the TIDE score was contrary, indicating that patients in the high-risk subgroup might respond worse to anti-cancer immunotherapy (Figure 11C, 11D).Besides, we explored the correlation between IRRS and microsatellite instability, and revealed that patients in the low-risk subgroup possess a higher percentage of MSI-H, and MSS more happened in patients in the high-risk subgroup (Figure 11E).In sum, all these results implied that GC patients in low-risk subgroup exhibited a better response to anticancer immunotherapy, IRRS might be a potential biomarker for immunotherapy of GC.
Subsequently, IMvigor210 and GSE78220 cohorts were applied as external validation cohort for verifying the predictive ability of the IRRS in anti-tumor immunotherapy.In the IMvigor210 cohort, patients in the low-risk subgroup had remarkably better prognosis than those in the high-risk subgroup (Figure 11F).Besides, we analyzed the differences in immunotherapy response subtypes (CR, PR, SD, and PD) between different IRRS subgroups and found that patients with lower risk score possess a higher percentage of CR/PR, and SD/PD more happened in patients with higher risk score (Figure 11G, 11H).These findings further confirmed that the IRRS can be served as a great biomarker to forecast the sensitive to immunotherapy.

DISCUSSION
Despite great advancements in treatment of GC, the interpatient heterogeneity still poses a great challenge for forecasting the clinical outcome of the patients and adopting appropriate treatment strategies.Cancer immunotherapy is a rapidly developing research field which brings new hope to distinct tumor patients, In this study, our goal is distinguishing the immune subtypes of GC to develop an immune landscape for choosing appropriate patients for immunotherapy and construct a novel risk model for forecasting the prognosis of GC patients.We first identified IRGs in GC by exploring the RNA-sequencing profiles of immunotherapy cohort.Subsequently, the GC patients were divided into two subtypes by applying unsupervised clustering method based on the expression pattern of the IRGs.Our results revealed that patients in cluster 1 had remarkably prolonged OS and PFS time than those in cluster 2. The TME is comprised of immune cells, stromal cells, fibroblasts, endothelial cells as well as tumor cells that closely interact with tumor prognosis and treatment options [33].Thus, we further explored the correlation between two IRG subtypes and TME infiltration considering the significant role of TME.As a result, patients in cluster 1 seemed to exhibit remarkably lower stromal scores, lower immune scores, and higher tumor purity scores, compared to those of cluster 2. On the other hand, tumor immune infiltration analyses found that the infiltration abundances of activated memory CD4+ T cells, CD8+ T cells, follicular helper T cells and M1 macrophages are remarkably higher in cluster 1, consistent with the higher infiltration of CD8+ T cells, activated memory CD4+ T cells, and follicular helper T cells being correlated with improved prognosis in patients with GC, whereas a significantly higher infiltration abundances of naive B cells, resting memory CD4+ T cells, Monocytes, resting Dendritic cells, resting Mast cells, and Eosinophils in cluster 2 corresponds to the results that the elevated infiltration abundance of naive B cells was correlated with worse clinical outcome of GC patients.CD8+ T cells can specifically detect and eliminate cancer cells by secreting effector cytokines (tumor necrosis factor (TNF) and interferon-γ (IFNγ)) or cytotoxic molecules (such as perforin and granzymes) [34].Follicular helper T cells are a subgroup of CD4+ T cells, which induced anti-tumor immunity by facilitating the differentiation and maturation of tumor-killing cells [35].M1 macrophages are a subtype of tumor-associated macrophages, which have anti-tumor effects by macrophage-mediated cytotoxicity or antibody-dependent cellmediated cytotoxicity (ADCC) [36].Activated naive B cells were reported to promote the progression of malignant pleural effusion via the PD-1/PD-L1 regulation axis [37].Monocytes can promote the dissemination of tumor cells through inducing immune tolerance and angiogenesis [38].Taken together, these results seemed to imply that cluster 1 was characterized by immune activation due to the high infiltration abundances of CD8+ T cells, follicular helper T cells, and M1 macrophages, and leads to a better prognosis, whereas cluster 2, with high infiltration of naive B cells and Monocytes, was characterized by immunosuppression and therefore giving rise to a poor prognosis.
Considering the impact of IRG cluster subgroups on the immune characteristics and clinical outcomes of GC patients, we further developed a three-gene (NRP1, CPNE8, and SERPINE1) signature based on DEGs between the two IRG cluster subgroups to predict the prognosis of GC.NRP1, one kind of transmembrane glycoprotein, has been well-described to participate in various biological processes, such as cell migration, cardiovascular development, angiogenesis, neuronal guidance, and immunology [39].NRP1 expression levels were significantly upregulated in GC tissues and had positive correlation with the advanced tumor stage and worse clinical outcomes in GC patients [40].NRP1 knockdown was reported to suppress the migration and invasion abilities of GC cells in vivo.CPNE8, a member of the copine family that has been reported to be overexpressed in GC, was positively correlated with aggressive clinical features, poor prognosis, and poor efficacy of immunotherapy in GC [41].SERPINE1 is a member of the Serine protease inhibitor family and a main regulator of the plasminogen activator system [42].SERPINE1 was overexpressed in GC and remarkably associated with advanced tumor stage and unfavorable prognosis, with the knockdown of SERPINE1 remarkably inhibiting the proliferation, invasion, and metastasis of GC cells in vivo and in vitro [43].In our study, we combined the expression patterns of these three genes for developing a IRRS signature to predict the prognosis of GC patients.As a result, patients were classified into high-and low-risk subgroups according to the IRRS calculated using the risk genes.Survival analysis indicated that patients in the high-risk subgroup had worse clinical outcomes than those in the low-risk subgroup.Excitingly, external validation in four independent cohorts for verifying the universal applicability of the IRRS further affirmed that patients with low IRRS were accompanied by significantly prolonged OS or RFS time.In addition, univariate and multivariate Cox analyses found that the IRRS serves as an independent risk predictor in GC.In sum, these results implied that the IRRS can help clinicians develop personalized therapeutic strategies for GC in the future.For instance, patients with high IRRS required continual follow-up to monitor the relapse of GC.
Nowadays, a series of biomarkers for predicting the responsiveness of immunotherapy for cancer have been identified, including ICB, TMB, TIDE score, microsatellite instability and IPS.The ICBs are a class of inhibitory or stimulatory molecules mainly expressed on immune cells, which are crucial for inducing the selftolerance and regulating the immune responsiveness of effectors in different tissues to avoid the tissue damage [44].However, in the immunosuppressive TME, tumor cells can restrain the activity of immune cells and achieve immune escape by up-regulating the expression of ICBs on immune cells [45].In the past two decades, emerging studies confirmed that the blockade of ICBs (PD-1, PD-AGING L1, CTLA4, TIM-3, TIGIT etc.) has rapidly become the most promising anti-cancer strategies for multiple types of cancer, including GC [13,14,46].The TMB is defined as the number of mutations seen in a section of DNA in a tumor cell and reported as mutations per megabase (mut/Mb) [47].High TMB may contribute to the generation of new antigens that can be detected by immune cells, and thereby triggering anti-tumor immune response [48].Thus, TMB has been recognized as a potential biomarker for cancer patients treated with ICIs, and the FDA has approved pembrolizumab in solid cancers with high TMB (defined as10 mut/Mb) [49].The TIDE algorithm is developed for predicting the potential clinical responses to ICI treatment by integrating the gene expression signatures of T cell dysfunction and T cell exclusion [17,50].High TIDE scores indicate poor efficacy of ICI therapy, and TIDE has been demonstrated to be more accurate than the ICB expression levels and TMB at predicting clinical outcomes in cancer patients receiving immunotherapies [51].In terms of microsatellite instability, it is known that MSI-H leads to accumulation of somatic mutations in tumor cells, thereby causing a battery of molecular and biological variations including high TMB, enhanced expression of new antigens and abundant tumor-infiltrating lymphocytes (TIL) [52].Thus, patients with cancer in the MSI-H group had significantly increased sensitivity to immunotherapy in contrast to the MSS/MSI-L group, and the FDA granted pembrolizumab for the treatment of all MSI-H solid tumors, including GC [53].IPS is defined based on tumor immune infiltration characteristics and bridges immune cell infiltration with immunogen subtypes, which can forecast the response to immunotherapy strategies including CTLA4 and PD-1 inhibitors [19].Higher IPS scores are positively associated with the increased immunogenicity [54].In this study, we revealed that the IRRS was remarkably associated with ICBs expression, TMB, TIDE score, microsatellite instability and IPS, which indirectly implied that the IRRS may serves as a significant role in forecasting the responsiveness of anticancer immunotherapy.Especially, we found that the IRRS can be used as an effective predictor for predicting the effects of immunotherapy in urothelial carcinoma cohort with anti-PD-L1 treatment (IMvigor210) and malignant melanoma cohort with anti-PD-1 treatment (GSE78220).This evidence further confirmed that the IRRS can be served as a great biomarker to predict the response to anti-cancer immunotherapy.
This study inevitably existed several limitations.Firstly, although we have developed a novel prognostic model on the basis of IRG and verified its applicability in external cohorts, the correlation between its members and immunotherapy remains largely unclear, functional and mechanistic experiments are required to verify and explain the regulatory mechanisms of these genes on immunotherapy in GC.Secondly, since the data analyzed in our study were retrospectively collected from different databases, it is difficult to cover all variations among patients from different regions due to interpatient tumor heterogeneity.Thus, a well-designed, prospective, multicenter study is required to further validate the accuracy of the IRRS signature, which will be time-consuming.

CONCLUSIONS
In conclusion, this study identified immunotherapyrelated genes in GC based on immunotherapy cohort and analyzed the role of these genes in GC prognosis and correlation with TME and developed a prognosis prediction signature.

Figure 1 .
Figure 1.Identification of IRGs.(A) Volcano plot of IRGs.The red dots represent the upregulated genes, the green dots represent the downregulated genes, and the black dots represent genes with no significant difference in expression.(B) Expression heatmap of the IRGs.Red represents upregulated genes, and blue represents downregulated genes.(C) KEGG enrichment analysis of the IRGs.

Figure 2 .
Figure 2. The alterations of IRGs in GC. (A) Gene alteration frequency of IRGs in GC based on TCGA project.(B) Landscape of genomic aberrations of the top 20 IRGs in GC. (C) The sites of CNV variation in IRGs on the 23 chromosomes.(D) Frequencies of CNV gain, loss, and non-CNV among IRGs.

Figure 3 .
Figure 3. Identification of IRG subtypes in GC. (A) Consensus matrix heatmap defining two clusters (k = 2) and their correlation area.(B) A considerable transcriptome divergence between the two subtypes is seen by PCA analysis.(C) The Kaplan-Meier curve of OS analysis between different cluster groups.(D) The Kaplan-Meier curve of PFS analysis between different cluster groups.(E) Differences in clinicopathologic features and expression levels of IRGs between the two subtypes.

Figure 4 .
Figure 4. Immune landscape of IRG subtypes in the TCGA cohort.(A) Differences in TME scores and infiltration levels of immune cells between the two distinct subgroups.(B) The comparisons of stromal score, immune score, and tumor purity between different subgroups.(C) The Kaplan-Meier survival regarding the stromal score, immune score, and tumor purity in GC patients.(D) The boxplot of the differences of the immune cells infiltration between two distinct subgroups.(E) Relative proportion of immune infiltration in two distinct subgroups.(F) The Kaplan-Meier survival analysis of the correlation between the infiltration levels of the immune cells and OS in GC. *P < 0.05, **P < 0.01, ***P < 0.001.

Figure 5 .
Figure 5.The estimation of two IRGs subtypes in immunotherapy response.(A) The difference in TMB between two distinct subtypes.(B) The difference in TIDE score between two distinct subtypes.(C) Relative proportion of microsatellite instability in two distinct subtypes.(D) The expression of ICBs in two distinct subtypes.(E) The difference of IPS in two distinct subtypes.

Figure 6 .
Figure 6.Construction of the IRRS model in the TCGA cohort.(A) Expression heatmap of the DEGs between the two distinct subtypes.(B) GO and KEGG enrichment studies of DEGs between different cluster groups.(C) LASSO coefficient profiles of 3 selected genes in the 10fold cross-validation.(D) Partial likelihood deviance was revealed by the LASSO regression model in the 10-fold cross-validation.(E) Forest plot of hazard ratios for three selected prognostic variables.(F) The coefficients of the three prognostic variables in the LASSO regression model.(G) Sankey plot displayed the correlations among the cluster subgroups, IRRS and vital status.

Figure 7 .
Figure 7. Clinical significance of the IRRS model in TCGA cohort.(A) The Kaplan-Meier survival analysis of the signature for predicting the OS of patients in TCGA cohort.(B) Time-dependent ROC analysis of the signature for predicting the OS of patients in TCGA cohort.(C) Differences in clinicopathologic features and expression levels of prognostic variables between the low-and high-risk groups.(D) The distribution of the IRRS and the vital status of patients in the TCGA cohort.(E) The subgroup survival analysis according to the age, gender, histologic grade, and tumor stage.

Figure 8 .
Figure 8. External validation of the IRRS model in predicting OS of GC patients based on three independent cohorts.(A) The distribution of the risk score, the vital status of patients, and the expression heatmap of three prognostic variables in the GSE15459 cohort.(B) The distribution of the risk score, the vital status of patients, and the expression heatmap of three prognostic variables in the GSE84437 cohort.(C) The distribution of the risk score, the vital status of patients, and the expression heatmap of three prognostic variables in the GSE62254 cohort.(D) The Kaplan-Meier survival analysis of the IRRS signature for predicting the OS of patients in the GSE15459 cohort.(E) The Kaplan-Meier survival analysis of the IRRS signature for predicting the OS of patients in the GSE84437 cohort.(F) The Kaplan-Meier survival analysis of the IRRS signature for predicting the OS of patients in the GSE62254 cohort.

Figure 9 .
Figure 9. External validation of the IRRS model in predicting RFS of GC based on the GSE26253 cohort.(A) The Kaplan-Meier survival analysis of the IRRS signature for predicting the RFS of GC patients based on the GSE26253 cohort.(B) The distribution of the risk score of GC patients in the GSE26253 cohort.(C) The expression heatmap of three prognostic variables in the GSE26253 cohort.(D) The distribution of the recurrent status of GC patients in the GSE26253 cohort.

Figure 10 .
Figure 10.Nomogram developed for predicting the probability of 1-, 3-and 5-year OS in TCGA cohort.(A) Univariate Cox analysis containing IRRS and clinicopathological parameters.(B) Multivariate Cox analysis containing IRRS and clinicopathological parameters.(C) The comprehensive nomogram for predicting probabilities of GC patients with 1-, 3-and 5-year OS in TCGA dataset.(D) A comparison of ROC curve showed the superiority of the nomogram.(E) ROC curves chart of the comprehensive nomogram predicting the 1-, 3-and 5-year survival rate.(F) Calibration plot of nomogram for predicting probabilities of 1-, 3-, and 5-year survival of GC patients.Nomogram-predicted probability of survival is plotted on the x-axis; actual survival is plotted on the y-axis.

Figure 11 .
Figure 11.The immunotherapeutic benefit of the IRRS model.(A) The expression of four ICBs (PD-L1, HAVCR2, TIGIT, and CTLA4) in low-and high-risk groups.(B) The difference in IPS scores between low-and high-risk groups.(C) Box plots and scatter diagram of the TMB score and IRRS.(D) Box plots and scatter diagram of the TIDE score and IRRS.(E) Boxplot and Bar diagram of the microsatellite instability and IRRS.(F) Kaplan-Meier curve of OS for patients with high and low IRRS subtypes in IMvigor210 cohort.(G) Boxplot and Bar diagram displayed the response to immunotherapy in low-and high-risk groups in IMvigor210 cohort.(H) Boxplot and Bar diagram displayed the response to immunotherapy in low-and high-risk groups in GSE78220 cohort.
The prognostic prediction signature exhibits compelling clinical values in forecasting the prognosis and immunotherapy responsiveness for GC, which can be used as a powerful index for prognosis prediction and treatment guidance.AGING manuscript.QY agreed to be responsible for all aspects of the work to ensure that issues of accuracy or completeness of the study were properly investigated and addressed.All authors read and approved the final manuscript.AGING