A novel immune-related microRNA signature for prognosis of thymoma

Introduction: Immune microenvironment and microRNAs serve as common predictors for diagnosis and prognosis of tumors. Methods: Expression of 122 genes and 126 microRNAs in thymoma was obtained from TCGA database. The proportion of tumor-infiltrating cells was calculated, and IMRS was constructed. TREM2hi score was calculated before functional enrichment analysis on gene sets. Results: IMRS3, TREM2hi score, and CD8+ T lymphocyte abundance were significantly different among WHO classifications. WHO classification, Masaoka staging, and miR-130b-5p, miR-1307-3p, miR-425-5p, CD8, CD68, and CCL18 expression were prognostic factors for relapse-free survival and overall survival. IMRS3 upregulation polarized macrophages into M2, which rejected CD8+ T and other effector lymphocytes to promote thymoma malignant progression. Conclusions: BRRS may present a novel immune-related microRNA signature for TET prognosis.


INTRODUCTION
Thymic epithelial tumors (TETs) originate from thymic epithelial cells with a relatively low incidence of approximately 0.13/100,000 [1]. Currently, the recurrence and survival prognosis of TETs mainly rely on the World Health Organization (WHO) classification and Masaoka staging [2]. According to 2015 WHO classification, TETs can be divided into thymoma (types A, AB, B1, B2, and B3) and thymic carcinoma (type C) [3]. Types A, AB, B1, B2, and B3 TETs present a low malignancy, less recurrence, and good survival prognosis. In contrast, type C TETs are considered highly malignant and easy to relapse, together with a poor survival prognosis. In Masaoka staging, stages I-II are deemed to have a low risk of recurrence and a good survival prognosis, while stages III-IV present a high risk of recurrence and a poor survival prognosis. However, there are certain limitations to predicting recurrence and survival prognosis of TET patients as per the WHO classification and Masaoka staging system in a clinical setting. For example, recurrence has also been noted for type A, AB, or B1 thymomas with a low malignancy; early metastasis, among other factors,

Identification of microRNAs influencing macrophage function on progression-free interval
In the TCGA thymoma datasets, PFI (progression-free interval) data and cell abundances inferred from EPIC (electronic Publication Information Center) were available for 97 tumor samples. Macrophages were the most pronounced cell population for the prognosis of PFI (Supplementary Table 2). Based on the macrophage abundance obtained by EPIC, 16 microRNAs and 850 genes were identified as significantly interacting with macrophages to affect PFI (Supplementary Table 3). A total of 41 microRNA-mRNA pairs were retrieved from the anamiR analysis (Supplementary Table 4). From the expression levels of these microRNAs supported by the microRNA datasets, LASSO (least absolute shrinkage and selection operator) returned a model containing four microRNAs with non-zero coefficients ( Table 1). The baseline characteristics of the 119 samples with PFI data are shown in Supplementary Table 5 (sheet 'Baseline of TCGA TYHM'). Although WHO histology (C vs. A-AB), hsa-miR-1307-3p, hsa-miR-425-5p, and TREM2 hi signature were significantly associated with PFI, the effects of IMRS3 and IMRS4 were more conspicuous (Supplementary Table 5, sheet 'Univariate Cox regression'). Moreover, IMRS3 and IMRS4 were independent prognostic factors for PFI after adjustment for gender, age, WHO classification, Masaoka stage, and other individual biomarkers ( Table 2). Taking HR values and 95% CIs into consideration, IMRS3 may be as reliable as IMRS4 in terms of the prognosis of PFI.

Characteristics of immune microenvironment in thymoma
As shown in Figure 1, there was a clear trend that IMRS3 and the TREM2 hi signature increased with the increase in the WHO grades. However, the abundance of CD8 T cells exhibited a parabolic change. Detailed pairwise comparisons are shown in Supplementary  Table 6. Moreover, there was a significant positive correlation between IMRS3 and the TREM2 hi signature ( Figure 2). In addition, CCL18 (Chemokine (C-C motif) ligand 18), one of the components of the TREM2 hi signature, was also highly positively correlated with miR-1307-3p, miR-425-5p, IMRS3, and TREM2 hi signature itself ( Figure 2). Gene set enrichment analysis showed that tumors with high TREM2 hi signature had several critical immune pathways in addition to "interleukin-10 signaling", such as "interferon-gamma signaling", "antigen processing-cross presentation", and "interferon-alpha/beta signaling" (Supplementary Table  7). These results indicated a complex regulatory network of infiltrating monocyte differentiation in thymoma tissues. Importantly, IMRS3 as well as macrophages, especially TREM2 hi macrophages, may be essential for thymoma progression.

Correlation of microRNAs and immune-related biomarkers with clinical features
To further explore the roles of miR-130b-5p, miR-1307-3p, miR-425-5p, and the TREM2 hi signature in determining RFS of thymoma, these biomarkers were evaluated in an independent cohort of 99 patients. Baseline characteristics of the cohort are shown in Supplementary Table 8. Consistent with the TCGA thymoma dataset, a significantly higher proportion of patients with WHO type C or Masaoka stage III-IV thymoma had increased expression of CD68 and  (Table 3). Representative IHC images of CD8, CD68, and CCL18 are shown in Figure 3. Furthermore, the expression levels of miR-130b-5p, miR-1307-3p, and miR-425-5p were all markedly increased in WHO type C compared with A-AB or B1-B3 types (Table 4). These results confirmed that miR-130b-5p, miR-1307-3p, and miR-425-5p might be involved in the progression of thymoma.   Figure 4). Time-dependent ROC analysis revealed that BRRS predicted 5-year RFS with better accuracy than miR-425-5p or CCL18 alone (Table 6 and Figure 5). Moreover, the AUC of BRRS for the 5-year RFS was significantly higher than that of the Masaoka stage (AUC: 92.0 vs. 85.2, P=0.007). However, the difference in AUC for OS between CRS and Masaoka stage did not reach significance (Table 6).

DISCUSSION
TETs arise from thymic epithelial cells with a low morbidity rate. However, few studies have been done on     the treatment and prognosis of those patients presenting with TETs. Here, we developed BRRS for TETs using microRNA-mRNA pairs and tumor-infiltrating immune cells from the TCGA dataset. Our analysis of 97 tumor samples from the TCGA thymoma datasets indicated that macrophage infiltration abundance was the most pronounced prognostic factor for PFI. A total of 41 microRNA-mRNA pairs were further analyzed to obtain 4 microRNAs responsible for biomarkers of PFI. Then, we analyzed the clinical features and found that the prognostic indicators consisting of microRNAs were superior to the current clinical staging.
IMRS3 and CD8+ T cells were found related to the TREM2 hi score. In addition, we validated miR-435-5p, CD8, and CCL18 as independent predictors of RFS. In the 99 cases, BRRS was significantly related to the recurrence risk and death risk. Notably, after adjusting for gender, age, myasthenia gravis, Masaoka staging, WHO classification, radiotherapy, and chemotherapy, BRRS remained an independent prognostic factor for RFS. The tumor microenvironment has important interactions with cancer cells to determine outcomes, and especially immune-related microRNAs are important biomarkers for prognosis and treatment response [11]. The immune-related microRNA signature was strongly correlated with the immune microenvironment and clinical outcomes of hepatocellular carcinoma [12]. The expression of immune-related microRNAs was associated with the role of the tumor immune microenvironment in immunosuppressive therapy [13].
In this study, BRRS, independent of Masaoka staging, had a significant regression contribution to RFS, suggesting that BRRS may serve as a promising signature in addition to the current clinical prognostic indicators.
Additionally, we validated that IMRS3 is related to TREM2 hi and acts as a prognostic factor superior to the current clinical staging. Besides, IMRS of macrophages was significantly correlated with the thymoma PFI collected from the TCGA dataset. Immune-related genes and microRNAs have been widely studied, and they hold significate research value for cancer prognosis. Immunerelated genes can effectively predict the clinical outcome of hepatocellular carcinoma treatment and assess the clinical status of the tumor [14]. Guo et al. developed an immune-based prognostic model for renal cell carcinoma, validating that the immune-related microRNA signature is a feasible and reliable prognostic model [15]. Immunerelated microRNAs are non-invasive biomarkers of clinical outcomes and modulate the immunosuppressive tumor microenvironment in metastatic melanoma [11]. Immune-related microRNAs regulate immune cells directly [15] or may modulate the immune microenvironment through exosomes. Unlike the study in which prognostic markers were constructed with potential microRNAs, the present study applied a more indirect statistical-based approach to screen microRNAs that may modulate macrophage function. We successfully validated that miR-130b-5p, miR-1307-3p, and miR-425-5p in tumors were significantly related to  the WHO classification, and their increased expression levels may be associated with the development of immunosuppressive microenvironment in thymoma. All IMRS, WHO classification, and Masaoka staging were prognostic factors for RFS and OS. Therefore, we believe that immune-related microRNAs and IMRS may be vital contributors in predicting the prognosis of thymoma. In this sense, the immune-related microRNAs we identified differ from the "immune-related microRNAs" defined by microRNA species that directly affect gene regulation in immune cells.
In the 120 thymomas, IMRS3 and TREM2 hi scores were found significantly positively correlated, while TREM2 hi was negatively correlated with CD8+ T lymphocytes. Thymoma led to polarization of M2 macrophages due to high expression of IMRS3, which rejected CD8 T and other effector lymphocytes and promoted the malignant progression of thymoma. The high TREM2 hi score expression group was mainly characterized by immunoregulatory pathways such as the IL-10 signaling pathway and IFN-γ pathway. The expression of hsa-microRNA-1307-3p and hsa-miR-425-5p, which were significantly correlated with PFI in IMRS3, had a strong positive correlation with CCL18. Hence, we proposed that immune-related microRNAs may be a biomarker for the immunosuppressive tumor microenvironment through modulating IL-10 or IFN-γ-mediated signaling pathway. Immune-related microRNAs regulate immunoregulatory molecules, which mediate immune escape from cancer and result in a poor prognosis. Immunoregulatory microRNAs, as a novel mechanism, can reverse immune escape from tumors [16]. Proinflammatory cytokines (interleukin and tumor necrosis factor-mediated signaling pathways) are regulated by the expression of immunerelated microRNAs and associated with colorectal tumor progression, which can be considered as biomarkers of colorectal cancer (immune microenvironment of colorectal tumors: involvement of immune genes and microRNAs belonging to the TH17 pathway). Moreover, IMRS3 expression was significantly associated with CD8 expression in the IHC data of our clinical samples. Therefore, we believe that immune-related microRNAs can act as an accurate prognostic marker for thymoma in the future.
CD8 is a marker of killer T lymphocytes and CD68 is a marker of tumour-associated macrophages (TAM). CCL18 is primarily produced and secreted by the innate immune system (including TAM) and affects primarily the adaptive immune system (lymphocytes). These three interact to ultimately recruit Th2 and regulatory T cells, which mediate immunosuppressive effects; and secrete a variety of cytokines, growth factors (e.g. EGF, VEGF, PDGF, FGF and TGFβ), which interact with and influence multiple cell types in the tumor microenvironment and, in turn, mediate complex effects. Thus, the triad of CD8, CD68 and CCL18 clearly influences tumor prognosis, which corroborates our findings in the cox regression analysis.
Our study still has some limitations. First, we used the MCPcounter method to analyze immune cell abundance, which simply utilized the geometric means of the gene tag expression values of 10 immune cells to indirectly measure the abundance of infiltration, which may not represent the physical meaning of the actual infiltration ratio. Second, although the immune-related RNAs we proposed show great predictive value for the prognosis of thymoma, more in-depth studies are highly warranted to validate this finding. In addition, the immune-related microRNAs we identified in the present study need to use scRNAseq data to determine in a more accurate way whether expression of thymoma epithelial cells and microenvironmental macrophages via exosomes and other methods lead to an immunosuppressive environment. The specific mechanism needs to be further investigated. Despite these shortcomings, our research still has important scientific implications. For the first time, we propose that immune-related microRNAs play a critical role in thymoma prognosis and have great potential for clinical use.
Immune-related microRNAs may serve as essential biomarkers for predicting the prognosis of thymoma, and they may affect the prognosis by modulating the immune microenvironment. Our efforts provide a promising approach for accurate judgment of thymoma progression, leading to a new research interest that immune-related microRNAs that mediate the immune microenvironment may contribute to poor prognosis and even pathogenesis of thymoma.

Dataset retrieval and preprocessing
Gene expression values, as well as phenotype files and survival data for 122 thymomas, were downloaded from UCSC (https://xenabrowser.net/datapages/), and the expression matrices were in a format of log2(x+1)transformed RSEM normalized counts. Genes with 0 count more than 20% of the samples were filtered, and 16,552 genes were included. Additionally, we also downloaded microRNAs expression matrices for 126 thymomas from UCSC in log2(RPM+1) format, and 411 microRNAs were available for analysis after filtering the microRNAs with NA-containing expression values. Accession ID numbers of microRNAs were transformed with mature microRNA names using the miRBaseConverter package [17], and a total of 408 annotated microRNAs were finally included.

Evaluation of the abundance of tumor-infiltrating immune cells
The abundance of 8, 22, and 10 immune cells was analyzed using the EPIC, CIBERSORTx, and MCPcounter methods, respectively [18,19]. For EPIC, count values were used as imported data to obtain the abundance of 8 cell populations (including CD4 T cells, CD8 T cells, NK cells, and macrophages) via deconvolution. Similar to EPIC, CIBERSORTx also utilized deconvolution to obtain 22 immune cell abundances. In the MCPcounter process, the infiltration abundance of 10 immune cells was indirectly measured with the geometric mean of their tag expression values.

Identification of microRNA-mRNA pairs influencing macrophage function on relapse-free survival
In order to screen out the microRNAs and gene mRNAs that may indirectly affect RFS by affecting tumor immune cell function, a similar method for constructing T cell depletion signatures proposed by Jiang [20] was used. Subsequently, a prognostic microRNA-mRNA regulatory network for the identification method was constructed using the microRNA-mRNA target genes.
By univariate Cox regression, "macrophage" abundance obtained by EPIC was found to be a significant prognostic factor for PFI (β = 90.08, P = 0.0113). Given this, regression analysis of the 408 microRNAs was performed using macrophage abundance, microRNA expression values, and Cox regression of interaction terms between the two, and mRNA expression values of the 16,552 genes underwent Cox regression in a similar manner. MicroRNA species and mRNA genes with macrophage abundance, expression values, and interaction term regression coefficients Wald P < 0.05 were extracted as candidate genes.
The correlation of the expression values of these candidate genes was further analyzed with the R package anamiR function "negativecor" using the Pearson correlation coefficient method. MicroRNA-mRNA target gene pairs with r < −0.5 were included in the "databasesupport" function [21] to obtain microRNA-mRNA target gene pairs predicted by the miRNA database, and the microRNAs whose interaction with mRNAs was supported by at least three datasets were finally selected as candidates for weighted linear modeling. Penalized Cox regression was performed using LASSO to obtain a PFI prognostic signature containing four microRNA species, which was named the immune-related microRNAs score (IMRS). In order to make validation more feasible in a clinical setting, a reduced model incorporating the first three microRNAs (IMRS3) was calculated in comparison to the entire model (IMRS4) in terms of PFI prognostic performance. Fivefold cross-validation was conducted to select the lamda using the function cv.glmnet [22]. IMRS was calculated by summing the expression levels of microRNAs weighted with the corresponding non-zero coefficients.

Characterization of the immune microenvironment highly infiltrated by M2 macrophages
To further reveal the correlation of IMRS with M2 macrophages and T cells, a gene signature peculiar to TREM2 hi macrophages was utilized [23], which consists of 11 genes (TREM2, SPP1, RNASE1, MT1G, SEPP1, FOLR2, NUPR1, KLHDC8B, CCL18, MMP12, and APOC2). The mean expression value of the 11 genes was taken as the TREM2 hi score.
The correlation between the 11 genes, TREM2 hi score, infiltrating immune cell abundance, and IMRS was analyzed using the Pearson method. In order to further explore the effect of TREM2 hi score on tumor immune microenvironment, 120 thymoma samples were divided by the median TREM2 hi score into high and low expression groups, and differentially expressed genes were obtained using "limma" package "voom" and "lmFit". The enrichment analysis of the Reactome gene set was done using the clusterProfiler package "gesPathway" function with the t as the test variable [24], and a corrected P < 0.05 was taken as the significantly enriched pathway.

Statistical analysis
All microRNA expression levels were transformed logarithmically and expressed as median values and interquartile ranges. The differences in microRNA expression among the WHO classifications, between Masaoka stage I-II and Masaoka stage III-IV subgroups, or between high and low expression subgroups of CD8, CD68, and CCL18 were evaluated by Kruskal-Wallis test or Mann-Whitney U test. The χ 2 test or Fisher's exact probability was applied to analyze the correlation between CD8, CD68, CCL18, and clinicopathological factors. The RFS of different subgroups was compared using the Kaplan-Meier method and the log-rank test, and prognostic factors for RFS and OS were identified in univariate and multivariate Cox proportional hazard regression models. In particular, a likelihood ratio test was used to determine whether a covariate was significantly entered into the regression model. Two weighted linear models were developed, one of which was based on the biomarkers in question (miR-130b-5p, miR-1307-3p, miR-425-5p, CD8, CD68, and CCL18), i.e., BRRS, and the other was based on a combination of BRRS and clinical features, i.e., combined risk scores (CRS). The predictive efficiency of Masaoka staging, BRRS, CRS, and other individual biomarkers for 3-and 5-year RFS and OS was determined with timedependent ROC curve analysis by using the function "timeROC" and the function "compare" implemented in the R package. All other statistical analyses were performed using the SPSS 17.0 software (IBM SPSS, Chicago, IL, USA). All tests were bilateral, and P < 0.05 was considered statistically significant.

Ethical disclosure
The study was authorized by the ethics Committee of Army