Expression profile and prognostic values of GATA family members in kidney renal clear cell carcinoma

To investigate the possible diagnostic and prognostic biomarkers of kidney renal clear cell carcinoma (KIRC), an integrated study of accumulated data was conducted to obtain more reliable information and more feasible measures. Using the Tumor Immune Estimation Resource (TIMER), University of Alabama at Birmingham Cancer Data Analysis Portal (UALCAN), Human Protein Atlas (HPA), Kaplan-Meier plotter database, Gene Expression Profiling Interactive Analysis (GEPIA2) database, cBioPortal, and Metascape, we analyzed the expression profiles and prognoses of six members of the GATA family in patients with KIRC. Compared to normal samples, KIRC samples showed significantly lower GATA2/3/6 mRNA and protein expression levels. KIRC's pathological grades, clinical stages, and lymph node metastases were closely related to GATA2 and GATA5 levels. Patients with KIRC and high GATA2 and GATA5 expression had better overall survival (OS) and recurrence-free survival (RFS), while those with higher expression of GATA3/4/6 had worse outcomes. The role and underlying mechanisms of the GATA family in cell cycle, cell proliferation, metabolic processes, and other aspects were evaluated based on Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analyses. Furthermore, we found that infiltrating immune cells were highly correlated with GATA expression profiles. These results showed that GATA family members may serve as prognostic biomarkers and therapeutic targets for KIRC.


INTRODUCTION
growth and the formation of a variety of tissues [7]. Previously defined as hematological (GATA1/2/3) and cardiac (GATA4/5/6) GATA family members based on investigations of their expression, their functions and expression patterns have been shown to be widespread beyond these organs [8][9][10]. GATA1 and GATA2 play pivotal roles in regulating the cell cycle and proliferation [11]. GATA4 and GATA5 transcription factors are increasingly recognized as playing a role in the carcinogenesis of human tumors of endodermal and mesodermal origin [12], while GATA6 is expressed in the immature proliferating cells in the intestinal crypts and is classified as a potential oncogene [13]. The occurrence and development of cancer is a complicated process [14,15]. Recent studies have indicated that the GATA family plays important roles in tumorigenesis, such as in lung squamous cell carcinoma [16], urothelial carcinoma [17], ovarian carcinoma [18], breast cancer [19], and gastric carcinomas [20], and the GATA family may serve as potential new biomarkers. Although KIRC-related genes and their potential biomarkers have been mentioned in a number of publications [21,22], the prognostic significance of the GATA family in the emergence of KIRC has not been thoroughly clarified. To improve therapeutic outcomes, we employed bioinformatic techniques and research databases to evaluate GATA expression in KIRC and investigate its prognostic relevance.

Aberrant expression of GATA family members in patients with KIRC
The mRNA expression levels of GATA family members in KIRC and healthy tissues were assessed using the Tumor Immune Estimation Resource (TIMER) database. GATA2/3/5/6 expression levels were considerably downregulated in patients with KIRC, whereas the expression levels of GATA1 and GATA4 were higher in KIRC tissues than in normal tissues (Figure 1A). We then used the University of Alabama at Birmingham Cancer Data Analysis Portal (UALCAN) to compare the relative expression levels of GATA family members in KIRC. Consistent with the TIMER results, the mRNA expression of GATA2/3/5/6 was lower in KIRC tissues than in normal tissues (Figure 1B).
To further assess and validate the protein expression levels of GATA family members in KIRC, we examined immunohistochemistry (IHC) data from the Human Protein Atlas (HPA) database. As shown in Figure 2, the majority of GATA family members exhibited low or no expression in KIRC tissues but moderate to high expression in normal kidney tissues. Compared to that in the corresponding normal tissues, GATA1/2/3/6 protein expression was downregulated in KIRC tissues. In contrast, GATA4 was highly expressed in KIRC (Figure 2). The HPA database did not contain any IHC information about GATA5.

Correlation of the expression of GATA family members with clinicopathologic features of patients with KIRC
Next, the association between GATA expression and tumor stage in KIRC was investigated. According to correlation analysis of TCGA data using the Gene Expression Profiling Interactive Analysis (GEPIA2) database, GATA2/5/6 expression changed noticeably across the tumor stages, whereas GATA1/3/4 expression showed no discernible variation among tumor stages (Figure 3A). At the N0 and N1 stages of lymph node metastasis, GATA3 mRNA expression levels in KIRC tissues were lower than those in normal tissues. Additionally, the mRNA expression levels of GATA2 and GATA5 tended to be lower in tumors with N0 and N1 stage lymph node metastases than in normal tissues and were substantially correlated with disease prognosis, as described below. Conversely, tumors with N1 stage lymph node metastases tended to have the highest level of GATA6 mRNA expression (Figure 3B).
Moreover, we analyzed correlations between the expression of GATAs and clinicopathological characteristics using TCGA samples from patients with KIRC (Table 1). The results showed that GATA2/3/4/6 expression levels were strongly correlated with the T stage of KIRC patients. Meanwhile, GATA2/5/6 expression levels were significantly associated with the N stage, and the levels of GATA2 and GATA5 were significantly associated with the M stage of KIRC patients. According to these findings, members of the GATA family could be used as potential diagnostic indicators of KIRC.

Prognostic value of the GATA family in patients with KIRC
Next, we examined the prognostic significance of GATA mRNA expression in patients with KIRC, including overall survival (OS) and recurrence-free survival (RFS), using the Kaplan-Meier plotter and the GEPIA2 databases. Higher GATA2 and GATA5 expression levels were substantially correlated with longer OS, whereas higher GATA3/4/6 expression levels were associated with poorer prognosis (Figure 4A). Similarly, higher GATA2 and GATA5 expression was substantially linked with better RFS (Figure 4B). The mRNA expression levels of the other GATA factors showed no appreciable impact on OS and RFS in patients with KIRC. These findings suggest that GATA2 and GATA5 are potential biomarkers for the prognosis of KIRC, with higher expression indicating better outcomes.
Using the Kaplan-Meier plotter database, we further analyzed the prognostic value of GATA family members in different clinical stages and pathological grades of KIRC (Table 2). High expression of GATA2 was significantly associated with shorter OS in stage II but with better OS in stages I, III, and IV KIRC. The mRNA expression of GATA3 was closely related to poorer OS in patients with stage III and grade III KIRC and correlated with longer OS in those with stage II KIRC. Moreover, GATA4 transcriptional expression was significantly associated with worse OS in patients with stage III and better OS in those with grade III KIRC. We also found that high expression of GATA1 and GATA6 was correlated with poor OS in patients with stage II and stage IV KIRC. Taken together, our findings indicate that several GATA family members are potential prognostic markers in KIRC and are particularly useful for predicting the OS of patients with KIRC.
The profiles of genomic alterations for each GATA member, obtained from the TCGA database with the online tool cBioPortal, are shown in Figure 5. In total, 39 (9%) of the 446 enrolled individuals with KIRC had altered GATA family genes. Among the GATA family members, GATA2/3/4 had the highest genetic alteration rate (2.2%), followed by GATA1 (1.3%), GATA6 (1.1%), and GATA5 (0.9%) (Figure 5A). High mRNA expression and deep deletions were the two predominant genetic alteration types in the GATA family members. The GATAs rarely showed amplification, missense, in-frame, or splicing mutations.
The DNA methylation levels of GATA family members in patients with KIRC were also examined using the UALCAN database. GATA1 had considerably lower DNA methylation levels in KIRC samples than in healthy human controls, while GATA2/3/4/5 showed significantly higher levels in KIRC tissues and GATA6 showed statistically negligible variations between normal and malignant tissues (Figure 5B-5G).

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of protein-protein interactions (PPIs) of GATAs
A PPI network was constructed based on the top 277 genes that were co-expressed and related to the GATAs, which were identified using cBioPortal and Cytoscape (Supplementary Table 1). REN, BDNF, GAD, SLC16A9, PVALB, FTCD, and PLG had the highest likelihood of interacting with the GATAs and promoting KIRC development (Figure 6A).
In addition, we used the 277 identified co-expressed genes to further analyze the potential function of GATAs in KIRC by analyzing their GO terms and KEGG pathways using the Metascape database. The KEGG results showed that the co-expressed genes were mainly related to starch and sucrose metabolism; alanine, aspartate, and glutamate metabolism; and protein digestion and absorption. Likewise, there was a connection between the GATAs and the complement and coagulation cascades (Figure 6B). The enriched GO pathways for molecular function were secondary active transmembrane transporter activity, carboxylic acid transmembrane transporter activity, calcium ion binding, and omega peptidase activity (Figure 6C). The top enriched pathways for biological process were sodium ion transport, organic anion transport, renal system processes, regulation of systemic arterial blood pressure by hormones, and carboxylic acid biosynthetic processes (Figure 6D). Cellular component analysis revealed that these genes were frequently related to the apical part of the cell, microvillus, basal part of the cell, and myosin complex (Figure 6E).
In the TIMER database, we next searched for any connections between the expression of GATA family members and immunological signature markers of different immune cells infiltrating KIRC.
[25]. In addition, GATA5 suppressed cholangiocarcinoma cell growth and metastasis via the Wnt/β-catenin pathway [26]. Numerous studies have indicated aberrant expression of members of the GATA family in diverse types of tumors, suggesting their vital roles in tumorigenesis and cancer progression [27][28][29]. As far as we are aware, the GATA family's function in KIRC has not been systematically examined. Thus, the mRNA expression levels of GATA family members in KIRC tissues were compared with those in healthy kidney tissues using the TIMER and UALCAN databases. In comparison to normal tissues, KIRC tissues showed lower mRNA expression levels of GATA2/3/5/6, and the GATA2/3/6 proteins were also expressed at lower levels in patients with KIRC. Recent studies have indicated that GATA family members could be widely used as promising biomarkers for the clinicopathological diagnosis of various cancers. Satoshi et al. revealed that in urothelial carcinoma, GATA3 is one of the most useful markers in diagnostic surgical pathology and may serve as a reliable prognostic marker in patients with urothelial carcinoma [30]. Grainne et al. found that GATA6 regulates epithelial-mesenchymal transition and tumor dissemination and is a marker of adjuvant chemotherapy response in pancreatic ductal adenocarcinoma (PDAC) [31,32]. Moreover, Andrés et al. demonstrated that GATA4 is a potential marker of tumor growth in PDAC and that the expression of GATA4 and GATA6 is a biomarker of poor prognosis and therapeutic response [33]. The clinical association and prognostic significance of abnormally expressed GATAs in patients with KIRC were then investigated. We discovered a connection between KIRC clinicopathological staging and GATA2/5/6 expression levels.
Moreover, the data demonstrated a strong correlation between lymph node metastasis and GATA2/3/5 mRNA expression levels in KIRC tissues. According to our research, high expression of GATA2 and GATA5 was associated with better OS and RFS in patients with KIRC, and GATA3/4/6 overexpression was associated with a poorer prognosis. These findings indicate that these gene family members, particularly GATA2 and GATA5, have prognostic significance and considerable potential as diagnostic markers in patients with KIRC.
The accumulation of genetic alterations and the resulting changes in gene expression patterns are regarded as the main forces behind tumor progression [34]. Patients with KIRC were discovered to have alterations in each of the six members of the GATA family, with a total genetic alteration rate of 9.9%. Additionally, DNA methylation contributes to the growth of tumors and is linked to levels of gene expression [35]. The involvement of GATA DNA methylation in malignancies may be deduced from a number of indicators. For instance, early gastric carcinogenesis frequently involves the epigenetic inactivation of GATA4 and GATA5 by CpG island methylation, which is strongly linked with Helicobacter pylori infection [20]. Fu et al. demonstrated that GATA5 is rarely methylated in normal duct epithelium but is highly methylated in pancreatic cancer tissue [12]. The DNA methylation of GATA4 and GATA5 is a common and specific event in colorectal cancer. In vitro studies have shown that GATA4 and GATA5 have a tumor-inhibitory effect in colorectal cancer cells [36]. In this investigation, we demonstrated that the higher levels of DNA methylation in KIRC tissues may be the cause of the decreased expression levels of GATA2/3/5.
Figure 6 (caption, beginning truncated): ... associated co-expressed molecules that were most frequently altered in KIRC were identified using the cBioPortal database. The PPI network was generated from the GATA family members and their associated co-expressed genes, which was constructed using the Cytoscape database. (B-E) GO functional enrichment analysis and KEGG pathway analysis of GATA-associated co-expressed molecules were conducted using the Metascape database. KIRC, kidney renal clear cell carcinoma; PPI, protein-protein interaction; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Then, 277 co-expressed genes and the molecular biological functions of GATA members were examined. According to the PPI network, the modulation and function of the differentially expressed GATAs in KIRC were most closely associated with REN, BDNF, GAD, and SLC16A9. KEGG pathway analysis showed that complement and coagulation cascades were specifically related to GATAs. The complement cascade is a system of plasma proteins that is triggered when infections are present. Complement activation can occur through the classical, lectin, or alternative pathways. Each of these pathways produces an essential enzyme activity that further induces effector molecules of the complement system. Complement activation has three major effects: it opsonizes pathogens, recruits inflammatory and immunocompetent cells, and directly kills pathogens [37]. Our findings imply that GATA-mediated signaling, by affecting the recruitment of immunocompetent cells, may play critical roles in antitumor immunity.
Correlation R value was calculated by Spearman's algorithm and adjusted by tumor purity. *P < 0.05, **P < 0.01, ***P < 0.001.
Although KIRC is a well-known heterogeneous disease, useful biomarkers that contribute to individualized treatment options are still lacking, especially for current immunotherapies [38]. Recent studies have shown that GATAs may serve as therapeutic targets in cancer immunotherapy. Fu et al. showed that GATA2 drives PD-L1 and PD-L2 expression, and that PD-L2 correlated with worse clinical outcomes in patients with gliomas. Targeting GATA2 may help reduce the inhibitory effects of PD-L2 in the tumor microenvironment [39]. Moreover, tumor-associated macrophages (TAMs) possess great potential in affecting the development of ovarian cancer (OC). Chen et al. found that TAM-derived extracellular vesicles allowed for the transfer of GATA3 into OC cells, which facilitated the immune escape of OC cells and their resistance to cisplatin [40]. Therefore, GATA3 might serve as a potential immunotherapeutic target for OC.
In this study, we found a notable relationship between the expression of certain GATA family members and the infiltration of six immune cell types. We therefore also investigated the link between immune infiltration markers in KIRC and the expression of GATA family members. Interestingly, a number of immune cells showed substantial associations with the expression of GATA family members. These findings imply that the immunological state of KIRC may be reflected by the expression of GATA family members, which may also serve as targets for immunotherapeutic approaches in the future. However, our study has certain limitations. For example, our analysis of the GATA family members in KIRC relied only on a series of websites and databases; further in vivo and in vitro experiments are needed to corroborate our findings.

CONCLUSIONS
In conclusion, using bioinformatics methods, we thoroughly examined the expression and predictive capacity of the GATA family members in patients with KIRC in an effort to further our knowledge of the critical involvement of these transcription factors in tumor development and immune responses in patients with KIRC. GATA2 and GATA5 may be novel predictive indicators and possible targets for the personalized treatment of these patients, according to our comprehensive bioinformatics investigation. However, further research is needed to assess the mechanism of their influence on tumor development/progression and identify new pharmacological therapies. The results in this study may help clarify the distinctive functions of GATAs in KIRC.

TIMER database
Based on the TCGA database, the TIMER database (https://cistrome.shinyapps.io/timer/) is a comprehensive resource for evaluating immune cell infiltration and its clinical impact across 10,897 tumors from 32 different cancer types. TIMER's features include survival analysis, gene expression comparisons between tumor and normal tissues in various malignancies, and investigation of the relationships between genes and tumor-infiltrating immune cells [41,42]. In our study, TIMER was used to examine the mRNA expression of GATA family members in different tumors or particular cancer subtypes. Log-scale values were calculated as log2 [TPM (transcripts per million)]. The relationships between the expression of the GATA members and the infiltration of six immune cell types, including B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells, in KIRC were also analyzed using the TIMER database.
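The correlation values reported from TIMER in this study are described as Spearman correlations adjusted by tumor purity. The following is a minimal sketch of one common way to make such an adjustment, a first-order partial Spearman correlation; it is an illustration under that assumption and not TIMER's actual implementation. All function names and sample values are hypothetical. Because Spearman correlation is rank-based, applying it to TPM or to log2(TPM) gives the same result.

```python
import math

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

def partial_spearman(x, y, z):
    """Spearman correlation of x and y after adjusting for z (e.g. tumor purity)."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values for six samples (not data from the study).
gene_tpm = [12.0, 3.5, 8.1, 20.4, 5.2, 15.7]
infiltration = [0.30, 0.10, 0.22, 0.41, 0.12, 0.25]
purity = [0.62, 0.85, 0.70, 0.55, 0.90, 0.60]

print("raw Spearman:", round(spearman(gene_tpm, infiltration), 3))
print("purity-adjusted:", round(partial_spearman(gene_tpm, infiltration, purity), 3))
```

The adjusted value can differ noticeably from the raw one when both expression and infiltration track purity, which is the reason for reporting purity-adjusted correlations.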

GEPIA2 database
The GEPIA2 database (http://gepia2.cancer-pku.cn/) is an open access dataset that offers vital interactive and programmable features, such as differential expression, pathological stage, and patient survival analyses. To assess the relationships between GATA family expression and the clinical stage and RFS of patients with KIRC, we employed the GEPIA2 database. According to the median expression of single GATA family members, the patients with KIRC were divided into low and high GATA family member expression groups.

The HPA
The HPA (https://www.proteinatlas.org/) is an online database that contains immunohistochemistry-based expression data for various cancer types [43,44]. In this investigation, we used immunohistochemical images to examine the levels of protein expression of several GATA members in KIRC tumors and normal kidney tissues.

UALCAN database
The UALCAN database (http://ualcan.path.uab.edu/) is an interactive web resource based on RNA-seq and clinical data from 31 cancers in the TCGA database [45]. UALCAN was applied in this study to determine the mRNA expression of GATA family members in KIRC tissues and its correlation with nodal metastatic status. Additionally, we used the UALCAN database to assess DNA methylation alterations in the GATA family members in KIRC tissues.

Kaplan-Meier plotter database
To assess the relationships between GATA family expression and the OS of patients with KIRC, we employed the Kaplan-Meier plotter database (https://kmplot.com). According to the median expression of single GATA family members, the patients with KIRC were divided into low and high GATA family member expression groups. Survival differences between the groups were assessed by log-rank tests, and statistical significance was considered at p < 0.05.
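The median split and log-rank comparison described above can be sketched as follows. This is an illustrative two-group log-rank test written with the standard library only, not the Kaplan-Meier plotter's own code; the function names, the handling of values exactly equal to the median (assigned to the low group here), and the sample data are assumptions.

```python
import math
from statistics import median

def split_by_median(expression):
    """Label each patient 'high' or 'low' relative to the median expression."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]

def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times:  follow-up time per patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    groups: 'high' or 'low' per patient
    """
    observed_high = 0.0
    expected_high = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)
        n_high = at_risk.count("high")
        died = [g for ti, e, g in zip(times, events, groups) if ti == t and e]
        d = len(died)
        observed_high += died.count("high")
        expected_high += d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_high - expected_high) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical cohort of ten patients (not data from the study).
expression = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
times = [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]
events = [1] * 10

groups = split_by_median(expression)
chi2, p = logrank_test(times, events, groups)
print(f"chi2 = {chi2:.2f}, significant at 0.05: {p < 0.05}")
```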

cBioPortal
Cancer genomes and clinical data were analyzed using the cBioPortal platform (https://www.cbioportal.org/) [46]. The genomic map of the GATA family, which contains information on mutations and mRNA expression, was examined in this study. The threshold of |log2FC| was 1, and the p-value cutoff was 0.01.
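As a small illustration of how the two cutoffs above combine, the sketch below keeps only entries that satisfy both. The record type, the gene names, and the choice to treat the fold-change threshold as inclusive and the p-value cutoff as strict are assumptions made for the example, not details stated by the authors or taken from cBioPortal.

```python
from typing import NamedTuple

class GeneResult(NamedTuple):
    gene: str        # placeholder gene label (hypothetical)
    log2_fc: float   # log2 fold change
    p_value: float

def passes_cutoffs(result, min_abs_log2_fc=1.0, max_p=0.01):
    """True when |log2FC| >= 1 and p < 0.01 (defaults mirror the stated cutoffs)."""
    return abs(result.log2_fc) >= min_abs_log2_fc and result.p_value < max_p

# Hypothetical results, not data from the study.
results = [
    GeneResult("GENE_A", 1.8, 0.0004),
    GeneResult("GENE_B", -1.2, 0.03),    # fails the p-value cutoff
    GeneResult("GENE_C", 0.4, 0.0001),   # fails the fold-change cutoff
    GeneResult("GENE_D", -2.5, 0.002),
]

selected = [r.gene for r in results if passes_cutoffs(r)]
print(selected)
```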

STRING
The STRING database investigates possible protein interaction networks. The GATA family of genes and related genes were used by STRING to construct the PPI network.

Cytoscape
In this study, 277 co-expressed molecules of the GATA family members that were identified through cBioPortal underwent functional integration using the Cytoscape platform (Supplementary Table 1).

Metascape
Metascape (http://metascape.org) is a complete, potent, adaptable, and interactive set of web-based analytic tools [47]. We used this platform to carry out GO and KEGG enrichment analyses.
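For readers unfamiliar with enrichment analysis, the sketch below shows a hypergeometric over-representation test, a common way to score whether a gene list is enriched for a GO term or KEGG pathway. The text does not describe Metascape's internal statistics, so this is an assumption-based illustration rather than a reproduction of the platform; the function name and all counts are hypothetical.

```python
from math import comb

def enrichment_p_value(background, in_term, selected, overlap):
    """P(X >= overlap) under the hypergeometric distribution.

    background: number of genes in the background set
    in_term:    background genes annotated to the term or pathway
    selected:   size of the gene list being tested
    overlap:    genes in the list that are annotated to the term
    """
    upper = min(in_term, selected)
    total = comb(background, selected)
    tail = sum(
        comb(in_term, k) * comb(background - in_term, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: a 277-gene list tested against one pathway.
p = enrichment_p_value(background=20000, in_term=85, selected=277, overlap=9)
print(f"p = {p:.3g}")
```

In practice the p-values for many terms would also be corrected for multiple testing before any term is reported as enriched.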

Statistical analysis
SPSS (version 26.0) was used to conduct statistical analysis on the relationships between the clinicopathological characteristics of patients with KIRC and the mRNA expression of GATA family members. Student's t-tests were used for comparisons and p-values less than 0.05 were considered statistically significant.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
Xuejie Yang and Cheng Mei performed the experiments, sample processing, and data analysis. All authors participated in writing the paper. Xiaoyun He and Chunlin Ou conceived and designed the experiments. All authors read and approved the final manuscript.

CONFLICTS OF INTEREST
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

FUNDING
This study was supported by the National Natural Science Foundation of China (81903032), the China