RUNX2 and LAMC2: promising pancreatic cancer biomarkers identified by an integrative data mining of pancreatic adenocarcinoma tissues

Pancreatic carcinoma (PC) is a severe disease associated with high mortality. Although strategies for cancer therapy have made great progress, outcomes of pancreatic carcinoma patients remain extremely poor. Therefore, it is urgent to find novel biomarkers and therapeutic targets. To identify biomarkers for early diagnosis and therapy, three mRNA microarray datasets and two miRNA datasets were selected, and combinative analysis was performed by GEO2R. Functional and pathway enrichment analysis were performed using DAVID database. MiRTarBase, miRWalk and Diana Tools were used to get key genes. TCGA, HPA and western blotting were used to verify diagnostic and prognostic value of key genes. By integrating mRNA and miRNA expression profiles, we identified 114 differentially expressed genes and 114 differentially expressed miRNAs, respectively. Then, three overlapping key genes, RUNX2, LAMC2 and FBXO32, were found. Their protein levels in pancreatic tissue from PC patients and normal people were analyzed by immunohistochemical staining and western blotting. RUNX2 showed a potential property to identify PC. Aberrant over-expression of LAMC2 was associated with poor prognosis of PC patients, tumor status and subtypes. In summary, our current study identified that RUNX2 and LAMC2 may be promising targets for early diagnosis and therapy of PC patients.


INTRODUCTION
Pancreatic carcinoma (PC) is one of the most malignant types of cancer in the world. Although considerable advancements in diagnostic and pharmacological therapy have been achieved, the 5-year overall survival (OS) rate of PC is still less than 9% [1]. Thus, novel diagnostic and prognostic biomarkers are urgently needed to improve individualized survival.
Microarray is a high throughput technique used to analyze genetic changes. It has revolutionized studies on RNA and DNA and has also been extensively implemented as an efficient tool for the exploration of genes and biological processes in human diseases. Several studies have reported the differentially expressed genes (DEGs) and differentially expressed microRNA (DEMs) of PC in recent years [2][3][4]. However, there remain unanswered questions about the interaction between DEGs and DEMs during the progression of PC. And it's necessary for uncovering the real target genes.
In this work, we applied three mRNA profiling datasets (GSE62165 [5], GSE15471 [6] and GSE32676 [7]) and two microRNA (miRNA) profiling datasets (GSE24279 [8] and GSE32678 [7]), which were downloaded from the Gene Expression Omnibus (GEO) database, to obtain DEGs and DEMs between PC and normal tissue samples. Subsequently, DEMs targeted genes (DEMTGs) were predicted, and by overlapping analysis between DEMTGs and DEGs to filter potential genes and microRNAs (miRNAs). We further clarified The GEO2R web tool was used to screen for DEGs and DEMs between the PC and normal tissue samples in each dataset. The log2FoldChange (logFC) was calculated and the P-values were adjusted to correct for the occurrence of false-positive results by using the default method. Then, P-value < 0.05 and |log2FC| > 1.5 as the cut-off criteria for significant DEGs, and P-value < 0.05 and |log2FC| > 0.5 for significant DEMs was settled. Subsequently, volcano plots of DEGs and DEMs were used to quickly identify differences and Venn analysis was performed to get overlapping up or down regulated DEGs/DEMs in all mRNA/miRNA datasets, respectively [9]. By adding common shared up regulated DEGs/DEMs with down regulated DEGs/DEMs, obtained datasets were named as tDEGs and tDEMs in this study.

Survival analysis
To evaluate the prognostic value of key genes and miRNAs, the overall survival (OS) of PC patients was investigated based on the online database, Kaplan-Meier Plotter (http://kmplot.com/analysis/index.php) [14]. High-and low-level cohorts of the indicated genes were set according to the median value of gene expressions. The OS plot was obtained with the hazard ratio (HR), the 95% confidence interval (CI) and logrank P-value information.

Key gene expression analysis in TCGA and genotype-tissue expression (GTEx) tissues
Gene Expression Profiling Interactive Analysis (GEPIA [15], http://gepia.cancer-pku.cn/) is an interactive web server for estimating the RNA-Seq expression data from the TCGA (https://www.cancer.gov/about-nci/ organization/ccg/research/structural-genomics/tcga) and GTEx (https://www.gtexportal.org/) databases. The expression of key genes was analyzed in various tumor and non-tumor tissues using GEPIA, and the comparison of the key gene expression at different stages of PC was also performed. The clinical information of patients was also downloaded from TCGA for further data validation.

Data validation
Blood and saliva GEO data verification: We investigated a validation by comparing the mRNA standardized values of key genes in five independent GEO datasets. Four of them were blood samples (GSE74629 [16], GSE49641 [17], GSE49515 [18] and GSE15932) and the last one was saliva samples (GSE14245 [19]). Immunohistochemistry (IHC) verification: The IHC staining images of PC and normal tissue, which obtained from the Human Protein Atlas [20] (HPA, https://www.proteinatlas.org), were used as validation of key genes. The staining, intensity, quantity and location of IHC images present in HPA were calculated for each gene. (For the more details about statistical methods of the IHC images, the reader is referred to the web page https://www.proteinatlas.org/about/assays+ annotation#ih_annotation).

Statistical analysis
The discriminative ability of key genes and miRNAs in the GEO datasets was calculated by ROC analysis, the pROC [22] R package was performed in R 3.6.2 (http://www.R-project.org/). The area under the curve (AUC), 95% CI and P-value for assessing the predictive accuracy and specificity of ROC were calculated by SPSS version 23.0 (IBM, Chicago, IL, USA). All scatter and bar plots were generated by Graph Pad Prism 7 Software (GraphPad Software, Inc.). Comparisons among means were performed by ANOVA. P-value < 0.05 was assessed as statistically significant.

Initial identification of tDEGs, tDEMs and tDEMTGs in pancreatic cancer
To find key genes that were differentially expressed between PC patients and healthy people, we decided to filter them both ways: mRNA and miRNA targeted method. Firstly, we selected three expression array profiling datasets (GSE62165, GSE15471 and GSE32676) and two non-coding RNA array profiling datasets (GSE24279 and GSE32678) from GEO database. As for mRNA, 1398, 607 and 1157 DEGs were extracted from GSE62165, GSE15471 and GSE32676. Among them, there were 997, 562, 669 (103 shared genes) upregulated and 401, 45, 488 (11 shared genes) downregulated DEGs were identified. Taking advantage of Venn analysis in up or down regulated DEGs respectively, we captured 114 total-shared DEGs (tDEGs). Among them, there were 103 up-regulated and 11 down-regulated genes ( Figure 1A, 1B). As for miRNA, the same operations were also performed. 417 and 569 DEMs were extracted from GSE24279 and GSE32678. Two miRNA datasets shared 66 total DEMs (tDEMs), including 24 up-regulated and 42 downregulated miRNAs ( Figure 1C, 1D).
Target genes for tDEMs were predicted by miRTarBase, miRWalk and Diana Tools databases.
Then 114 consistent genes, which we termed as tDEMTGs, were found by Venn's analysis ( Figure 1E).

GO and pathway enrichment analysis
The GO term and KEGG pathway analysis were performed via DAVID. First, the results of the GO enrichment analysis varied a lot between tDEGs and tDEMTGs ( Figure 2A, 2B). Biological process (BP) analysis of GO showed tDEGs were significantly enriched in extracellular matrix associated part, such as cell adhesion, extracellular matrix organization and disassembly, collagen catabolism and organization. As for tDEMTGs, top 2 clusters of genes were enriched in the regulation of transcription and proliferation, which showed a similar result in tDEGs, but other genes were mainly responsible for the regulation of cell death associated part. For GO cell component (CC) analysis, the tDEGs were significantly enriched in extracellular part, such as extracellular region, extracellular exosome, extracellular space, extracellular matrix and proteinaceous extracellular matrix. However, the results of tDEMTGs were focused on plasma membrane part, cytosol, organelle lumen, membrane-enclosed lumen and intracellular organelle lumen. The same differences also showed in molecular function (MF) analysis, tDEGs were mainly enriched in protein binding, but tDEMTGs were enriched in nucleotide binding, transcription regulator activity, ribonucleotide binding, purine ribonucleotide binding and purine nucleotide binding.
Additionally, the differences between tDEGs and tDEMTGs were also confirmed by KEGG pathway analysis. The result disclosed that tDEGs were involved in PI3K/Akt signaling pathway, focal adhesion, ECMreceptor interaction, amoebiasis and pathways in cancer ( Figure 2C). However, tDEMTGs were significantly enriched in pathways in cancer and partial enriched in pancreatic cancer ( Figure 2D).

Identification of key genes and miRNAs
To identify potential key genes, tDEMTGs were aligned with tDEGs to get the cross-section, which we termed as key genes for further analysis ( Figure 1F). Importantly, the result showed that runt related transcription factor 2 (RUNX2), laminin subunit gamma-2 (LAMC2) and F-box protein 32 (FBXO32) were commonly shared and then listed in Table 1. Subsequently, we analyzed candidate miRNAs involved in the regulation of key genes from 66 tDEMs, and found 16 up-regulated and 19 down-regulated key miRNAs (Table 2).

Diagnostic and prognostic values of key genes
We then compared the transcriptional levels of key genes in several cancers with those of normal samples using TCGA database via GEPIA ( Figure 4A-4C). The mRNA expression levels of RUNX2, LAMC2 and FBXO32 showed significant differences between cancer and adjacent non-cancerous tissues especially in pancreatic adenocarcinoma (PAAD, Figure 4D). ROC and survival analysis were also performed to ensure the application prospect of key genes. All the three key genes indicated a good discriminating ability, of which the mean AUC values of each key gene in three GEO datasets were all greater than 0.85 ( Figure 4E). Furthermore, the mean AUC values of LAMC2 and  AGING FBXO32 were both greater than 0.95 which indicated that these two molecules might be promising biomarkers for PC diagnosis. The results of survival analysis showed that only the LAMC2 expression level affected OS rate significantly (HR=3.06, P=0.00011, Figure 4F). It suggested that LAMC2 could be an important molecule which participate in the development of PC and also act as a potential prognostic biomarker.

Verification of key genes
In addition to high accuracy and specificity, the diagnostic methods should also be painless and woundless. Blood and saliva are easily obtained specimens that we can get with little body damage. Thus, we verified the diagnostic possibility of key genes in 4 blood GEO datasets (GSE74629, GSE49641, GSE49515 and GSE15932) and 1 saliva GEO dataset (GSE14245). Interestingly, we found only RUNX2 had significant reduction compared with normal controls in saliva and blood cells (GSE14245 and GSE49641), which suggested that RUNX2 could be considered as one of promising candidate biomarkers of PC. LAMC2 and FBXO32 had no significant differences between the control and PC samples.
Then, to further clarify the expression of RUNX2 and LAMC2 in PC tissues, the results of IHC images of these two genes in HPA were explored. (Figure 5B and Supplementary Figure 2). First, the expression of LAMC2 was consistent with previous analysis. The staining of anti-LAMC2 antibodies were all above medium level in PC patients, which were barely detected in normal pancreas tissues. Although staining quantity were varied (5.5/12 were >75%, 2.5/12 were 75%-25% and 4/12 were <25% stained), almost all of their staining intensity were strong. However, the result of RUNX2 was not corroborated with predicted. Though a slight up-regulation of RUNX2 could be found in some samples (17%), most could not be detected. In addition, our immune blot results were consistent AGING with IHC experiment ( Figure 5C). These data indicated that LAMC2 may play a crucial role in the PC therapy.
In a second validation study, we explored the expression of RUNX2 and LAMC2 in different clinical stage, sex, age, tumor subtype and personal tumor status groups ( Figure 6A-6E). We found RUNX2 was only significantly up-regulated in advanced stage groups.
However, LAMC2 were significantly elevated not only in advanced PC patients, but also in the ductal type PC group (P=0.03) and PC patients which with tumor group (P=0.02).
According to all the above evidence, we found LAMC2 could be the major factor influencing the median survival time of PC patients through combinatorial AGING analysis of RUNX2 and LAMC2 ( Figure 6F and Supplementary Tables 1, 2).

LAMC2 and RUNX2 regulate PC cell growth and migration
For this, we knocked down LAMC2 or RUNX2 using shRNAs in ASPC-1 cells (Figure 7A, 7B) and assessed the subsequent cell growth ( Figure 7C, 7D) and migration ( Figure 7E). We found that knockdown of LAMC2 or RUNX2 both inhibited PC cell growth and migration significantly. Then, we measured changes of PI3K/AKT and MAPK pathways ( Figure 8A, 8B) since their vital roles in PC progression ( Figure 2C, 2D). Knockdown of RUNX2 significantly inhibited phosphorylation of AKT, which implied an important AGING role of RUNX2/AKT in the progression of PC. Knockout LAMC2 had no significant effect on the PI3K/AKT and MAPK pathways, but significantly reduced the expression of vimentin which is upregulated during epithelial-mesenchymal transition (Supplementary Figure 3).

DISCUSSION
Recently, the increasing morbidity and mortality of PC has made it a serious challenge for human health. However, benefiting from the development of bioinformatics technology, it is much easier for us to identify the general genetic alterations in diseases now, which may shed light on the determination of hub targets for clinical utility. Studies of miRNA biology have expanded considerably since first discovery, and have been considered as attractive targets because of their crucial roles in modulation of gene expression under healthy, inflamed and carcinogenic pathological state. Therefore, analysing gene expression changes in combination with miRNAs is vital for identifying general diagnostic and prognostic biomarkers in cancer.

AGING
In the present study, we initially found 114 tDEGs and 114 tDEMs from GEO datasets respectively. By GO function and KEGG analysis, we gained a deep understanding of these genes attached to the initiation and progression of PC. Further, to study the general profiles of molecular alterations in PC, we then identified 3 key genes and 35 important miRNAs by aligning tDEMTGs with tDEGs. We investigated diagnostic and prognostic value of them by ROC and survival tests. And we also confirmed protein expression through IHC and WB assays. Based on the data from this study, we suggested that LAMC2 and RUNX2 could serve as potential biomarkers for the clinical use of PC in the future.
MiRNAs are a class of non-coding RNAs which can bond to 3'-untranslated region (3'-UTR) of targeted mRNA to regulate their protein expression levels [23].
Multiple studies have demonstrated that miRNAs participate in the management of all cancer hallmarks. It is generally believed that miRNAs could be important molecular tools for non-invasive diagnosis and prognosis of cancer. Therefore, it is of great importance to identify most commonly suitable DEGs which could be used as treatment targets or diagnostic markers by discussing the interaction of miRNAs and mRNAs. In this study, we finally obtained 8 key miRNAs from candidate miRNAs by further screening through ROC and survival analysis, which enriched miRNA profiles of PC and may help to highlight the diagnostic and therapeutic potential of the miRNAs cluster in PC ( Figure 9).
RUNX2 belongs to RUNX family and is known as one of the main determinants of osteoblast differentiation and bone formation. Recent studies found RUNX2 was overexpressed in several tumor tissues and may play a vital role in tumor initiation, progression, invasion and metastasis [24][25][26][27][28]. This study demonstrated that although overexpression of RUNX2 was related to malignant behaviours, it didn't significantly affect the survival time of the PC patients which might be a bit inconsistent with previous research. Intriguingly, we found that RUNX2 presented a high diagnostic capability in PC. Histology is an invasive test that is inappropriate for the early diagnosis of PDAC, in terms of safety. Therefore, blood or body fluid tests can provide a good alternative method to assist clinicians with diagnosis. In contrast to the high expression of RUNX2 in pancreatic tissue, it was down-regulated in blood datasets, and its expression in peripheral blood mononuclear cell (PBMC, GSE49641) was significantly lower in PC group than in healthy control samples. Saliva samples (GSE14245) showed a similar result which suggests its applicability as a non-invasive diagnostic biomarker. However, there was no significant change in peripheral blood (GSE74629 and GSE15932). In addition, RUNX2 has been reported that could regulate many carcinogenesis genes and molecules in PC, including VEGF (vascular endothelial growth factor), matrix metalloproteinases and survivin [29]. Combined with the results of transcription factor prediction, we believed that the role of RUNX2 in PC progression might concentrate more on regulating the target genes, such as LAMC2.
LAMC2, a member of laminins family, is the main structural component of basement membranes and participates in a wide variety of biological processes including cell adhesion, differentiation, migration and metastasis [30,31]. In this study, the result of the enrichment analysis indicated that LAMC2 was closely related with the process of cell adhesion, extracellular matrix organization/disassembly and positive regulation of cell proliferation. We then found that the expression of LAMC2 was elevated in PC tumor tissues through the GEO database, IHC and WB analysis. Additionally, for PC patients with high LAMC2 expression, the prognosis was poor. According to these data, we hypothesized that the expression of LAMC2 may lead to an increased risk of tumor recurrence. Pancreatic ductal adenocarcinoma (PDAC) accounts for most AGING human PC cases (more than 95%) [32], we found that the expression of LAMC2 is mainly up-regulated in PDAC, which indicates that it is not only suitable for most PC patients, but may also help to identify the subtypes of PC. Because carbohydrate antigen 19-9 (CA19-9) which acts as the most commonly used PC marker is not accurate for the diagnosis of PDAC [33], combining with the good discriminating ability that LAMC2 showed in the AUC curve, we suggest that it may be a potential indicator in the auxiliary diagnosis of PC.
The PI3K/AKT and MAPK pathways are both important intracellular signal transduction cascades which could regulate cell growth and proliferation, survival and apoptosis, and mobility and invasion [34][35][36]. In tumorigenesis, dysregulation of PI3K/ AKT and/or MAPK pathways occurs in almost onethird of human cancers, especially PC [36][37][38][39][40][41][42]. RAS is a common signaling component of both the MAPK and PI3K/AKT signaling pathways. And PDAC is the predominant form of PC. Recently, it has been reported that KRAS is the most frequently mutated gene (~95%) in PDAC [43,44]. Consistent with previous research, in this study, there is a significant recruitment outcome for PI3K/AKT (for tDEGs) and MAPK signaling pathways (for tDEmiRs). Substantial efforts have been invested in developing the inhibitors of the PI3K/AKT and MAPK pathways; however, ascribed to complex crosstalk, clinical benefits are largely limited [43]. The interaction of RUNX2 and LAMC2 with the PI3K/AKT and MAPK signaling pathways can act as a driving force in controlling tumor progression [45,46]. As a downstream molecule of the PI3K/AKT and MAPK pathway, RUNX2 can regulate metastatic properties of human prostate cancer and breast cancer cells [46][47][48][49][50]. However, RUNX2 also activates the PI3K/AKT pathway by regulating its components [51,52]. Previous research has identified that the expression of LAMC2 depends largely on PI3K/AKT and MAPK pathways, and its overexpression also has a feedback regulation on these two pathways [53]. In non-small cell lung cancer cells, the inhibition of Akt upregulates LAMC2 expression, while high LAMC2 suppresses Akt signaling [53,54]. Degen et al. found that MAPK/ERK pathway hyperactivation as the driver of LAMC 2 overexpression by neoplastic cells, correlated with increased phosphorylation and activation of the translation factors S6 and eIF4B [55]. And overexpression of LAMC2 increased the protein expression of p38 but not ERK1/2, JNK or ERK5 [56]. In this study, we found that knockdown of LAMC2 or RUNX2 inhibited PI3K/AKT and MAPK/ERK pathways to a more or less extent, especially RUNX2. These may suggest directions for the discovery of the agents' combination that target PI3K/AKT and MAPK/ERK pathways in PC.
In summary, we confirmed that RUNX2 and LAMC2 are key genes that facilitate the progression of PC through bioinformatics and experimental analysis. Eight key miRNAs (miR-588, miR-199b, miR-1227, miR-628, miR-488, miR-1825, miR-519 and miR1265) which participated in the regulation of key genes expression could enrich the specific miRNA profiles of PC. These findings provide a new perspective on the underlying molecular mechanism of PC, suggesting that LAMC2 and RUNX2 may be valuable biomarkers and therapeutic targets for PC patients and may also offer powerful evidence and clues for the future genomic individualized treatment of PC.