KNTC1 and MCM2 are the molecular targets of gallbladder cancer

Background: Gallbladder carcinoma is a highly malignant epithelial tumor of the gallbladder. However, the relationship between KNTC1 and MCM2 and gallbladder cancer is unclear. Methods: GSE139682 and GSE202479 were downloaded from the Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) were screened. Functional enrichment analysis and gene set enrichment analysis (GSEA) were performed. A protein-protein interaction (PPI) network was constructed and analyzed. A gene expression heat map was drawn. Comparative Toxicogenomics Database (CTD) analysis was performed to find the diseases most related to the core genes. TargetScan was used to screen for miRNAs that regulate the central DEGs. Results: A total of 230 DEGs were identified. According to GObp analysis, they were mainly enriched in regulation of ossification, regulation of spindle microtubule and centromere attachment, and cortical actin cytoskeleton organization. According to GOcc analysis, they were mainly enriched in plasma membrane part, cell junction, plasma membrane region and anterior membrane. According to GOmf analysis, they were mainly enriched in protein homodimerization activity, proximal promoter sequence-specific DNA binding and sulfur compound binding. KEGG analysis showed that the target genes were mainly enriched in the Hippo signaling pathway, the p53 signaling pathway and cancer pathways. KIFC2, TUBG1, RACGAP1, CHMP4C, SFN and MYH11 were identified as core genes. The gene expression heat map showed that KNTC1, MCM2, CKAP2, RACGAP1 and CCNB1 were highly expressed in gallbladder carcinoma samples. CTD analysis showed that KNTC1, MCM2, CKAP2, RACGAP1 and CCNB1 were associated with head and neck squamous cell carcinoma, necrosis, inflammation and hepatomegaly. Conclusions: KNTC1 and MCM2 are highly expressed in gallbladder cancer. Higher expression levels correlate with a worse prognosis.


INTRODUCTION
Gallbladder carcinoma (GBC) is a hepatobiliary malignancy that arises from the mucosal lining of the gallbladder [1,2]. Although it is generally considered rare, it is the most common biliary malignancy, accounting for 80%-95% of biliary tract cancers. The survival rate is low, mainly because of late diagnosis. Most symptomatic patients present with incurable tumors and have poor clinical outcomes [3]. A distinctive combination of predisposing factors, including genetic susceptibility, geographical distribution, female predominance, chronic inflammation and congenital dysplasia, sets this cancer apart. Epidemiological studies have found striking geographical and ethnic differences: the incidence is very high among American Indians and high in Southeast Asia, but very low in the rest of the Americas and other parts of the world. Age, female sex, congenital biliary tract abnormalities and genetic susceptibility are all important, non-modifiable risk factors [4]. The incidence of gallbladder cancer in women is three to six times higher than in men. In addition, the incidence continues to increase with age [5]. More than two-thirds of patients with gallbladder cancer are over 65 years old [6]. At present, the most effective treatment for GBC is surgery. However, because the disease is asymptomatic in its early stages, insidious and rapidly progressive, only a small number of patients are suitable for surgery. Chemotherapy, targeted therapy and immunotherapy are also used [7,8]. However, because the cause of gallbladder cancer is not clear, the application of these treatments is still limited. Therefore, it is particularly important to study the molecular mechanism of gallbladder cancer.
As an important part of the life sciences, bioinformatics has been at the forefront of life science and technology research. In recent years, China's biotechnology has developed by leaps and bounds, and bioinformatic resources have grown explosively. Bioinformatics reveals the biological significance of big data and serves as a bridge between data and the clinic. Through the analysis and reporting of genetic testing data, bioinformatics plays a role in tumor treatment [9,10].
However, the relationship between the KNTC1 and MCM2 genes and gallbladder cancer is not clear. Therefore, this study uses bioinformatics to identify and analyze core genes that differ between gallbladder cancer and normal tissues. Public datasets were used to validate the roles of KNTC1 and MCM2 in gallbladder cancer, and the findings were verified by basic cell experiments.

Screening of DEGs
A total of 230 DEGs were identified from the expression matrices of GSE139682 and GSE202479 (Figure 1A shows the result for GSE139682 and Figure 1B the result for GSE202479; Figure 2).

Functional enrichment analysis of DEGs
We analyzed the DEGs by GO and KEGG. According to GObp analysis, they were mainly enriched in regulation of ossification, regulation of spindle microtubule and centromere attachment, and cortical actin cytoskeleton organization (Figure 3A). According to GOcc analysis, they were mainly enriched in plasma membrane part, cell junction, plasma membrane region and anterior membrane (Figure 3B). According to GOmf analysis, they were mainly enriched in protein homodimerization activity, proximal promoter sequence-specific DNA binding and sulfur compound binding (Figure 3C). KEGG analysis showed that the target genes were mainly enriched in the Hippo signaling pathway, the p53 signaling pathway and cancer pathways (Figure 3D).

GSEA
GSEA was performed to search for possible enriched terms among non-differentially expressed genes and to verify the results obtained for the DEGs. The enriched terms that overlapped with the GO/KEGG enrichment terms of the DEGs are shown in the figure; they were mainly concentrated in the Hippo signaling pathway, the p53 signaling pathway and cancer pathways (Figures 4A, 4C, 4E and 4G show the GSE139682 results; Figures 4B, 4D, 4F and 4H show the GSE202479 results).

Metascape enrichment analysis
In the Metascape enrichment analysis, the enriched GO terms included regulation of supramolecular fiber organization, norepinephrine metabolism and T cell migration (Figure 5A). Enrichment networks colored by enrichment term and by p-value (Figures 5B, 5C, 6) visually show the correlation and confidence of each enriched term.

PPI network
The PPI network was constructed with STRING and analyzed with Cytoscape (Figure 7A), and the core gene cluster was obtained (Figure 7B). The hub genes were identified using two different algorithms (Figure). At the same time, we used the Metascape website to output the protein interaction network and identify the core module, in order to verify the PPI network results from STRING. Among them, the KIFC2, TUBG1, RACGAP1, CHMP4C, SFN and MYH11 genes were identified as core genes.

Gene expression heat map
We obtained a heat map of the differential expression of core genes between gallbladder carcinoma and normal samples. Five core genes (KNTC1, MCM2, CKAP2, RACGAP1 and CCNB1) were highly expressed in gallbladder carcinoma samples and expressed at low levels in normal samples. The ECT2L, MELK, SPAG5, KIF23 and CHAF1B genes may play a regulatory role in gallbladder carcinoma (Figure 9A shows the result for GSE139682 and Figure 9B the result for GSE202479).

CTD analysis
The core genes were entered into CTD to find diseases related to them. Five genes (KNTC1, MCM2, CKAP2, RACGAP1 and CCNB1) were found to be associated with head and neck squamous cell carcinoma, necrosis, inflammation and hepatomegaly (Figure 10).

DISCUSSION
Gallbladder cancer is a highly malignant disease with a poor prognosis, high rates of invasion and metastasis, and high mortality. GBC is the most invasive biliary tract cancer and has the shortest median survival time [11,12]. The available treatment options vary widely across areas with a high prevalence of gallbladder cancer, resulting in different outcomes for patients in different regions. Even when treated in the most medically advanced regions, malignant tumors of the gallbladder are highly fatal. According to the American Cancer Society, only about one in five gallbladder cancer cases is found while the disease is still confined to the gallbladder. This greatly limits the choice of treatment and reduces the overall survival rate [8,13,14]. The exact molecular mechanism of gallbladder cancer is not clear. Although many studies have reported key genes related to gallbladder cancer, few have shown that these loci can be applied effectively to its diagnosis and treatment. Many genes and molecules play key roles in pathophysiological processes, and understanding their interactions is important for elucidating the molecular mechanism of gallbladder cancer [15]. At present, many bioinformatics techniques can be used to explore intratumoral heterogeneity and cancer progression. The main result of this study is that the KNTC1 and MCM2 genes are highly expressed in gallbladder cancer.
KNTC1 encodes a kinetochore component of the Rod-Zwilch-ZW10 (RZZ) complex, which is key to sister chromatid separation and participates in the spindle checkpoint during mitosis. Kinetochore-associated proteins are key components of mitotic checkpoints and are essential for faithful chromosome separation and spindle assembly during cell division [16-18]. Recent studies have shown that KNTC1 may be a potential biomarker that promotes the occurrence and development of human malignant tumors [19].
Chromosome segregation and cell division are key biological processes in which many evolutionarily conserved protein complexes participate. These proteins are overexpressed in malignant tumors, and some of them have even been identified as oncogenes [20]. Recent advances have shown that kinetochore-associated proteins are up-regulated and play a role in the carcinogenesis of many types of cancer. The literature shows that KNTC1 may affect the biological activity of hepatocellular carcinoma cells through the PI3K/Akt signaling pathway, and that high KNTC1 expression is associated with a poor prognosis [19]. KNTC1 acts as a tumor promoter in the occurrence and development of non-small cell lung cancer by promoting colony formation and cell migration and by inhibiting apoptosis. Notably, KNTC1 may regulate non-small cell lung cancer through its downstream target PSMB8 [21]. In addition, KNTC1 has been reported to be associated with esophageal squamous cell carcinoma (ESCC): down-regulation of KNTC1 expression inhibits cell viability and induces apoptosis in an ESCC cell line [22]. Zhang et al. found that KNTC1 was up-regulated in colon cancer compared with normal tissues, and that high KNTC1 expression was associated with a poor prognosis [18]. These studies suggest that kinetochore-associated proteins can be used as potential biomarkers for the early diagnosis of cancer. Similarly, we found that the KNTC1 gene is highly expressed in gallbladder cancer, and that the higher the KNTC1 expression, the worse the prognosis.
MCM2 encodes a 904-amino-acid protein with a molecular weight of 101,896 Da. MCM2, a member of the minichromosome maintenance protein family, regulates DNA replication and the cell cycle by participating in the formation of the replication initiation complex [23-25]. Previous studies have shown that inhibition of MCM2 reduces cell viability and aggravates apoptosis in cellular models of Alzheimer's disease [26]. The MCM2 protein is overexpressed in the nuclei of highly malignant tumors, which is related to advanced stage and poor prognosis. Gulinisha Aihemaiti found that MCM2 was highly expressed in ovarian clear cell carcinoma and that its expression was mainly confined to the nucleus [27]. In addition, Deng et al. reported that knockdown of MCM2 can improve chemoresistance of ovarian cancer to carboplatin and olaparib [28]. A study of gastric cancer reported that CAMKK2 can overactivate minichromosome maintenance proteins in gastric cancer cells through the MEK/ERK pathway to promote cancer cell proliferation [29]. Similarly, in our study, we found that the MCM2 gene is highly expressed in gallbladder cancer, and that the higher the MCM2 expression, the worse the prognosis. This is consistent with the results of previous studies and suggests that MCM2 may play a role in the progression of gallbladder cancer.
Although this paper has carried out rigorous bioinformatics analysis, it still has some shortcomings. Animal experiments with overexpression or knockdown of the genes were not performed in this study to further verify their function. Therefore, this aspect should be explored in depth in future studies.
To sum up, KNTC1 and MCM2 are highly expressed in gallbladder carcinoma and may contribute to its occurrence and development through multiple mechanisms. KNTC1 and MCM2 may be molecular targets for the early diagnosis and precise treatment of gallbladder cancer, and they provide a basis for studying its mechanism.

Gallbladder cancer data set
The gallbladder cancer datasets GSE139682 and GSE202479, generated on the GPL20795 and GPL24676 platforms, were downloaded from the GEO database. GSE139682 included 10 gallbladder carcinoma samples and 10 normal samples; GSE202479 included 13 gallbladder carcinoma samples and 3 normal samples. These datasets were used to identify the DEGs in gallbladder carcinoma.

Screening of DEGs
Probe aggregation and background correction of the merged matrix of GSE139682 and GSE202479 were performed using the R package "limma". P values were adjusted using the Benjamini-Hochberg method to control the false discovery rate (FDR), and the fold change (FC) was calculated for each gene. The cutoff for DEGs was p less than 0.05 and FC greater than 1.5. The DEGs were visualized in a volcano plot, and the DEGs shared by GSE139682 and GSE202479 were obtained with a Venn diagram.
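The screening rule described here (Benjamini-Hochberg adjustment, a significance cutoff of 0.05, a fold-change cutoff of 1.5, then intersecting the two DEG lists) can be sketched in a few lines of Python. This is only a minimal illustration, not the limma pipeline used in the study: the function names, the toy gene labels and all numbers are invented, and it assumes that the cutoff is applied to the adjusted p value and that a fold change below 1/1.5 counts as down-regulation.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def screen_degs(genes, pvalues, fold_changes, p_cutoff=0.05, fc_cutoff=1.5):
    """Return genes passing both cutoffs (assumption: symmetric in FC)."""
    adjusted = benjamini_hochberg(pvalues)
    log_cutoff = math.log2(fc_cutoff)
    return {
        gene
        for gene, q, fc in zip(genes, adjusted, fold_changes)
        if q < p_cutoff and abs(math.log2(fc)) > log_cutoff
    }


# Toy inputs (invented): one result table per dataset.
degs_a = screen_degs(["g1", "g2", "g3", "g4"],
                     [0.001, 0.002, 0.004, 0.9],
                     [2.4, 1.9, 0.4, 1.02])
degs_b = screen_degs(["g1", "g2", "g3", "g5"],
                     [0.003, 0.01, 0.6, 0.8],
                     [2.0, 1.7, 0.9, 1.0])

# Genes called differentially expressed in both datasets.
shared_degs = degs_a & degs_b
print(sorted(shared_degs))
```

Working on the log2 scale keeps the fold-change filter symmetric, so a gene at FC 0.4 is treated like one at FC 2.5.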

Functional enrichment analysis
Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) are resources for evaluating gene function and biological pathways. The DEGs screened by the Venn diagram were used as input, and the latest KEGG pathway gene annotation, obtained through the KEGG REST API, was used as the background. Gene set enrichment results were obtained using the R package clusterProfiler.
Metascape (http://metascape.org/) is a gene function annotation and analysis tool that helps interpret gene or protein function and can export its results visually. We used Metascape to perform functional enrichment analysis of the above DEG list and exported the results.
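Over-representation analysis of a gene list against GO or KEGG annotation is commonly based on a hypergeometric test. The sketch below shows that calculation for a single term. It illustrates the statistic only and is not a description of clusterProfiler or Metascape internals; the function name and the counts in the example are assumptions made up for the demonstration.

```python
from math import comb


def enrichment_p_value(background_size, term_size, list_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background_size: number of annotated genes in the background
    term_size:       background genes annotated to the term
    list_size:       number of genes in the input list (e.g. DEGs)
    overlap:         input genes annotated to the term
    """
    total = comb(background_size, list_size)
    upper = min(term_size, list_size)
    favourable = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Invented counts: 20000 background genes, 150 in the pathway,
# 230 input genes, 8 of which fall in the pathway.
p = enrichment_p_value(20000, 150, 230, 8)
print(f"p = {p:.3g}")
```

In practice one such p value is computed per term and the whole set is then corrected for multiple testing.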

GSEA
GSEA evaluates microarray data at the level of gene sets and interprets genome-wide expression data on the basis of prior biological knowledge. Samples were divided into a normal group and a gallbladder carcinoma group, and the relevant pathways and molecular mechanisms were evaluated. The minimum gene set size was 5, the maximum gene set size was 5000, and 1000 resamplings were performed. The whole genome was analyzed against GO and KEGG gene sets. The analysis was performed with the GSEA software.
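The core of GSEA is a running-sum statistic computed over a ranked gene list. The sketch below computes that enrichment score and applies a gene-set size filter with the limits of 5 and 5000 given in this section. It is a simplified illustration under assumptions: the ranking metric, the weight of 1 and the toy genes are invented here, and the 1000 resamplings used to estimate significance are omitted.

```python
def passes_size_filter(gene_set, ranked_genes, min_size=5, max_size=5000):
    """Keep a gene set only if its overlap with the ranked list is in range."""
    overlap = len(set(gene_set) & set(ranked_genes))
    return min_size <= overlap <= max_size


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score.

    ranked_genes must already be sorted by descending score. The running
    sum steps up at genes in the set (in proportion to |score|**weight)
    and steps down at genes outside it; the score is the largest
    deviation from zero.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_miss = in_set.count(False)
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    norm = sum(hit_weights)
    if norm == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for w, hit in zip(hit_weights, in_set):
        running += w / norm if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (invented), sorted from most up- to most down-regulated.
ranked_genes = ["a", "b", "c", "d"]
scores = [2.0, 1.0, -1.0, -2.0]
top_set = {"a", "b"}

if passes_size_filter(top_set, ranked_genes, min_size=2):
    print(enrichment_score(ranked_genes, scores, top_set))
```

A positive score means the set is concentrated at the top of the ranking and a negative score at the bottom.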

Construction and analysis of protein-protein interaction (PPI) network
Search Tool for the Retrieval of Interacting Genes (STRING) is a database of known and predicted PPIs; it also contains interactions predicted by bioinformatics methods. The DEGs were input into STRING to construct the PPI network and predict the core genes. The PPI network was then visualized and the core genes were predicted with Cytoscape software. First, the PPI network was imported into Cytoscape; then, the most highly ranked genes were calculated using the MCC and MNC algorithms. Finally, the core genes were obtained after visualization.
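Hub-gene ranking on a PPI network can be illustrated with the MNC (Maximum Neighborhood Component) score. The sketch below assumes the commonly cited definition, in which a node's score is the size of the largest connected component of the subgraph formed by its neighbors. This reading of MNC, the helper names and the toy edge list are assumptions, and the MCC score, which requires enumerating maximal cliques, is not implemented here.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def mnc_score(graph, node):
    """Size of the largest connected component among the node's neighbors."""
    neighbors = set(graph[node])
    seen = set()
    best = 0
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first search restricted to the neighbor subgraph.
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            current = stack.pop()
            size += 1
            for nxt in graph[current] & neighbors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


# Toy interaction list (invented node names).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")]
graph = build_graph(edges)
mnc = {node: mnc_score(graph, node) for node in graph}

# Rank nodes from highest to lowest score; ties are broken by name.
ranking = sorted(mnc, key=lambda n: (-mnc[n], n))
print(ranking)
```

Unlike plain degree, this score rewards nodes whose interaction partners also interact with each other.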

Gene expression heat map
The expression of the core genes from the PPI networks in GSE139682 and GSE202479 was plotted using the R package heatmap, in order to visualize the difference in core gene expression between gallbladder cancer and normal samples.
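Heat maps of this kind are often drawn after scaling each gene across samples, so that genes with different baseline expression share one color scale. The sketch below shows such row-wise z-scoring. Whether the study applied this scaling is not stated, so it should be read as an assumption, and the gene labels and expression values are invented.

```python
from statistics import mean, stdev


def row_zscores(values):
    """Center and scale one gene's expression values across samples.

    Needs at least two samples; a gene with no variation is returned
    as all zeros instead of dividing by zero.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Invented expression matrix: three tumor samples, then three normal samples.
expression = {
    "gene_x": [9.1, 8.7, 9.4, 5.2, 5.0, 5.6],
    "gene_y": [2.0, 2.2, 1.9, 4.1, 4.4, 3.9],
}

scaled = {gene: row_zscores(values) for gene, values in expression.items()}
for gene, values in scaled.items():
    print(gene, [round(v, 2) for v in values])
```

After scaling, positive cells mark samples where a gene is above its own average and negative cells mark samples below it.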

CTD analysis
CTD is a powerful public database that provides manually curated information on chemical-gene/protein interactions and gene-disease relationships. The core genes were input into CTD to find the diseases most related to them. Excel was used to draw a radar map of the differential expression of each gene.

The miRNA
TargetScan (https://www.targetscan.org) predicts and analyzes miRNAs and their target genes. In this study, TargetScan was used to screen for miRNAs that regulate the central DEGs.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

AUTHOR CONTRIBUTION
Wei Jia performed the experiments and was a major contributor in writing the manuscript. Wei Jia was involved in critically revising the manuscript for important intellectual content. Wei Jia made substantial contributions to the research conception and designed the draft of the research process. Chao Wang analyzed the animal data regarding atherosclerosis. Chao Wang was a major contributor in submitting the manuscript. Chao Wang provided technical support for the experimental methods. All authors read and approved the final manuscript.