Establishment and verification of a prognostic model based on coagulation and fibrinolysis-related genes in hepatocellular carcinoma

Background: Studies have shown that coagulation and fibrinolysis (CFR) are correlated with Hepatocellular carcinoma (HCC) progression and prognosis. We aim to build a model based on CFR-correlated genes for risk assessment and prediction of HCC patient. Methods: HCC samples were selected from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases respectively. The Molecular Signatures Database (MSigDB) was used to select the CFR genes. RiskScore model were established by single sample gene set enrichment analysis (ssGSEA), weighted correlation network analysis (WGCNA), multivariate Cox regression analysis, LASSO regression analysis. Results: PCDH17, PGF, PDE2A, FAM110D, FSCN1, FBLN5 were selected as the key genes and designed a RiskScore model. Those key genes were Differential expressions in HCC cell and patients. Overexpression PDE2A inhibited HCC cell migration and invasion. The higher the RiskScore, the lower the probability of survival. The model has high AUC values in the first, third and fifth year prediction curves, indicating that the model has strong prediction performance. The difference analysis of clinicopathological features found that a great proportion of high clinicopathological grade samples showed higher RiskScore. RiskScore were positively correlated with immune scores and TIDE scores. High levels of immune checkpoints and immunomodulators were observed in high RiskScore group. High RiskScore groups may benefit greatly from taking traditional chemotherapy drugs. Conclusions: We screened CFR related genes to design a RiskScore model, which could accurately evaluate the prognosis and survival status of HCC patients, providing certain value for optimizing the clinical treatment of cancer in the future.


INTRODUCTION
HCC is a serious threat to human health with strong invasion and high mortality [1].In 2020, an estimated 9.06 million new cases of primary liver cancer were diagnosed all over the world, and about 8.3 million people died from liver cancer.The occurrence and death rate of liver cancer have been increasing annually [2].Patients with cirrhosis have a very high likelihood of converting to liver cancer, and imaging and ultrasound monitoring are often used in early diagnosis of tumor when treatment is most effective [3].Due to the allocation of medical resources and the latent nature of early liver cancer, most patients are diagnosed in the middle and late stages, which will increase the cost and risk of treatment [4,5].Despite advances in the treatment of liver cancer [6], most patients with HCC do not respond well to cell programmed death 1 (anti-PD1) therapy [7].Therefore, the identification of key molecular features [8] with advanced techniques [9,10] related to HCC is helpful for the specific mechanism research of HCC, and may provide new ideas for the search for more potential therapeutic targets and prognostic diagnosis in the future.
In recent years, the interaction between the growth of malignant tumor cells and the coagulation and fibrinolytic (CFR) systems has attracted much attention.Experiments have shown that targeting CFR systems may promote the course of malignant disease [11].Cancers such as HCC are more likely to cause the formation of a variety of blood clots, including portal vein thrombosis [12].Anticoagulant therapy, such as the application of heparin, may help us with antitumor therapy and can be involved in the intervention of many cancers and other diseases [13].The tumor microenvironment (TME) and the coagulation cascade are related to malignancies [14].The TME is a key mediator of cancer progression and treatment outcome, and is closely related to the outcome of immunotherapy [15].HCC studies have found a significant correlation between coagulation-related genes and TME, and the coagulation-related genes have the potential as prognostic biomarkers [16].Plasma fibrinogen is associated with the development and prognosis of a variety of cancers and can be used as a prognostic biomarker of some cancers [17].Hyun Kyung Kim et al. found higher D-dimer and plasma fibrinogen levels in patients with advanced HCC, as well as a similarly high level of tumor thrombosis, suggesting that CFR continues to occur during tumor progression [18].After HCC resection, there was also a significant increase in levels of the plasminin-alpha-2-plasminin-inhibitor complex [19].Fibrinogen-like-protein 1 (FGL1) belongs to a relatively common branch of the fibrinogen-related protein (FREP) family.High level of FGL1 expression is associated with the development and prognosis HCC and is a potential prognostic biomarker [20].Coagulation-related genes and fibrinolytic systems are associated with the development and prognostic effects of HCC.
At present, the relationship between CFR related genes and HCC development and prognosis remains unclear.In this study, we will enrich and screen CFR related genes based on multiple data sets to design a RiskScore model and test the predictive function of the model.Next, we will analyze the correlation between RiskScore and clinical indicators.We hope that this model can provide a new reference for the study of the pathogenesis of HCC and the optimization of clinical treatment plan.

Data acquisition and preprocessing
(1) RNA-Seq data of TCGA-HCC were acquired from The Cancer Genome Atlas (TCGA) database using the GDC API [21].371 HCC samples and 52 para-cancer tissue samples were acquired.HCC samples were retained according to the clinicopathological information table, and samples with no survival state and time were removed, then log2 (TPM+1) was used in calculating the mRNA expression level.A total of 355 samples were screened.Detailed clinicopathological information was listed in Table 1.
(2) GSE14520 expression data of HCC including 221 tumor samples were acquired from The Gene Expression Omnibus (GEO) database [22].The GEO data was preprocessed as follows: We obtained the annotation data contained in the chip platform, and mapped the probe to the gene according to the annotation information to obtain the required expression level of each gene, then log2 (TPM+1) was used in calculating the mRNA expression level.A total of 221 samples were picked out.Detailed clinicopathological information was listed in Table 1.

Screening and enrichment of differential expressed genes (DEGs)
We used limma packages to screen for DEGs between HCC and normal samples (P < 0.05) [24].We used the ggplot2 package to make the volcano plot of DEGs [25].Finally, we chose the clusterProfiler package for gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis [26].

WGCNA
WGCNA was used to select the gene modules most associated with CFR scores [27].Firstly, the half of genes with the smaller mean absolute deviation [4] in the gene expression profile were excluded.The soft threshold (β) is obtained from the pickSoftThreshold() function included in WGCNA.Next, the gene modules were judged and divided by hierarchical clustering method, and the criteria of at least 50 genes per module (minModuleSize = 50) was used to find gene modules.Then we look for associations between modules and sample properties.Finally, we obtained the genes in the module showing the highest correlation and defined them as CFR related genes.

Screening of key genes
DEGs was intersected with the module with the highest correlation, and the significant prognostic genes were acquired by univariate Cox regression analysis using coxph function of survival package (P < 0.05) [28].The LASSO regression algorithm and randomForest algorithm were used to precisely narrow down the range of significant prognostic genes [29,30].The intersection of genes obtained by LASSO and randomForest algorithm was employed to reduce genes in the model by stepwise regression.Finally, we AGING obtained the key genes that influence prognosis, and drew the multivariate forest plot of the key genes.

Establishment and verification of RiskScore model
The TCGA-HCC dataset was used as a training set, and the GSE14520 dataset was a verification set.RiskScore is calculated as follows: Where i refers to the expression level of key genes, and β is the Cox regression coefficient of key genes.
Next, we calculated RiskScore for each sample of TCGA-HCC dataset.Patients are differentiated into high RiskScore and low RiskScore groups based on the most appropriate RiskScore threshold obtained from the survminer package.We used timeROC package for ROC analysis to analyze the prognostic efficiency of 1, 3 and 5 years.The AUC value of ROC curve was obtained and analyzed to judge the accuracy of the RiskScore model.Kaplan-Meier curve was made to analyze the prognostic survival chance.To better verify the model robustness, the GSE14520 dataset was validated using the same method.

Analysis of clinicopathological features and construction of a nomogram
We drew a clustering heat map to compare the differences in clinicopathological features between RiskScore groups in TCGA-HCC dataset.Violin plots were also drawn to analyse the differences in clinicopathological grades between RiskScore groups.Univariate and multivariate Cox regression analysis was performed for Riskscore and clinicopathological features, and independent prognostic factors were selected.To assess the survival risk of patients, we combined RiskScore and independent prognostic factors to establish a nomogram through the rms package.Calibration curve is used to evaluate the prediction accuracy of the model.Besides, ggDCA package was used to make Decision curve (DCA) to analyse the reliability of the model [31].Finally, ROC curve of various clinicopathological features on overall survival (OS) in 1-5 years was drawn [32].

Gene set enrichment analysis (GSEA)
CFR score of TCGA-HCC samples and normal samples was acquired by ssGSEA [33].The expression profiles of TCGA data sets were processed by Gene Set Variation Analysis (GSVA), and the score of each sample in the HALLMARK Pathway was calculated [34].The correlation between RiskScore and channel score was calculated.We analyzed the differentially activated pathways, and GSEA was performed using theh.all.v7.5.1.symbols.gmtgene set in the MSigDB database [35].FDR < 0.05 is a significant enrichment.Mutated data set of TCGA processed by mutect2 software was downloaded.We used Fisher's exact test to acquire genes with high-frequency mutations in different groups, and drew a waterfall map.

Immune microenvironment and drug sensitivity analysis
The relative abundance of 22 kinds of immune cells was determined by CIBERSORT package [36].Six immune cell scores were calculated by TIMER [37].We have obtained immunomodulators related to a variety of different immune processes (antigen presentation, cell adhesion, co-inhibitors, co-stimulants, ligands, receptors, and others) from existing studies [38].The potential clinical effects of immunotherapy across RiskScore groups were analyzed using Tumor Immune Dysfunction and Exclusion (TIDE) method [39].Finally, the sensitivity of different RiskScore groups to conventional chemotherapy drugs was calculated using pRRophetic package [40].

Cell culture and transfection
HCC cell lines (HUH7) and Normal liver cell line (L0-2) were bought from the American Type Culture Collection (ATCC) cell bank.Cells were cultured in DMEM culture medium (Gibco, USA), supplemented with 10% FBS under a standard humidified incubator (5% CO2 at 37°C).The negative control (oe NC) and oe CEMIP (Sagon, China) were sub-cloned into the pCDNA3.1 expression vector (Invitrogen, Carlsbad, USA) and transfected into the cells by Lipofectamine 3000 reagent (Invitrogen, MA, USA).

Quantitative real-time polymerase chain reaction (qRT-PCR)
Total RNA was extracted and reverse transcribed (Takara Bio, China).The expressions of different genes were detected by the PowerUp SYBR Master Mix Applied Biosystems (Invitrogen, USA).The sequences of primer pairs for the genes targeted were listed in Table 2. Gene mRNA levels were normalized to GAPDH mRNA levels with the formula: 2-( CT target-CT GAPDH ), where CT is the threshold cycle.

Migration and invasion assays
Migration and invasion experiments were carried out three times as described previously [41].Briefly, cell invasion was tested employing Matrigel-coated transwell tinserts (Transwell, Corning Costar) with a total of 4 × 10 4 cells were added to the upper chamber.After incubation at 37°C for 24 hours, the non-invading cells were removed, and invaded cells were counted in five random fields with a microscope.The procedure for cell migration is similar to that of invasion, except for no matrigel.

Statistical analysis
This study mainly uses R software for statistical analysis.The comparison between the different groups was analyzed by Wilcoxon test.The Spearman method was used to calculate correlation.The P-values with significant differences were all less than 0.05.

Availability of data and materials
The datasets generated and/or analyzed during the current study are available in the (GSE14520) repository, (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=G SE14520).

Screening and enrichment of DEGs
The TCGA-HCC sample had a lower CFR score (Figure 1A).6724 up-regulated genes and 822 downregulated genes were obtained by screening the DEGs between TCGA-HCC samples and normal samples (Figure 1B).
We used GO and KEGG enrichment analysis of genes up-regulated and found that the results were significantly enriched in cell cycle related pathways and biological processes.According to the enrichment results of top 10 KEGG pathways, the DNA replication pathway was the most significant.From the results of cellular component, nuclear chromosome part and nuclear chromosome are the most significant components.In terms of biological process results, ncRNA metabolic process and mRNA processing were the most significant biological processes.From the results of molecular function, structural constituent of ribosome was the most significant function (Figure 1C).

WGCNA identifies genes associated with CFR score
The samples were clustered to screen co-expression modules.When the soft threshold β = 22 is selected, the network is scale-free (Figure 2A).Next, hierarchical clustering was used to divide gene modules.Six co-expression modules were generated after the combination of modules, among which grey module contains genes that cannot be classified into other modules.The number of genes in each module was shown in Figure 2B.Comparison of correlation between each module and clinicopathological features showed that lightcyan module had a positive correlation with CFR score (Figure 2C).Next, we conducted gene enrichment analysis in lightcyan module, and these genes were significantly enriched in biological processes related to angiogenesis (Figure 2D).

Screening of key genes
By intersecting the DEGs and lightcyan modules, 159 genes were obtained (Figure 3A).Univariate Cox analysis of 159 genes showed that 41 genes with greater prognostic influence were identified by consensus (P < 0.05).According to LASSO analysis, the number of independent variable coefficients approaching 0 rose  together with the progressive increase in lambda (Figure 3B).The model was built using 10-fold crossvalidation, which was then used to examine confidence intervals.We chose 13 genes as candidate genes at lambda = 0.0277, as this was when the model was the most accurate (Figure 3C).According to randomForest algorithm screening, we obtained the importance of genes and ranked the top 20 genes for further analysis (Figure 3D).The genes obtained by the two algorithms were intersected, and there were 9 overlapping genes (Figure 3E).Then the range of genes is compressed by stepwise regression.Finally, 6 genes were acquired as key genes affecting prognosis, including PCDH17, PGF, PDE2A, FAM110D, FSCN1 and FBLN5.Subsequently, multivariate forest map of key genes was drawn (Figure 3F).
After calculating the RiskScore of each sample in TCGA-HCC, they were divided into high and low RiskScore groups for ROC analysis of prognosis.The prognosis classification of prediction at 1, 3 and 5 years was analyzed respectively, and the results showed that the AUC at 1 year was 0.75, the AUC at 3 years was 0.73, and the AUC at 5 years was 0.69, indicating that the model had a high AUC value (Figure 4A).Meanwhile, the survival probability of the low RiskScore group was significantly better (P < 0.0001, Figure 4B, 4C).The expression of key genes in the high and low RiskScore groups showed that FBLN5, PDE2A and FAM110D were more prominent in the low RiskScore group, while PCDH17, PGF and FSCN1 were more prominent in the high RiskScore group (Figure 4D).We used the same method to verify the GSE14520 dataset, and the results showed that the AUC value was still high at 1, 3 and 5 years (Figure 4E).The survival probability of the low RiskScore group was significantly higher (Figure 4F, 4G).The expression level of key genes in GSE14520 dataset was similar to that of the TCGA-HCC dataset (Figure 4H).Therefore, the model has good predictive ability.

Analysis of clinicopathological features and verification of a nomogram
By comparing the clinicopathologic characteristics between the different RiskScore groups in the TCGA cohort, the higher the RiskScore, the larger the proportion of samples with high clinicopathological grade (Figure 5A).The differences in RiskScore among different clinicopathological grades, including T.stage, N.stage, M.stage, Stage and Grade, were also compared.The results showed that RiskScore also increased with the increase of each clinicopathological grade (Figure 5B).We also compared the prognostic differences between high and low risk groups in different Gender, Age Stages, and Grade groups in the TCGA dataset.We found that our risk group displayed significant difference in survival across different clinical groups, demonstrating that risk grouping has strong independence and is not easily influenced by other clinical factors (Supplementary Figure 1).
The P-values of RiskScore and Stage were both less than 0.001 in univariate Cox analysis (Figure 5C).The P-values of RiskScore and Stage were both less than 0.005 in the multivariate Cox analysis (Figure 5D).Stage and RiskScore were important prognostic variables.A nomogram was created using Stage, RiskScore, and Stage combined.The findings demonstrated that RiskScore had the greatest influence on the prediction of survival chances (Figure 5E).The fact that the 1, 3, and 5 year(s) of anticipated calibration points were near the standard curve indicated that the nomogram had a strong predictive potential (Figure 5F).DCA results show that RiskScore and nomogram have higher benefits than extremum curves, so they both have strong stability and prediction ability (Figure 5G).Compared with other clinicopathological features, both nomogram and RiskScore showed the better survival prediction ability (Figure 5H).

Immunological abnormalities and drug susceptibility between RiskScore groups
The relative abundance of 22 distinct immune cell types was calculated using the CIBERSORT method, and 12 immune cell types, such as T cells CD4 memory active and T cells CD4 memory resting, significantly differed across various RiskScore groups (Figure 6A).The immunity scores were calculated using the TIMER, and the high RiskScore group had a higher immunity score (Figure 6B).Common immune checkpoints in the high RiskScore group showed high expression (Figure 6C).We compared the expression of common immunomodulators in different RiskScore groups, and the high RiskScore group also showed high expression (Figure 6D).Finally, TIDE was used to analyze the potential clinical effects of immunotherapy between RiskScore groups, and we found that RiskScore and TIDE score showed a significant negative correlation, suggesting that immunotherapy may be favorable in the high RiskScore group (Figure 6E).Additionally, we analyzed how various RiskScore groups responded to conventional chemotherapy medications.More patients in the low RiskScore category were susceptible to erlotinib.Five conventional chemotherapy treatments, namely, S-Trityl-L-cysteine, Cyclopamine, Paclitaxel, and Crizotinib, were more likely to be advantageous to the high RiskScore group (Figure 6F).

Abnormal biological pathways and gene mutations between different RiskScore groups
The correlation between RiskcSore and HALLMARK pathway score was calculated.RiskScore was positively correlated with cell cycle related pathways, and G2M CHECKPOINT was the most positively correlated pathway.RiskScore was negatively correlated with metabolism related pathways, and FATTY ACID METABOLISM was the most negatively correlated pathway (Figure 7A).The high RiskScore group was mainly enriched in cell cycle related pathways, while the low RiskScore group had no significantly enriched pathways (Figure 7B).The analysis of the differences of mutant gene characteristics between different groups showed that the mutation frequency of TP53 was higher in the high RiskScore group, and more missense mutations occurred (Figure 7C).

Validation the expression of model genes in HCC cell lines
In this work, we selected 6 hub genes for RiskScore model construction.The expressions of these genes were validated in a vitro experiment.As seen in Figure 8A-8F, we could see that the levels of FAM110D, FBLN5 and PDE2A were all elevated in HUH7 cells compared with L0-2 cells (p < 0.01).While the levels of PCDH17, PGF and FSCN1 displayed opposite tendencies.IHC analysis also had similar results (Supplementary Figure 2).

Overexpression of PDE2A depressed cell metastasis of HCC cells
As PDE2A was negatively correlated with tumor growth status, vascular infiltration, and promoted the HCC progression [42], and metastasis is one of the leading causes of cancer death [43].Thereby, we detected the action of PDE2A on HCC development via cell migration and invasion assays.As displayed in

DISCUSSION
Abnormal expression of CFR related genes has been detected in a variety of cancers.Coagulation phenomena such as thromboembolism, including venous and arterial events, are frequently found in cancer patients and are directly or indirectly associated with cancer mortality, morbidity and the need for anticoagulant therapy [44].Currently, the involvement of coagulation-related proteins in angiogenesis and cancer cell proliferation has been studied, and it has been found that inhibiting the clotting system may be helpful in reducing the development of tumors [45].K-D Chen et al. found that tissue factor (TF) and coagulation factor VII (FVII) can jointly act on protease-activating receptor 2 (PAR2) to promote tumor growth, so coagulation factor can also accelerate tumor progression under certain conditions [46].The Plasma TF levels in HCC patients affect tumor differentiation, proliferation and cancer progression to some extent [47].After transcatheter hepatic artery embolization, thrombin-antithrombin III complex (TAT) levels in HCC patients were significantly increased, while antithrombin III and antiplasmin (α-2-plasmin inhibitor) levels were significantly decreased [48].However, there are few studies on the association between HCC and CFR.Therefore, a RiskScore model was designed by combining the characteristics of CFR related genes in HCC to better assess the prognosis of patients.
In this study, six key genes, PCDH17, PGF, PDE2A, FAM110D, FSCN1 and FBLN5, were found to have a great influence on survival probability, and the model was built based on them.There have been studies showing strong links between these key genes and cancer.According to Yan Liu et al., a lower expression level of PCDH17 in HCC tissues may contribute to patients' poor prognoses, while higher levels of PCDH17 expression can more effectively suppress HCC from growing and metastasizing [49].We detected that PCDH17 was influenced by other genes in HCC.For example, Xiang, Y et al. found that MiR-23a-3p could facilitate G1/S cell cycle transition by inhibiting the expression of PCDH17 in HCC [50].Also, PCDH17 is modulated by methylation of DNMT3B and had an impact on the malignant biological action on HCC [49].Alpini et al. found that the expression level of PGF protein in HCC tissues was significantly higher, which may be helpful in inducing tumor angiogenesis, which may be related to the prognostic recurrence of HCC [51].The expression of PDE2A was down-regulated in various cancer types including HCC.PDE2A was negatively correlated with tumor growth status, vascular infiltration, and promoted the HCC progression through modulating mitochondrial morphology and ATP content [42] as well influencing cell cycle of HCC [52].In our cell experiment, we also found that elevated PDE2A could repress the metastasis of HCC cells.The immune function of HCC patients with low expression of PDE2A showed a downward trend [42].There are few studies on FAM110D gene, but some studies have found that another FAM110 gene, FAD110B, can inhibit the growth and invasion of lung adenocarcinoma by inactivating the Wnt/β-catenin signaling pathway [53].FAM110A may induce the occurrence and progression of pancreatic cancer [54].Here the reduced levels of FAM110D in HCC cells may imply its anticancer effect.Fascin homologous 1 (FSCN1) was identified as a direct target of miR-539, a tumor suppressor, and its high expression level enhanced the aggressiveness of HCC cells [55].Similarly, its high expression level was also observed in our validating experiment.Jia-Cheng Tang et al. found that FBLN-5 expression was down-regulated in HCC.FBLN-5 can inhibit tumor cell movement and proliferation through integrin-dependent mechanism, and its downregulation is significantly associated with advanced cancer cell metastasis [56].Collectively, in this project, these genes were selected and validated as the key genes affecting prognosis.
According to the above results, the higher the RiskScore, the lower the survival probability, and the higher the proportion of samples with high clinicopathological grade.Moreover, in pathway analysis, RiskScore was positively correlated with the expression level of cell cycle related pathways, which might be related to abnormal regulation of cancer cell cycle [57].It was found that G2M CHECKPOINT was the pathway with the strongest positive correlation, and FATTY ACID METABOLISM was the pathway with the strongest negative correlation.The alteration of lipid metabolism pathway may influence the occurrence, proliferation and spread of cancer [58].The analysis of gene characteristics of gene mutation showed that patients with high RiskScore had a significantly increased probability of TP53 mutations.TP53, an important tumor suppressor gene, is frequently observed to have mutations in the genome of patients with various cancers.Mutation of TP53 will reduce its ability to inhibit tumor proliferation.And mutated p53 protein does not have normal biological function and may even induce tumor [59].We also found significant increases in immune checkpoint and immunomodulator expression levels in the high RiskScore state.The reason may be that overexpression of immune checkpoint genes may hinder immune activation in HCC and promote the escape and spread of cancer cells [60].Next, the limitations and development directions of this study will be elaborated.The collection amount of tumor sample data and gene sample data is small, and there is a lack of complete clinicopathological studies.More samples of clinicopathological data need to be integrated for later analysis.In addition, the specific role of CFR related genes in HCC and the pathogenesis of HCC have not been clarified, and further studies are still needed for HCC.

CONCLUSIONS
In this study, CFR related genes were enriched and screened, and six key genes, namely PCDH17, PGF, PDE2A, FAM110D, FSCN1 and FBLN5, were obtained.The RiskScore model was constructed based on six key genes, which could effectively predict the survival of patients, and may provide new ideas for targeted therapy and specific mechanism research of HCC.

Figure 1 .
Figure 1.Screening and enrichment of DEGs.(A) Difference in CFR score between tumor and normal control samples in the TCGA cohort; (B) Volcano plot for screening differential expressed genes in the TCGA cohort; (C) Enrichment of tumor up-regulated genes by GO and KEGG analysis.

Figure 2 .
Figure 2. WGCNA identifies genes associated with CFR score.(A) Selection of soft threshold (β); (B) The number of module genes; (C) Correlation between feature vectors of modules and clinicopathological information; (D) Gene functional enrichment analysis of light cyan module.

Figure 3 .
Figure 3. Screening of key genes.(A) Intersection of differential expressed genes and WGCNA screened module genes; (B) The trajectory of each independent variable; (C) Confidence interval for each lambda; (D) Sequencing of genes screened by randomForest algorithm; (E) The result intersection of LASSO algorithm and randomForest algorithm; (F) Multivariate forest map of key genes.

Figure 4 .
Figure 4. Design and verification of RiskScore model.(A) ROC curves for 1, 3 and 5 years of RiskScore model in TCGA-HCC; (B) KM curve of different RiskScore groups in TCGA-HCC; (C) Survival state between different RiskScore groups of TCGA-HCC; (D) Expression of key genes in TCGA-HCC; (E) ROC curves for 1, 3 and 5 years of RiskScore model in GSE14520 dataset; (F) KM curve of different RiskScore groups in GSE14520 dataset; (G) Survival state between different RiskScore groups of GSE14520 dataset; (H) Expression of key genes in GSE14520 dataset.* p < 0.05.

Figure 5 .
Figure 5. Analysis of clinicopathological and nomogram.(A) RiskScore and distribution of clinicopathological features; (B) Analysis of different clinicopathological grades between different RiskScore groups; (C) Univariate Cox analysis of RiskScore and clinicopathological features; (D) Multivariate Cox analysis of clinicopathological features and RiskScore; (E) Construction of the nomogram; (F) 1, 3, 5 years calibration curve of the nomogram; (G) The decision curve of RiskScore, nomogram and clinicopathological features; (H) The ROC curves of a variety of clinicopathological features for overall survival (OS) at 1~5 years.

Figure 6 .
Figure 6.Immunological abnormalities and drug susceptibility between RiskScore groups.(A) Differences in immune infiltration by CIBERSORT algorithm; (B) Differences in immune score by TIMER; (C) Analysis of immune checkpoint gene expression; (D) Differences in the expression of immunomodulators; (E) Correlation between RiskScore and TIDE score; (F) Comparison of drug sensitivity between different RiskScore groups.

Figure 7 .
Figure 7. Abnormal biological pathways and gene mutations between different RiskScore groups.(A) Correlation between RiskScore and HALLMARK pathway score; (B) GSEA analysis between RiskScore groups; (C) Waterfall map of mutant gene differences between different RiskScore groups.