Research Paper, Volume 12, Issue 13, pp 13502–13517

Weighted correlation gene network analysis reveals a new stemness index-related survival model for prognostic prediction in hepatocellular carcinoma

Qiujing Zhang1,*, Jia Wang1,2,*, Menghan Liu3, Qingqing Zhu1, Qiang Li4, Chao Xie1, Congcong Han1, Yali Wang1, Min Gao5, Jie Liu1

  • 1 Department of Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan 250117, Shandong, China
  • 2 Department of Oncology, Zibo Maternal and Child Health Hospital, Zibo 255000, Shandong, China
  • 3 Basic Medicine College, Shandong First Medical University, Taian 271016, Shandong, China
  • 4 Department of Oncology, Mengyin County Hospital, Linyi 276299, Shandong, China
  • 5 Department of Radiotherapy, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan 250117, Shandong, China
* Equal contribution

Received: February 13, 2020       Accepted: May 27, 2020       Published: July 9, 2020

Copyright © 2020 Zhang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


In this study, we constructed a new survival model using the mRNA expression-based stemness index (mRNAsi) for prognostic prediction in hepatocellular carcinoma (HCC). Weighted correlation network analysis (WGCNA) of HCC transcriptome data (374 HCC and 50 normal liver tissue samples) from the TCGA database revealed 7498 differentially expressed genes (DEGs) that clustered into seven gene modules. LASSO regression analysis of the top two gene modules identified ANGPT2, EMCN, GLDN, USHBP1 and ZNF532 as the top five mRNAsi-related genes. We constructed our survival model with these five genes and tested its performance using 243 HCC and 202 normal liver samples from the ICGC database. Kaplan-Meier survival curve and receiver operating characteristic (ROC) curve analyses showed that the survival model accurately predicted the prognosis and survival of high- and low-risk HCC patients with high sensitivity and specificity. The expression of these five genes was significantly higher in HCC tissues than in normal liver tissues in the TCGA, ICGC, and GEO (GSE25097 and GSE14520) datasets. These findings demonstrate that a new survival model derived from five strongly correlating mRNAsi-related genes provides highly accurate prognoses for HCC patients.
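
As an illustration of how a five-gene survival model of this kind is typically applied, the sketch below computes a linear risk score per patient and splits patients into high- and low-risk groups. It is a minimal sketch under stated assumptions, not the authors' implementation: the coefficient values are hypothetical placeholders (the fitted LASSO coefficients are not given in the abstract), the median cutoff is an assumed convention, and the function names `risk_score` and `stratify` are introduced here for illustration only.

```python
from statistics import median

# HYPOTHETICAL coefficients for illustration only; the abstract does not
# report the fitted LASSO values for these five genes.
COEFS = {
    "ANGPT2": 0.10,
    "EMCN": -0.05,
    "GLDN": 0.20,
    "USHBP1": 0.15,
    "ZNF532": 0.08,
}


def risk_score(expression):
    """Weighted sum of the five genes' expression values for one patient."""
    return sum(coef * expression[gene] for gene, coef in COEFS.items())


def stratify(samples):
    """Label each patient 'high' or 'low' risk.

    `samples` maps a patient ID to a dict of gene -> expression value.
    The cohort median of the risk score is used as the cutoff; this is an
    assumed convention, not one stated in the abstract.
    """
    scores = {pid: risk_score(expr) for pid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
    return groups, cutoff
```

The two resulting groups are what a Kaplan-Meier comparison would then contrast, and the continuous score is what an ROC analysis would evaluate.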


HCC: hepatocellular carcinoma; DEGs: differentially expressed genes; AFP: α-fetoprotein; PIVKA-II: Protein induced by vitamin K absence-II; DCP: Des-gamma carboxyprothrombin; NGS: new generation sequencing; PCDH19: Protocadherin 19; GPC3: Glypican-3; CYP3A4: Cytochrome P450 Family 3 Subfamily A Member 4; YTHDF1: YTH N6-Methyladenosine RNA Binding Protein 1; DCAF13: DDB1 and CUL4 associated factor 13; WGCNA: Weighted correlation network analysis; mRNAsi: mRNA expression-based stemness index; LASSO: the least absolute shrinkage and selection operator; TCGA: The Cancer Genome Atlas; GMs: gene modules; MS: module significance; EREG-mRNAsi: epigenetically regulated mRNAsi; GS: gene significance; MM: module membership; ANGPT2: Angiopoietin 2; EMCN: Endomucin; GLDN: Gliomedin; USHBP1: USH1 Protein Network Component Harmonin Binding Protein 1; ZNF532: Zinc Finger Protein 532; ROC: receiver operating characteristic; AUC: area under the curve; ICGC: International Cancer Genome Consortium; GEO: the Gene Expression Omnibus; NPM1: nucleophosmin; VETC: vessels that encapsulated tumor clusters; VEGF: vascular endothelial growth factor; VEGFR2: vascular endothelial growth factor receptor 2; OS: overall survival; GWAS: Genome-Wide Association Study; LIHC: liver hepatocellular carcinoma; FDR: false discovery rate; FC: fold change; MEs: module eigengenes.