Research Paper

Integrative analysis of DNA methylation and gene expression reveals distinct hepatocellular carcinoma subtypes with therapeutic implications

Xiaowen Huang 1,*, Chen Yang 2,*, Jilin Wang 1, Tiantian Sun 1, Hua Xiong 1

  • 1 State Key Laboratory of Oncogenes and Related Genes, Key Laboratory of Gastroenterology and Hepatology, Ministry of Health, Division of Gastroenterology and Hepatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University, Shanghai Cancer Institute, Shanghai Institute of Digestive Disease, Shanghai, China
  • 2 State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
* Equal contribution

Received: December 9, 2019; Accepted: March 2, 2020; Published: March 22, 2020

Copyright © 2020 Huang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


We aimed to develop an HCC classification model based on integrated gene expression and methylation data from methylation-driven genes. Genomic, methylomic, transcriptomic, proteomic and clinical data of 369 HCC patients from The Cancer Genome Atlas Network were retrieved and analyzed. Consensus clustering of the integrated gene expression and methylation data from methylation-driven genes identified four HCC subclasses (HS1 to HS4) with significantly different prognoses. HS1 was well differentiated and had a favorable prognosis. HS2 had a high serum α-fetoprotein level, which correlated with its poor outcome, and a high percentage of CTNNB1 mutations, which corresponded with its activation of the WNT signaling pathway. HS3 was well differentiated, with a low serum α-fetoprotein level, and was enriched in metabolism signatures but barely involved in immune signatures. HS3 also had a high percentage of CTNNB1 mutations and was therefore enriched in the WNT activation signature. HS4 was poorly differentiated, had the worst prognosis, and was enriched in immune-related signatures but barely involved in metabolism signatures. Subsequently, a prediction model was developed that assigned potential HCC samples to the same subclasses as in the training cohort with high sensitivity and specificity. In conclusion, this work sheds light on HCC patient prognostication and on predicting response to targeted therapy.
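The consensus clustering step named above can be illustrated with a minimal sketch. It is not the authors' pipeline: plain k-means stands in for whichever base clustering algorithm was actually used, the ten two-dimensional points are invented toy data rather than expression or methylation values, and the function names (`kmeans`, `consensus_matrix`, `consensus_clusters`) are hypothetical. The idea is to cluster many random subsamples, record how often each pair of samples lands in the same cluster, and then group samples by that co-clustering frequency.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(points, k, n_runs=50, frac=0.8, seed=0):
    """Fraction of subsampled runs in which each pair of samples co-clustered."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were both drawn
    for _ in range(n_runs):
        idx = rng.sample(range(n), max(k, int(frac * n)))
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def consensus_clusters(matrix, k):
    """Average-linkage agglomeration on the consensus matrix down to k groups."""
    clusters = [[i] for i in range(len(matrix))]

    def mean_consensus(pair):
        left, right = clusters[pair[0]], clusters[pair[1]]
        total = sum(matrix[i][j] for i in left for j in right)
        return total / (len(left) * len(right))

    while len(clusters) > k:
        pairs = [(a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        a, b = max(pairs, key=mean_consensus)
        merged = clusters.pop(b)  # b > a, so index a is still valid
        clusters[a].extend(merged)
    return clusters


# Invented toy data: two tight, well-separated groups of five samples each.
toy = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
       (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
matrix = consensus_matrix(toy, k=2)
groups = consensus_clusters(matrix, k=2)
print(sorted(sorted(g) for g in groups))
```

In practice the number of subclasses is usually chosen by comparing how crisp the consensus matrix is across several candidate values of k; the sketch fixes k = 2 only to keep the toy example small.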


AFP: α-fetoprotein; APC: antigen presenting cells; CIMP: CpG island methylator phenotype; CNMF: consensus nonnegative matrix factorization; CNV: copy number variation; DEG: differentially expressed genes; ECM: extracellular matrix; EMT: epithelial-mesenchymal transition; GSVA: Gene Set Variation Analysis; HCC: hepatocellular carcinoma; HS: HCC Subclass; ICI: immune checkpoint inhibitors; IFN: interferon; LASSO: Least Absolute Shrinkage and Selection Operator; MCP-counter: microenvironment cell populations-counter; MDG: methylation-driven gene; ML: machine learning; MST: median survival time; NTP: nearest template prediction; OS: overall survival; RF: random forest; RFS: recurrence free survival; ROC: receiver operating characteristic; RPPA: Reverse Phase Protein Array; TCGA: The Cancer Genome Atlas; TPM: Transcripts per kilobase million; TSG: tumor suppressor genes.