Research Paper Volume 13, Issue 13 pp 17592–17606

DNA methylation biomarkers for diagnosis of primary liver cancer and distinguishing hepatocellular carcinoma from intrahepatic cholangiocarcinoma

Yi Bai1, Wen Tong2, Fucun Xie3, Liuyang Zhu2, Hao Wu2, Rui Shi1, Lianjiang Wang1, Long Yang1, Zhisong Liu4, Fei Miao4, Qiang Zhao5, Yaming Zhang1

  • 1 Department of Hepatobiliary Surgery, Tianjin First Central Hospital, School of Medicine, Nankai University, Tianjin, China
  • 2 Tianjin First Central Hospital Clinic Institute, Tianjin Medical University, Tianjin, China
  • 3 Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS and PUMC), Beijing, China
  • 4 Department of Statistics, Tianjin University of Finance and Economics Pearl River College, Tianjin, China
  • 5 State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials, Ministry of Education, and College of Life Science, Nankai University, Tianjin, China

Received: January 26, 2021       Accepted: May 17, 2021       Published: July 8, 2021

Copyright: © 2021 Bai et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICC) are the two most common pathological subtypes of primary liver cancer (PLC). Identifying DNA methylation biomarkers that diagnose PLC and then distinguish HCC from ICC is vital for selecting subsequent treatment. To obtain candidate diagnostic DNA methylation sites for PLC, we first screened differentially methylated CpG (DMC) sites by comparing methylation data between normal liver samples and PLC samples (ICC and HCC samples). A random forest algorithm was then used to select the DMC sites with the highest Gini values. To avoid overfitting, a separate cohort served as external validation for evaluating the area under the curve (AUC) of different combinations of DMC sites. A similar model construction strategy was applied to distinguish HCC from ICC. In addition, we identified DNA methylation-driven genes in HCC and ICC using the MethylMix method and performed pathway analysis with MetaCore. Finally, we defined methylator phenotypes based on independent prognostic sites and analyzed the correlations between methylator phenotype and clinical factors in HCC and ICC, respectively. To diagnose PLC, we developed a model based on three PLC-specific methylation sites (cg24035245, cg21072795, and cg00261162), whose sensitivity and specificity reached 98.8% and 94.8% in the training set and 97.3% and 81% in the validation set. Then, to further divide the PLC samples into HCC and ICC, we established another model based on three methylation sites (cg17769836, cg17591574, and cg07823562); HCC accuracy and ICC accuracy reached 95.8% and 89.8% in the training set and 96.8% and 85.4% in the validation set. In HCC, the enriched pathways were mainly related to protein folding, oxidative stress, and glutathione metabolism, whereas in ICC, immune response and embryonic hepatocyte maturation were the top pathways.
In both HCC and ICC, methylator phenotype correlated well with overall survival time and with clinical factors involved in tumor progression. In summary, our study provides methylation-site-based biomarkers not only for the diagnosis of PLC but also for distinguishing HCC from ICC.
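
The evaluation metrics named in the abstract (sensitivity, specificity, and AUC) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, and the labels and scores are toy values invented for illustration, not data from the study. The AUC is computed here with the rank-based (Mann-Whitney) formulation, which is one common way to obtain it.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = tumor, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores, invented for illustration only (not study data).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Hypothetical decision threshold of 0.5 turns scores into class calls.
preds = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(labels, preds)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(labels, scores):.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why an external cohort evaluated by AUC is a reasonable guard against overfitting when comparing combinations of sites.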


HCC: hepatocellular carcinoma; ICC: intrahepatic cholangiocarcinoma; PLC: primary liver cancer; DMC: differentially methylated CpG; AUC: area under the curve; H-ChC: hepatocellular-cholangiocarcinoma; CGI: CpG island; TSG: tumor suppressor gene; GDH: genome DNA hypomethylation; CIMP: CpG island methylator phenotype; GEO: Gene Expression Omnibus; TCGA: The Cancer Genome Atlas.