Research Paper Volume 12, Issue 4 pp 3747–3770

Development of a four-gene prognostic model for pancreatic cancer based on transcriptome dysregulation

Jie Yan1, Liangcai Wu2, Congwei Jia1, Shuangni Yu1, Zhaohui Lu1, Yueping Sun3, Jie Chen1

  • 1 Department of Pathology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
  • 2 Department of Obstetrics and Gynecology, Obstetrics and Gynecology Hospital of Fudan University, Shanghai 200011, China
  • 3 Institute of Medical Information, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100020, China

Received: November 2, 2019       Accepted: February 4, 2020       Published: February 20, 2020

Copyright © 2020 Yan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

We systematically developed a prognostic model for pancreatic cancer that was compatible across different transcriptomic platforms and patient cohorts. After performing quality control measures, we used seven microarray datasets and two RNA sequencing datasets to identify consistently dysregulated genes in pancreatic cancer patients. Weighted gene co-expression network analysis was performed to explore the associations between gene expression patterns and clinical features. The least absolute shrinkage and selection operator (LASSO) and Cox regression were used to construct a prognostic model. We tested the predictive power of the model by determining the area under the curve of the risk score for time-dependent survival. Most of the differentially expressed genes in pancreatic cancer were enriched in functions pertaining to the tumor immune microenvironment. The transcriptome profiles were found to be associated with overall survival, and four genes were identified as independent prognostic factors. A prognostic risk score was then proposed, which displayed moderate accuracy in the training and self-validation cohorts. Furthermore, patients in two independent microarray cohorts were successfully stratified into high- and low-risk prognostic groups. Thus, we constructed a reliable prognostic model for pancreatic cancer, which should be beneficial for clinical therapeutic decision-making.

Abbreviations

WGCNA: Weighted gene co-expression network analysis; LASSO: Least absolute shrinkage and selection operator; ROC curve: Receiver operating characteristic curve; AUC: Area under the ROC curve; DEGs: Differentially expressed genes; cTNM: Clinical tumor-node-metastasis staging; pTNM: Pathological tumor-node-metastasis staging; GEO: Gene Expression Omnibus; GO: Gene Ontology; BP: Biological process; CC: Cellular component; MF: Molecular function; TCGA: The Cancer Genome Atlas; GTEx: Genotype-tissue expression; TFs: Transcription factors; HR: Hazard ratio; CI: Confidence interval; FC: Fold change; DSG: Desmoglein; TRIPOD: Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis; TOM: Topological overlap matrix; PDAC: Pancreatic ductal adenocarcinoma.