Research Paper Volume 12, Issue 15 pp 15492—15503

Gene expression-based clinical predictions in lung adenocarcinoma

Yanlu Xiong1,*, Jie Lei1,*, Jinbo Zhao1,*, Yangbo Feng1, Tianyun Qiao1, Yongsheng Zhou1, Tao Jiang1,&, Yong Han1,2

  • 1 Department of Thoracic Surgery, Tangdu Hospital, Fourth Military Medical University, Xi'an, China
  • 2 Department of Thoracic Surgery, Air Force Medical Center, PLA, Beijing, China
* Equal contribution

Received: May 16, 2020       Accepted: July 6, 2020       Published: August 5, 2020

Copyright © 2020 Xiong et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Identifying disease-related genes is important for the management of lung adenocarcinoma (LUAD), but genetic complexity and tumor heterogeneity make this difficult. Advances in bioinformatics over the past decades offer new ways to address the problem. In this study, we investigated the relationships between gene expression and clinical features of LUAD through integrative bioinformatic analysis. First, we used the limma and DESeq2 packages to identify differentially expressed genes (DEGs) in LUAD from the GEO database and the TCGA project (tumor tissues versus normal tissues), obtaining 180 down-regulated and 52 up-regulated DEGs. Next, we characterized the genomic distribution and biological functions of these DEGs using Bioconductor packages and the STRING database. The DEGs were dispersed across chromosomes, markedly enriched in extracellular matrix-related processes, and hierarchically weighted in the interaction network. Finally, we built DEG-based statistical models to evaluate the TNM stage and survival status of LUAD: logistic regression models for the TNM parameters and Cox regression models for survival probability. All of these models showed acceptable predictive performance (C-indexes: T, 0.740; N, 0.687; M, 0.823; overall survival, 0.678; progression-free survival, 0.611). In summary, we established gene expression-based models for assessing clinical characteristics of LUAD, which can assist investigation of its pathogenesis and clinical intervention.
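
For readers unfamiliar with the C-index reported above, the following is a minimal Python sketch of Harrell's concordance index for right-censored survival data. It is an illustration only, not the authors' code: the function name `concordance_index`, its arguments, and the toy data are all hypothetical, and tied survival times are simply skipped for brevity. For a binary outcome, as in the logistic models for T, N, and M, the C-index is equivalent to the area under the ROC curve.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch, hypothetical helper).

    times       -- observed follow-up time for each subject
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk matched the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): four subjects, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))  # prints 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported values between 0.611 and 0.823.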


LUAD: lung adenocarcinoma; DEGs: differentially expressed genes; GEO: Gene Expression Omnibus; TCGA: The Cancer Genome Atlas; GO: gene ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; K-M: Kaplan-Meier; AIC: Akaike information criterion; AUC: area under the curve; OS: overall survival; PFS: progression-free survival; coef: coefficient; CI: confidence interval; ROC: receiver operating characteristic; Corr: correlation coefficient.