Research Paper Volume 16, Issue 7 pp 6455–6477

Machine learning for identifying tumor stemness genes and developing prognostic model in gastric cancer

Guo-Xing Li1,*, Yun-Peng Chen2,*, You-Yang Hu2,*, Wen-Jing Zhao1, Yun-Yan Lu1, Fu-Jian Wan3, Zhi-Jun Wu4, Xiang-Qian Wang1, Qi-Ying Yu1

  • 1 Department of Oncology and Central Laboratory, Tumor Hospital Affiliated to Nantong University, Nantong, Jiangsu 226361, P.R. China
  • 2 Department of Oncology, The Affiliated Hospital of Nantong University, Nantong, Jiangsu 226361, P.R. China
  • 3 Institute of Biology and Medicine, College of Life and Health Sciences, Wuhan University of Science and Technology, Wuhan, Hubei 430081, P.R. China
  • 4 Department of Oncology, Nantong Hospital of Traditional Chinese Medicine, Nantong, Jiangsu 226361, P.R. China
* Equal contribution

Received: October 31, 2023       Accepted: March 13, 2024       Published: April 12, 2024

Copyright: © 2024 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Gastric cancer is a debilitating disease with an often dire prognosis. Emerging evidence indicates that tumor stem cells contribute to treatment resistance and disease recurrence in gastric cancer, which makes identifying the genes that drive tumor stemness important. Using a combination of ssGSEA, WGCNA, and several machine learning algorithms, this study identified tumor stemness key genes (TSKGs). These genes were then used to construct a prognostic model, termed the Tumor Stemness Risk Genes Prognostic Model (TSRGPM). PCA, Cox regression analysis, and ROC curve analysis showed that the Tumor Stemness Risk Score (TSRS) stratified patients by risk and served as an independent prognostic indicator. Notably, the TSRS correlated significantly with lymph node metastasis in gastric cancer. Furthermore, immune infiltration analysis with algorithms such as CIBERSORT revealed an association between the TSRS and monocytes and other cell types. Further examination of the tumor stemness risk genes (TSRGs) led to the selection of CDC25A for detailed investigation. Bioinformatics analyses implicated CDC25A in driving the malignant phenotype of tumors, with an impact on cell proliferation and DNA replication in gastric cancer. In vitro experiments corroborated the bioinformatics findings, supporting a role for CDC25A expression in modulating tumor stemness in gastric cancer. In summary, the established and validated TSRGPM holds promise for prognostication and for identifying potential therapeutic targets, a step towards personalized management of this malignancy.
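The abstract does not state how the TSRS is computed. As an illustration only, the sketch below assumes the construction commonly used for gene-signature prognostic models: a weighted sum of expression values with Cox regression coefficients as weights, followed by a median split into high-risk and low-risk groups. The coefficients, the gene names other than CDC25A, and the patient values are all hypothetical placeholders, not results from this study.

```python
from statistics import median

# Hypothetical Cox coefficients; the study's actual genes and weights
# are not given in the abstract.
COEFFICIENTS = {"CDC25A": 0.40, "GENE_B": -0.25, "GENE_C": 0.10}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Weighted sum of expression values over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(patients, coefficients=COEFFICIENTS):
    """Return {patient_id: (score, group)} using a median split (an assumption)."""
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {
        pid: (score, "high" if score > cutoff else "low")
        for pid, score in scores.items()
    }


# Made-up expression values for four illustrative patients.
patients = {
    "P1": {"CDC25A": 2.0, "GENE_B": 1.0, "GENE_C": 0.5},
    "P2": {"CDC25A": 0.5, "GENE_B": 2.0, "GENE_C": 1.0},
    "P3": {"CDC25A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0},
    "P4": {"CDC25A": 3.0, "GENE_B": 0.5, "GENE_C": 2.0},
}

groups = stratify(patients)
for pid, (score, group) in sorted(groups.items()):
    print(f"{pid}: score={score:.3f} group={group}")
```

In a real analysis the coefficients would come from a fitted Cox model, and the resulting groups would be compared with survival curves and ROC analysis, as the abstract describes.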


OS: overall survival; DEG: differentially expressed gene; GEO: Gene Expression Omnibus; CDF: cumulative distribution function; AUC: area under the curve; PCA: principal component analysis; tSNE: t-distributed stochastic neighbor embedding; TME: tumor microenvironment; GS: gene significance; MM: module membership; CAF: cancer-associated fibroblast; TAM: tumor-associated macrophage; MSC: mesenchymal stem cell.