Research Paper Volume 13, Issue 7 pp 10208—10224

One potential biomarker for teratozoospermia identified by in-depth integrative analysis of multiple microarray data

Baoquan Han1,*, Lu Wang2,*, Shuai Yu1,*, Wei Ge2, Yaqi Li3, Hui Jiang4, Wei Shen1,2, Zhongyi Sun1

  • 1 Urology Department, Peking University Shenzhen Hospital, Shenzhen Peking University and The Hong Kong University of Science and Technology Medical Center, Shenzhen 518036, China
  • 2 College of Life Sciences, Institute of Reproductive Sciences, Qingdao Agricultural University, Qingdao 266109, China
  • 3 Urology Department, Zaozhuang Hospital of Zaozhuang Mining Group, Zaozhuang 277100, China
  • 4 Department of Urology, Department of Andrology, Department of Human Sperm Bank, Peking University Third Hospital, Beijing 100191, China
* Joint first authors

Received: November 17, 2020       Accepted: February 16, 2021       Published: March 26, 2021
Copyright: © 2021 Han et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Teratozoospermia is a common category of male infertility. With the number of clinical patients rising and assisted reproductive technology becoming more sophisticated, there is an urgent need for an accurate semen biomarker that allows rapid diagnosis of teratozoospermia and a reliable assessment of the success rate of assisted reproductive technologies. In this study, we performed differential gene expression analysis on two publicly available DNA microarray datasets (GSE6872 and GSE6967), followed by GSEA to identify their enriched KEGG pathways and WGCNA to obtain the most highly correlated modules. In-depth comparison of the modules selected from the two datasets yielded a set of genes with the same expression trend in both, and the differentially expressed genes in this set were then filtered using the corresponding criteria. Three differentially expressed genes common to both datasets were selected. We then validated the expression changes of these genes using another dataset (GSE6968) and in vitro experiments; only one gene showed an expression trend identical to that in the other datasets and was retained as a potential semen biomarker. This result may provide an important theoretical basis for the diagnosis and treatment of teratozoospermia.
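The cross-dataset step described above (keeping only genes that are differentially expressed in both datasets with the same direction of change) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the gene names, the fold-change and p-values, the function names, and the cutoffs (|log2FC| >= 1, adjusted p <= 0.05) are hypothetical and are not taken from the study, which only refers to "the corresponding criteria".

```python
from typing import Dict, Set, Tuple

# gene symbol -> (log2 fold change, adjusted p-value)
# All values below are invented for illustration only.
DegTable = Dict[str, Tuple[float, float]]


def significant(table: DegTable,
                min_abs_lfc: float = 1.0,
                max_padj: float = 0.05) -> Dict[str, int]:
    """Return gene -> direction (+1 up, -1 down) for genes passing both cutoffs.

    The default cutoffs are assumed values, not the ones used in the paper.
    """
    directions = {}
    for gene, (lfc, padj) in table.items():
        if abs(lfc) >= min_abs_lfc and padj <= max_padj:
            directions[gene] = 1 if lfc > 0 else -1
    return directions


def concordant_degs(a: DegTable, b: DegTable, **cutoffs) -> Set[str]:
    """Genes significant in both tables that change in the same direction."""
    sig_a = significant(a, **cutoffs)
    sig_b = significant(b, **cutoffs)
    return {g for g in sig_a.keys() & sig_b.keys() if sig_a[g] == sig_b[g]}


# Hypothetical differential-expression results for two datasets.
dataset_a = {
    "GENE_A": (-2.1, 0.001),
    "GENE_B": (1.5, 0.01),
    "GENE_C": (-0.4, 0.20),
    "GENE_D": (1.8, 0.003),
}
dataset_b = {
    "GENE_A": (-1.7, 0.004),
    "GENE_B": (-1.2, 0.02),   # significant, but opposite direction to dataset_a
    "GENE_C": (-1.9, 0.001),  # significant here, but not in dataset_a
    "GENE_D": (2.2, 0.0005),
}

common = concordant_degs(dataset_a, dataset_b)
print(sorted(common))  # ['GENE_A', 'GENE_D']
```

In this toy input, GENE_B is dropped because its direction differs between the two datasets and GENE_C is dropped because it passes the cutoffs in only one of them. A real analysis would take its per-gene statistics from a dedicated differential-expression tool rather than hand-written dictionaries.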


WGCNA: Weighted Gene Co-expression Network Analysis; DEGs: Differentially Expressed Genes; GSEA: Gene Set Enrichment Analysis; PPI: Protein-Protein Interaction; GEO: Gene Expression Omnibus; SRA: Sequence Read Archive; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; STRING: Search Tool for the Retrieval of Interacting Genes/Proteins.