Abstract

Non-obstructive azoospermia (NOA) is a severe form of male infertility, but its pathological mechanisms and diagnostic biomarkers remain obscure. Because dysregulation of RNA-binding proteins (RBPs) has non-negligible effects on spermatogenesis, we aimed to investigate the functions and diagnostic value of RBPs in NOA. A total of 58 testicular samples (control = 11, NOA = 47) from the Gene Expression Omnibus (GEO) were used as the training cohort. Three public datasets, namely GSE45885 (control = 4, NOA = 27), GSE45887 (control = 4, NOA = 16), and GSE145467 (control = 10, NOA = 10), together with 44 clinical samples from the local hospital (control = 27, NOA = 17), were used for validation. Through a series of bioinformatic analyses and machine learning algorithms, including genomic difference detection, protein-protein interaction network analysis, LASSO, SVM-RFE, and Boruta, DDX20 and NCBP2 were identified as significant predictors of NOA. Single-cell RNA sequencing of 432 testicular cell samples from NOA patients indicated that DDX20 and NCBP2 were associated with spermatogenesis (false discovery rate < 0.05). Based on the transcriptomic expression of DDX20 and NCBP2, we constructed multiple diagnostic models using logistic regression, random forest, and an artificial neural network (ANN). The ANN model exhibited the most reliable predictive performance in the training cohort (AUC = 0.840), GSE45885 (AUC = 0.731), GSE45887 (AUC = 0.781), GSE145467 (AUC = 0.850), and the local cohort (AUC = 0.623). In summary, an ANN diagnostic model based on the RBPs DDX20 and NCBP2 was developed and externally validated in NOA, and may serve as a promising tool in clinical practice.
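
An AUC can be read as the probability that a randomly chosen NOA sample receives a higher model score than a randomly chosen control, with ties counted as one half. The following is a minimal sketch of that computation. The function name `auc_from_scores` and the scores are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
def auc_from_scores(labels, scores):
    """Compute ROC AUC by pairwise comparison (Mann-Whitney formulation).

    labels: 1 for a case (e.g. NOA), 0 for a control.
    scores: model outputs; a higher score means "more likely a case".
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical scores for illustration only (three cases, two controls).
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2]
example_auc = auc_from_scores(labels, scores)
```

In this toy example, five of the six case-control pairs are ranked correctly, so the AUC is 5/6. With small control groups, such as the four controls in GSE45885 and GSE45887, each pair carries considerable weight, which is worth keeping in mind when comparing AUC values across cohorts.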