Research Paper Volume 12, Issue 22 pp 22457–22494
Generalized correlation coefficient for genome-wide association analysis of cognitive ability in twins
- 1 Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, Odense, Denmark
- 2 Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao, China
- 3 Qingdao Center for Disease Control and Prevention, Qingdao, China
- 4 Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen Ø, Denmark
- 5 Computational Biomedicine, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- 6 Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- 7 Unit of Human Genetics, Department of Clinical Research, University of Southern Denmark, Odense, Denmark
Received: July 20, 2020 Accepted: September 29, 2020 Published: November 24, 2020 https://doi.org/10.18632/aging.104198
Copyright: © 2020 Mohammadnejad et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Despite the strong genetic background of cognitive function, genome-wide association studies (GWASs) have identified only a limited number of associated single nucleotide polymorphisms (SNPs). We hypothesize that this is partly due to model misspecification, concerning both the phenotype distribution and the relationship between SNP dosage and the level of the phenotype. To overcome these issues, we applied an assumption-free method based on the generalized correlation coefficient (GCC) in a GWAS of cognitive function in Danish and Chinese twins and compared its performance with that of traditional linear models. The GCC-based GWAS identified two significant SNPs in the Danish samples (rs71419535, p = 1.47e-08; rs905838, p = 1.69e-08) and two significant SNPs in the Chinese samples (rs2292999, p = 9.27e-10; rs17019635, p = 2.50e-09). In contrast, linear models failed to detect any genome-wide significant SNPs. More top significant genes overlapped between the two samples in the GCC-based GWAS than in the linear-model analyses. The GCC model thus identified significant genetic variants missed by conventional linear models, with more replicated genes and biological pathways related to cognitive function. Moreover, the GCC-based GWAS was robust in handling correlated samples such as twin pairs. GCC is a useful statistical method for GWAS that complements traditional linear models by capturing genetic effects beyond the additive assumption.
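The point about effects beyond the additive assumption can be illustrated with a small simulation. The sketch below is not the GCC method used in the paper; as a stand-in it uses the correlation ratio (eta squared), which makes no assumption about how the phenotype changes with SNP dosage. The simulated data, the heterozygote-only effect, the allele frequency of 0.5 and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def pearson_r2(x, y):
    """Squared Pearson correlation: variance explained by an additive (linear) fit."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def correlation_ratio(groups, y):
    """Eta squared: variance explained by genotype group means, with no
    assumption that the phenotype changes linearly with dosage."""
    overall = statistics.fmean(y)
    by_group = {}
    for g, v in zip(groups, y):
        by_group.setdefault(g, []).append(v)
    between = sum(
        len(vals) * (statistics.fmean(vals) - overall) ** 2
        for vals in by_group.values()
    )
    total = sum((v - overall) ** 2 for v in y)
    return between / total


def simulate(n=3000, seed=1):
    """Hypothetical data: dosage 0/1/2 at allele frequency 0.5, with a
    phenotype shift in heterozygotes only (a purely non-additive effect)."""
    rng = random.Random(seed)
    dosage = [(rng.random() < 0.5) + (rng.random() < 0.5) for _ in range(n)]
    phenotype = [(1.0 if d == 1 else 0.0) + rng.gauss(0, 1) for d in dosage]
    return dosage, phenotype


dosage, phenotype = simulate()
r2_additive = pearson_r2(dosage, phenotype)
eta2_general = correlation_ratio(dosage, phenotype)
print(f"additive r^2 = {r2_additive:.4f}, genotype-general eta^2 = {eta2_general:.4f}")
```

In this setup the additive fit explains almost none of the phenotypic variance, while the genotype-general measure recovers the effect. This is the kind of signal a dosage-linear model can miss.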
SNPs: single nucleotide polymorphisms; GWASs: genome-wide association studies; GCC: generalized correlation coefficient; MIC: maximal information coefficient; MINE: maximal information-based nonparametric exploration; LD: linkage disequilibrium; AD: Alzheimer’s disease; FDR: false discovery rate; DZ: dizygotic; MZ: monozygotic; MADT: Middle-Aged Danish Twins; MoCA: Montreal Cognitive Assessment; MAF: minor allele frequency; HWE: Hardy-Weinberg equilibrium; LME: mixed-linear model; GSEA: gene-set enrichment analysis.