Research Paper Advance Articles

DNAmFitAge: biological age indicator incorporating physical fitness

Kristen M. McGreevy1, *, , Zsolt Radak2, *, , Ferenc Torma2, , Matyas Jokai2, , Ake T. Lu3, , Daniel W. Belsky4, , Alexandra Binder5, , Riccardo E. Marioni6, , Luigi Ferrucci7, , Ewelina Pośpiech8,9, , Wojciech Branicki10, , Andrzej Ossowski9, , Aneta Sitek11, , Magdalena Spólnicka12, , Laura M. Raffield13, , Alex P. Reiner14, , Simon Cox15, , Michael Kobor16, , David L. Corcoran17, , Steve Horvath1,3, ,

  • 1 Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA 90095, USA
  • 2 Research Institute of Sport Science, University of Physical Education, Budapest, Hungary
  • 3 San Diego Institute of Science, Altos Labs, San Diego, CA 92121, USA
  • 4 Department of Epidemiology and Butler Columbia Aging Center, Columbia University Mailman School of Public Health, New York, NY 10032, USA
  • 5 Department of Cancer Epidemiology, University of Hawaii, Honolulu, HI 96813, USA
  • 6 Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
  • 7 Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA
  • 8 Małopolska Centre of Biotechnology, Jagiellonian University, Kraków, Poland
  • 9 Department of Forensic Genetics, Pomeranian Medical University in Szczecin, Szczecin, Poland
  • 10 Institute of Zoology and Biomedical Research, Faculty of Biology, Jagiellonian University, Kraków, Poland
  • 11 Department of Anthropology, Faculty of Biology and Environmental Protection, University of Łódź, Łódź, Poland
  • 12 Central Forensic Laboratory of the Police, Warsaw, Poland
  • 13 Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  • 14 Department of Epidemiology, University of Washington, Seattle, WA 98195, USA
  • 15 Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh EH8 9JZ, UK
  • 16 Department of Medical Genetics, University of British Columbia, Vancouver, Canada
  • 17 Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
* Co-first authors

Received: July 12, 2022       Accepted: January 23, 2023       Published: February 22, 2023
How to Cite

Copyright: © 2023 McGreevy et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Physical fitness is a well-known correlate of health and the aging process and DNA methylation (DNAm) data can capture aging via epigenetic clocks. However, current epigenetic clocks did not yet use measures of mobility, strength, lung, or endurance fitness in their construction. We develop blood-based DNAm biomarkers for fitness parameters gait speed (walking speed), maximum handgrip strength, forced expiratory volume in one second (FEV1), and maximal oxygen uptake (VO2max) which have modest correlation with fitness parameters in five large-scale validation datasets (average r between 0.16–0.48). We then use these DNAm fitness parameter biomarkers with DNAmGrimAge, a DNAm mortality risk estimate, to construct DNAmFitAge, a new biological age indicator that incorporates physical fitness. DNAmFitAge is associated with low-intermediate physical activity levels across validation datasets (p = 6.4E-13), and younger/fitter DNAmFitAge corresponds to stronger DNAm fitness parameters in both males and females. DNAmFitAge is lower (p = 0.046) and DNAmVO2max is higher (p = 0.023) in male body builders compared to controls. Physically fit people have a younger DNAmFitAge and experience better age-related outcomes: lower mortality risk (p = 7.2E-51), coronary heart disease risk (p = 2.6E-8), and increased disease-free status (p = 1.1E-7). These new DNAm biomarkers provide researchers a new method to incorporate physical fitness into epigenetic clocks.


Physical fitness declines with aging and is well known to correlate to health [1]. This decline is evident in reduced function in specific organs, like lungs [2], and in performance tests of strength [3] or aerobic capacity [4]. The rate of this decline varies between individuals [5, 6], and those who preserve physical fitness as they age are at lower risk for a range of diseases and tend to live longer lives [68]. At the molecular level, changes in fitness and related indices of functional capacity correlate with changes in molecular signs of decline thought to reflect underlying biological processes of aging [9]. Measures of fitness may therefore provide a new window into biological aging [10]. However, direct measurement of fitness parameters can be challenging, requiring in-person data collection by trained personnel with specialized equipment [11]. Furthermore, fitness measurements are not possible for studies with remote data collection or those conducted with stored biospecimens. To enable such studies to quantify fitness, we developed blood based DNAm biomarkers of fitness parameters spanning mobility, strength, lung function, and cardiovascular fitness and use these to construct a novel indicator of fitness-based biological age, DNAmFitAge.

Three lines of evidence support a focus on DNAm to develop biomarkers of fitness and aging-related changes in fitness. First, aging is reflected in DNAm changes; hundreds of thousands of CpG sites across the genome change methylation states as organisms grow older, enabling construction of high-precision algorithms to predict age [12, 13]. These are collectively known as epigenetic clocks, and a large body of literature demonstrates these clocks are associated with human mortality risk [14, 15], various age-related conditions [1517], and are reflective of one’s biological age [14, 17]. Second, prediction of aging-related morbidity, disability, and mortality by DNAm biomarkers is enhanced by the incorporation of physiological data, like smoking pack years and white blood cell counts, into prediction algorithms [14, 15, 18]. This suggests utility in including physical fitness in DNAm biomarkers, however, current DNAm biomarkers do not use fitness parameters in their construction. Third, there is emerging evidence that epigenetic clocks are sensitive to lifestyle factors [19], individual differences in fitness parameters are reflected in DNAm data [20, 21], and blood DNAm differs between athletes and controls [22]. Therefore, a growing body of evidence suggests blood DNAm carries information related to physical fitness, but it was unknown if fitness parameters could be estimated using blood DNAm levels.

Here, we develop blood DNAm biomarkers of four fitness parameters: gait speed (walking speed), maximum handgrip strength, forced expiratory volume in 1 second (FEV1; an index of lung function), and maximal oxygen uptake (VO2max; a measure of cardiorespiratory fitness). These parameters were chosen because handgrip strength and VO2max provide insight into the two main categories of fitness: strength and endurance [23], and gait speed and FEV1 provide insight into fitness-related organ function: mobility and lung function [8, 24]. Furthermore, each parameter is commonly measured and known to be associated with aging, mortality, and disease [8, 2426]. We then use these DNAm fitness biomarkers to develop the novel DNAm fitness-related biological age indicator, DNAmFitAge, which quantifies the relationship between physical fitness and biological aging processes. This novel measure incorporates mortality risk with strength, mobility, and cardiovascular fitness using blood DNAm biomarkers. Our newly constructed DNAm biomarkers and DNAmFitAge provide researchers and physicians a new method to incorporate physical fitness into epigenetic clocks and emphasizes the effect lifestyle has on the aging methylome.


DNAm fitness parameter biomarker models

The DNAm fitness parameter biomarkers built with blood DNA methylation had modest correlation with direct fitness parameters. Average correlations across validation datasets ranged from 0.16–0.48 (Figure 1, Table 1). DNAmGripmax in males and females had moderate correlations in validation datasets but do not perform well in CALERIE. We hypothesize this may be due in part from the stringent enrollment criteria: free of chronic disease, non-obese, and relatively young, which yields less variation in fitness measures. Correlation of DNAmVO2max to FEV in LBC1921 and LBC1936 was weak within each sex, likely caused by the small age range in each cohort; however varying correlations between FEV and VO2max have also been described in literature [2729]. Reported correlations between VO2max and FEV vary from 0 to 0.5, likely because VO2max is a measure of cardiovascular health whereas FEV is a measure of lung volume. Correlation of DNAmVO2max to VO2max in CALERIE, the one validation dataset with the same direct fitness parameter, has good correlation overall and within sex (overall R = 0.55, female R = 0.19, male R = 0.47).

Scatterplots of DNAm fitness biomarker models versus true values in test datasets. Pink indicates females, and blue indicates males. When original variables were unavailable, best alternative variables are plotted against the DNAm fitness estimates. Each panel corresponds to the performance of one DNAm-based model built with chronological age across test datasets displayed with Pearson correlation and p-values. (A) DNAmGaitspeed with performance in InChianti dataset displayed, (B) DNAmGripmax with performance in WHI dataset, (C) DNAmFEV1, (D) DNAmVO2max. (A–C) (DNAmGaitspeed, DNAmGripmax, and DNAmFEV1) were built in each sex separately while (D) (DNAmVO2max) was built in both sexes jointly.

Figure 1. Scatterplots of DNAm fitness biomarker models versus true values in test datasets. Pink indicates females, and blue indicates males. When original variables were unavailable, best alternative variables are plotted against the DNAm fitness estimates. Each panel corresponds to the performance of one DNAm-based model built with chronological age across test datasets displayed with Pearson correlation and p-values. (A) DNAmGaitspeed with performance in InChianti dataset displayed, (B) DNAmGripmax with performance in WHI dataset, (C) DNAmFEV1, (D) DNAmVO2max. (AC) (DNAmGaitspeed, DNAmGripmax, and DNAmFEV1) were built in each sex separately while (D) (DNAmVO2max) was built in both sexes jointly.

Table 1. DNAm fitness parameter biomarker Pearson correlation.

DNAm biomarkerCpGAge in modelSexFHS + BLSABudapestLBC 1921LBC 1936CALERIEInChiantiWHIAverage test R
Gait speed42YFemales0.610.610.370.340.11*0.49+0.15+0.34
Superscripts indicate correlation is with the closest fitness parameter available: 'Jumpmax, *Composite Leg Strength, +Physical Functioning, ^FEV, $FEV1, vVO2max. Gray italics denote training dataset.

The DNAm biomarkers improve estimation of fitness parameters beyond what is explained through age and sex in many validation datasets (Supplementary Table 1). Table 1 shows between 26 and 93 CpG loci were selected through LASSO to estimate each fitness parameter. Without age as a covariate in the DNAm biomarker estimates, more CpG loci were needed to achieve similar precision- between 53 and 93. DNAmVO2max model includes several CpG loci on the X chromosome, likely capturing sex effects. Interestingly, DNAmGaitspeed and DNAmGripmax built without chronological age have lower correlation with true fitness parameters compared to the models built with chronological age, however these biomarkers explain more additional variation in fitness parameters compared to the age-included versions. This suggests the DNAm biomarkers capture different information than age and sex for understanding fitness parameters. R code to calculate DNAm fitness biomarkers is available in our GitHub repository at

DNAm fitness biomarkers in age-related conditions

All DNAm fitness biomarkers are individually predictive of mortality and disease-free status, and some are predictive of type 2 diabetes status and number of comorbidities in the validation datasets. After controlling for age and sex, higher (or more fit) values of DNAmGaitspeed without age (p = 1.1E-10), DNAmGripmax without age (p = 2.6E-9), DNAmFEV1 (p = 2.2E-20), and DNAmVO2max (p = 0.003) are associated with decreasing mortality risk (Supplementary Figure 1). For example, on average, every 1 kg stronger DNAmGripmax is has an associated 5% decrease in mortality risk compared to a person of the same age and sex (hazard ratio = 0.95, confidence interval = [0.93, 0.96]). DNAmGaitspeed and DNAmFEV1 are both predictive of type 2 diabetes status (p = 0.0013, p = 0.0032) and number of comorbidities (p = 0.0004, p = 4E-12). Stronger values of any DNAm fitness biomarkers are associated with disease-free status. Relationship of each DNAm fitness biomarker with time-to-death, type 2 diabetes, number of comorbidities, and disease-free status after adjusting for age and sex are displayed in Supplementary Figure 1. Relationship of DNAm biomarkers to physical activity are explored alongside DNAmFitAge below.


DNAmFitAge provides an estimate of biological age, and FitAgeAcceleration is a measure of epigenetic age acceleration. DNAmFitAge had strong correlation to chronological age in validation datasets. The average Pearson r between DNAmFitAge and chronological age across validation datasets was 0.77 (Figure 2), and the lower correlation in LBC1921 (r = 0.38) and LBC1936 (r = 0.68) can be attributed to the small age range they cover. LBC1921 ages ranged from 77 to 90 and LBC1936 ages ranged from 67 to 80. The average r excluding LBC cohorts was 0.92. The DNAm fitness biomarkers contribute 319 unique CpG loci to construct DNAmFitAge, and the contribution among DNAm fitness biomarkers were very similar in males (13.9–17.9%) with slightly more variation in females (10.4–22.4%). DNAmGripmax, DNAmVO2max, and DNAmGaitspeed contributed around 50% to estimating DNAmFitAge in each sex, and DNAmGrimAge contributed the remaining 50% (Table 2A). In addition, each validation dataset had low median absolute deviation (median of the absolute difference from chronological age to biological age) ranging from 2.3 to 4.9 years (Supplementary Table 2). Reproducibility across a wide span of ages (21 in CALERIE to 100 in InChianti) demonstrate DNAmFitAge’s calibration across a wide adult age range.

Scatterplots of DNAmFitAge versus age separated by sex. Pink indicates females, and blue indicates males. (A–F) Each panel corresponds to the performance of DNAmFitAge in one validation dataset displayed with Pearson correlation to chronological age and corresponding p-values. DNAmFitAge models applied to the same sex it was built in (i.e., DNAmFitAge built for females tested in females and DNAmFitAge built for males tested in males). DNAmFitAge is centered on chronological age with high correlation across all test sets.

Figure 2. Scatterplots of DNAmFitAge versus age separated by sex. Pink indicates females, and blue indicates males. (AF) Each panel corresponds to the performance of DNAmFitAge in one validation dataset displayed with Pearson correlation to chronological age and corresponding p-values. DNAmFitAge models applied to the same sex it was built in (i.e., DNAmFitAge built for females tested in females and DNAmFitAge built for males tested in males). DNAmFitAge is centered on chronological age with high correlation across all test sets.

Table 2A. DNAmFitAge model weights.

VariableFemale weightsMale weights

Applying each DNAmFitAge model to the opposite sex shows strong correlation with age but with substantial over and underestimation of age in females and males, respectively (Supplementary Table 2, panels B–H Supplementary Figure 2). Over and under estimation are explained by universal differences in male and female fitness parameters. Females tend to have lower fitness parameters compared to males. Hence males were predicted to be younger than they are using the female DNAmFitAge model because larger values of DNAmGaitSpeed, DNAmGripmax, or DNAmVO2max indicates stronger (or more physically fit) females.

Reference DNAm fitness biomarker values corresponding to fit DNAmFitAge compared to unfit DNAmFitAge within age and sex categories are provided in Table 2B. “Fit” corresponds to biological age being 5 years younger than expected (FitAgeAcceleration ≤ −5), and “unfit” corresponds to biological age being 5 years older than expected. Average differences (Fit – Unfit) across all age and sex categories are 0.2 m/s faster DNAmGaitSpeed, 5.1 kg stronger DNAmGripmax, and 2.0 mL/kg/min better DNAmVO2max. Overall, higher or more physically fit values of DNAmGaitspeed, DNAmGripmax, or DNAmVO2max correspond to younger estimated biological ages in males and females.

Table 2B. Reference DNAm fitness parameter values for fit (FitAge acceleration <= −5 yrs) and unfit (FitAge Acceleration >= +5 yrs) individuals.


FitAgeAcceleration in age-related conditions

We find that the age-adjusted version of FitAge, FitAgeAcceleration, is a significant predictor of mortality risk (all cause mortality), coronary heart disease, and other age-related conditions. Cox Proportional Hazard models demonstrated FitAgeAcceleration is a strong predictor for time-to-death (p = 7.2E-51) and time-to-coronary heart disease (p = 2.6E-8). FitAgeAcceleration had an overall hazard ratio of 1.07 (1.06, 1.08) (Figure 3). Thus, a FitAgeAcceleration value of 10 years was associated with almost doubling the mortality risk compared to the average person of the same age and sex (1.0710 = 1.97 risk). Similarly, increase in FitAgeAcceleration corresponds to more comorbidities (p = 9.0E-9), hypertension (p = 8.7E-5), and earlier age at menopause (p = 6.6E-9) (Figure 3, Supplementary Table 3). A lower FitAgeAcceleration was associated with disease free status (p = 1.1E-7) and lower cholesterol (p = 0.0005) (Supplementary Table 3).

Meta-analysis forest plots for FitAgeAcceleration to age-related conditions adjusted for age and sex. Each panel reports a meta analysis forest plot for combining hazard ratios or regression coefficients across dataset cohorts. (A) Time-to-death with number of events, (B) time-to-coronary heart disease with number of events, (C) type 2 diabetes, (D) comorbidity count, and (E) disease free status. Meta-analysis p-values are displayed in the header of each panel, and test of heterogeneity Cochran Q test p-value (Het. P) are displayed for fixed effect models. Fixed effects models were used for (A–C) and Stouffer’s method was used for (D, E).

Figure 3. Meta-analysis forest plots for FitAgeAcceleration to age-related conditions adjusted for age and sex. Each panel reports a meta analysis forest plot for combining hazard ratios or regression coefficients across dataset cohorts. (A) Time-to-death with number of events, (B) time-to-coronary heart disease with number of events, (C) type 2 diabetes, (D) comorbidity count, and (E) disease free status. Meta-analysis p-values are displayed in the header of each panel, and test of heterogeneity Cochran Q test p-value (Het. P) are displayed for fixed effect models. Fixed effects models were used for (AC) and Stouffer’s method was used for (D, E).

Each of these associations were in the expected direction, as someone who had a low FitAgeAcceleration had a biological age estimate that was younger than their chronological age. Hence, people whose DNAm predicted them to be more ‘physically fit’ than their chronological age would suggest had better age-related outcomes. These relationships demonstrate epigenetic age acceleration can be well explained through DNAm fitness parameter biomarkers, and that FitAgeAcceleration provides a practical tool for relating fitness to the aging process.

FitAgeAcceleration is additionally informative for mortality risk beyond the information captured with AgeAccelGrim in JHS females and in InChianti males and females when comparing LRT p-values (Supplementary Table 4). FitAge Acceleration is almost always additionally informative for time-to-death compared to other epigenetic clocks, but FitAgeAcceleration is only sometimes informative beyond the epigenetic clocks for explaining number of comorbidities. Overall, our results indicate FitAgeAcceleration is informative for mortality risk and may act as a supplement (not replacement) to AgeAccelGrim.

DNAmFitAge relationship to physical activity

FitAgeAcceleration, DNAmGaitspeed, DNAmGripmax, and DNAmFEV1 have associations in the expected direction with physical activity in low to intermediate physically active individuals. Coefficients indicate the effect on physical activity for a one unit increase in each DNAm fitness biomarker after adjusting for chronological age within each sex (Table 3, Figure 4). The relationship to DNAmFitAge is as expected; someone with a higher FitAgeAcceleration has an estimated biological age that is older than expected, which corresponds to lower physical activity or physical functioning (Table 2B). Similarly, men and women with a faster DNAmGaitspeed, stronger DNAmGripmax, and larger DNAmFEV1 are more physically active when holding age constant. In conclusion, men and women who were more active showed correspondingly ‘fitter’ values of FitAgeAcceleration and the DNAm fitness biomarkers. Research suggests any exercise compared to none is beneficial to health [30], and we hope DNAmFitAge may serve as a tool to motivate starting an exercise regimen at any level.

Table 3. Association of DNAm biomarkers to physical activity and physical functioning in people with low to intermediate activity levels.

OutcomeFemalesMalesMeta analysis p-value
LBC 1921LBC 1936InChiantiJHSWHILBC 1921LBC 1936InChiantiJHS
DNAmGaitSpeed w/ Agecoefficient−0.512.828.763.0726.67−3.311.994.971.781.82E-03
DNAmGaitSpeed w/o Agecoefficient0.870.901.320.408.77−0.100.992.361.491.60E-06
DNAmGripmax w/Agecoefficient0.10−0.060.14-
DNAmGripmax w/o Agecoefficient0.0760.0350.100.0020.30−
Meta-analysis forest plots for DNAmFitAge and DNAm fitness parameters relationship to physical activity or physical functioning in people with low to intermediate physical activity. Each panel reports the Stouffer’s meta-analysis p-value for combining coefficients across dataset cohorts after adjusting for chronological age. (A) DNAmFitAge, (B) DNAmGaitspeed, (C) DNAmGripmax, (D) DNAmFEV1, and (E) DNAmVO2max. DNAmFitAge, DNAmGaitSpeed, DNAmGripmax, and DNAmFEV1 are predictive of physical activity in low to intermediate physically active individuals.

Figure 4. Meta-analysis forest plots for DNAmFitAge and DNAm fitness parameters relationship to physical activity or physical functioning in people with low to intermediate physical activity. Each panel reports the Stouffer’s meta-analysis p-value for combining coefficients across dataset cohorts after adjusting for chronological age. (A) DNAmFitAge, (B) DNAmGaitspeed, (C) DNAmGripmax, (D) DNAmFEV1, and (E) DNAmVO2max. DNAmFitAge, DNAmGaitSpeed, DNAmGripmax, and DNAmFEV1 are predictive of physical activity in low to intermediate physically active individuals.

Additionally, DNAmFitAge (Stouffer p-value = 6.4E-13) marginally outperforms current DNAm biomarkers when comparing meta-analysis p-values; improvement of DNAmFitAge compared to DNAmGrimAge (p-value = 1.2E-12) is marginal, however the improvement compared to DNAmPhenoAge (p-value = 1.3E-6) and DNAmGDF-15 (p-value = 6.2E-8) is more pronounced. In addition, DNAmFitAge, which provides an indicator of biological age, may provide a more interpretable aging biomarker compared to DNAmGrimAge, which provides a measurement of lifespan. These comparisons demonstrate DNAmFitAge can capture the relationship to physical activity and can provide an improvement to the arsenal of current DNAm biomarkers.

DNAmFitAge relationship in body builders

Male body builders are estimated to be biologically younger and more physically fit compared to male controls of the same age. On average, DNAmFitAge is 2.74 years younger in male body builders compared to controls (p = 0.041), and DNAmVO2max is 0.4 mL/kg/sec better in male body builders (p = 0.023) (Table 4). FitAge Acceleration (p = 0.080), DNAmGaitspeed (p = 0.055), and DNAmGripmax (p = 0.075) are suggestive of having improvement in male body builders, however they were not significant at the 0.05 level. Boxplots displaying the spread of DNAmFitAge, DNAmVO2max, FitAge Acceleration, and DNAmGaitspeed between body builders and controls are presented in Figure 5. Male body builders have 5.4 more years of regular training (p = 2.6E-6) and 1.1 more training sessions per week (p = 9.4E-7) compared to male controls on average, and the DNAmFitAge and DNAmVO2max results correspond to male body builders being estimated as more physically fit, as expected. The study was underpowered for females with only 30 female body builders, however we did examine the relationship of the DNAm fitness biomarkers in females and expectedly (due to the small sample size) did not find a significant difference between female body builders and controls. Our promising results in male body builders show a physically fit lifestyle corresponds to biological aging benefits that can be captured with our new DNAm fitness biomarkers and DNAmFitAge.

Table 4. Comparison between male controls and body builders in Polish study.

Mean controlMean body builderControl - body builderKruskal wallis p-value
(n = 149)(n = 66)
Intensity trainings per week34.1−1.19.43E-07
Years regular training6.612−5.42.61E-06
Boxplots showing spread of DNAm biomarkers between male controls (n = 149) and male body builders (n = 66) in the Polish study. (A) DNAmFitAge is younger on average in the male body builders, (B) DNAmVO2max is fitter on average in the male body builders, (C) FitAge Acceleration and (D) DNAmGaitspeed are suggestively improved in body builders but not significantly different at 0.05.

Figure 5. Boxplots showing spread of DNAm biomarkers between male controls (n = 149) and male body builders (n = 66) in the Polish study. (A) DNAmFitAge is younger on average in the male body builders, (B) DNAmVO2max is fitter on average in the male body builders, (C) FitAge Acceleration and (D) DNAmGaitspeed are suggestively improved in body builders but not significantly different at 0.05.

Dietary supplement use cannot explain improvement in DNAmFitAge, but multivitamin dietary supplements are associated with improvement in DNAmVO2max after controlling for athlete status and age in males. Males from the Polish Study who take multivitamins have a 0.68 mL/kg/sec fitter DNAmVO2max on average after adjusting for athlete status and age (p = 0.041, Supplementary Table 5). Multivitamins, energy, vitamin D, and Omega-3 all are disproportionately taken by the male body builders (Supplementary Table 6), however, supplement use is not sufficient to explain younger DNAmFitAge regardless of athlete status (Supplementary Table 5). These insignificant results may point to other components of athleticism that contribute to younger estimated biological ages, such as increased physical activity and decreased body fat. We note that supplement and athlete coefficients for multivitamins, proteins, and Omega-3 are statistically insignificant, but their relationships are in the expected direction for DNAmFitAge and DNAmVO2max. Our research does not establish the causative relationship of body building or supplement use on biological aging, but it does establish there are observable epigenetic benefits associated with being a male body builder.

Functional CpG annotation

The 627 CpG genomic locations used to construct the DNAm biomarker estimates were enriched in 5 gene sets, 11 cellular processes, and 7 molecular processes mostly related to inflammation at FDR Q-value < 0.05. Top enrichment results from GREAT analysis passing the Bonferroni p-value threshold of 0.05 are presented in Table 5A and complete GREAT results are presented in Supplementary Table 7. The top genes enriched include zinc ribbon domain containing 1 (ZNRD1; Bonferroni p = 0.005) and histocompatibility antigen (HLA-G; p = 0.02). Cellular processes relate to major histocompatibility complex (MHC) proteins (p = 3.1E-7) and molecular processes relate to peptide antigen binding (p = 0.032) and tapasin binding (p = 0.047). Tapasin is a MHC class I antigen-processing molecule present in the lumen of the endoplasmic reticulum [31]. The relationship to inflammation-based genes and processes like HLA, MHC, and tapasin support hypotheses relating physical fitness and systemic inflammation [32]. In addition, previous research found inflammation response and endoplasmic reticulum stress were down-regulated in people following a 12-week endurance exercise regime compared to the non-exercising control group [33]. Both biological findings are intriguing and may provide direction for studying modifiable methylation from fitness parameters.

Table 5A. Top GREAT CpG annotation results.

Observed regionsFold enrichmentBinomial p-valueBonferroni p-value
 MHC protein complex925.11.86E-103.11E-07
 Integral component of endoplasmic reticulum membrane213.74.49E-070.00075
 Intrinsic component of endoplasmic reticulum membrane213.66.39E-070.0011
MHC class II protein complex526.91.56E-060.0026
 Integral component of lumenal side of Endoplasmic reticulum membrane712.71.81E-060.0030
 Peptide antigen binding613.37.71E-060.032
 Tapasin binding2421.01.12E-050.047

Next, we examined the chromatin states of the genomic regions across the 627 CpG sites used for DNAm fitness biomarker construction and found CpG loci are significantly depleted in heavily acetylated promoters and transcription start sites (TSS) and enriched in regions with polycomb repressive complex 2 (PRC2) binding. The odds ratios (OR) are significantly less than one in the chromatin state PromF4 (heavily acetylated promoters, OR = 0.45, hypergeometric p = 6.5E-6) and TSS1 (acetylated TSS, OR = 0.37, p = 6.8E-6) (Table 5B). BivProm1 (OR = 1.50, p = 0.009), BivProm2 (OR = 1.76, p = 0.0006), and ReprPC1 (OR = 1.87, p = 0.007) regions are enriched in our DNAm fitness biomarkers and are known PRC2 binding sites [34]. BivProm1 and BivProm2 are weak bivalent promoters and ReprPC1 is a polycomb repressed region. Bivalent chromatin domains control expression of HOX and other developmental genes in all vertebrates. PRC2 is one of the main Polycomb repressive complexes (PRC) that act as negative epigenetic regulators of transcription; it helps to initiate gene silencing via H3K27 methylation [35]. These results coincide with the increasing observation that the process of development is connected to epigenetic aging and that PRC2 targets are enriched in the age-dependent methylome in human and mammals [12, 36].

Table 5B. Chromatin state enrichment.

StateDescriptionNumber of CpG lociOdds ratioHypergeometric p-value
PromF4Promoter; heavily acetylated - flanking tss downstream bias250.456.5E-06
TSS1TSS more acetylated and active150.376.8E-06
BivProm2Weak bivalent promoter- stronger on H3K27me3431.760.00057
TxEx3Exon; H3K36me3 strong40.300.0030
DNase1DNase I only132.410.0041
ReprPC1Polycomb repressed; H3K27me3 strong and H3K4me1 weak211.870.0065
BivProm1Weak bivalent promoter - more balanced H3K4me3/H3K27me3431.500.0092

Approximately 10% of CpG sites used to construct DNAm fitness biomarkers are conserved in other epigenetic clocks with 25% of the coefficients in the same direction. Fifty-six (out of 627) CpG sites are conserved in at least one other epigenetic clock; 7 in DNAmPhenoAge, 2 in DNAmAge, 15 in DNAmAgeHannum, 23 in DNAmAgeSkinBlood (Supplementary Table 8), and 14 in DNAmGrimAge. The most conserved CpG site was cg26842024; this is used in the male DNAmGaitspeed model and is in all clocks except Hannum. CpG sites were chosen in multiple DNAm fitness models corresponding to 46 coefficients to compare to other epigenetic clocks. In total, 11 coefficients were in the same direction as other clocks. The remaining 90% of the CpG sites used for DNAm fitness biomarkers suggest new areas of the epigenome to study that may be responsive to physical activity.


DNAm biomarkers have been constructed for blood cell count [37], age [12, 13], smoking [15], and more, however, there were not yet DNAm biomarkers for fitness parameters. Our work introduces new DNAm biomarkers for the fitness parameters of maximum handgrip strength, gait speed, FEV1, and VO2max. These DNAm biomarkers represent new tools for researchers interested in studying the epigenetic components to physical fitness.

DNAm biomarkers have been improved by incorporating phenotypic information [14, 15], however, DNAm biomarkers had not yet incorporated physical fitness. DNAmFitAge provides researchers a novel indicator of biological age which combines physical fitness and epigenetic health. This biomarker integrates the established DNAm prediction of mortality risk (DNAmGrimAge) with the newly developed DNAm predictions of fitness. Higher values of DNAmGaitspeed, DNAmGripmax, DNAmFEV1, and DNAmVO2max, which reflect greater physical fitness, correspond to younger estimated biological ages in men and women. We demonstrate physically fit lifestyles have younger biological ages and fitter DNAm fitness biomarkers, which we observe in people of low to intermediate physical activity levels across five large-scale validation datasets and in male body builders who have intense, athletic exercise regimes. Furthermore, FitAgeAcceleration is strongly associated with a host of age-related conditions and predicts time-to-death and time-to-CHD across validation datasets. FitAgeAcceleration provides a novel measure of epigenetic age acceleration that is expected to be particularly sensitive to exercise interventions.

We acknowledge the following limitations. First, the DNAm fitness parameter biomarkers lead to only modest improvement to estimate fitness parameters after including age and sex as covariates in validation datasets. This reflects the relatively weak signal present in blood for fitness parameters. Because of the biomarkers’ limited correlation, DNAm fitness biomarkers should not replace true fitness parameters. Instead, the main benefit of our biomarkers is that they show blood epigenetic changes accompany physical fitness. These biomarkers advance the molecular understanding of exercise benefits, which we hypothesize to be most pronounced in athletic populations as illustrated in our analysis of body builders. The male body builders had a mean 2.7 year reduction in DNAmFitAge compared to controls, whereas the intermediate physically active people had at most a mean 0.33 year reduction in DNAmFitAge (WHI). Second, our DNAmVO2max biomarker was only validated in one dataset with VO2max; more research is needed to evaluate how our DNAmVO2max biomarker performs across a range of independent datasets.

Overall, DNAmGaitspeed, DNAmGripmax, DNAmFEV1, DNAmVO2max, and DNAmFitAge provide epigenetic components to evaluating a person’s physical fitness. Physically fit people have a younger DNAmFitAge and younger FitAgeAcceleration, and younger values are associated with more physical activity and better age-related outcomes. Our research suggests exercise and stronger fitness parameters are protective to DNAmFitAge in both sexes. We expect DNAmFitAge will be a useful biomarker for quantifying fitness benefits at an epigenetic level and can be used to evaluate exercise-based interventions.


Study cohorts

We analyzed blood DNAm data from three datasets, Framingham Heart Study Offspring cohort (FHS, n = 1830), Baltimore Longitudinal Study on Aging (BLSA, n = 820), and novel data (Budapest, n = 307) to develop DNAm biomarkers of fitness parameters. In short, the FHS cohort is a cardiovascular study which followed adults from Massachusetts starting in 1948 [38]. The BLSA cohort began in 1958 studying healthy adults and the aging process [39]. Finally, Budapest is a smaller study (n = 307) measuring physical fitness and DNA methylation in middle to older aged adults, some of whom are current or former athletes. More details of the Budapest study can be found in Supplementary Note 1. Dataset harmonization was performed to join multiple datasets when variables were on different scales following previously developed methods [40]. In brief, datasets were rescaled to have the same mean and standard deviation for each fitness parameter by recentering and multiplying by the ratio of standard deviations.

We conducted validation analysis in an independent group of six additional datasets: two Lothian Birth Cohorts: LBC1921 (n = 692) and LBC1936 (n = 2797), Comprehensive Assessment of Long-term Effects of Reducing Intake of Energy (CALERIE, n = 578), InChianti (n = 924), Jackson Heart Study (JHS, n = 1746), and Women’s Health Initiative (WHI, n = 2117). Descriptive statistics of each dataset are presented in Supplementary Table 9. Full study descriptions for validation datasets have previously been published [4146]. We evaluate our biomarkers in a novel Polish study which collects DNA methylation and dietary supplements in body builders and controls to assess performance in an athletic population. Additional details of the Polish study can be found in Supplementary Note 1.

DNAm fitness parameter biomarker development

We developed DNAm biomarkers for four fitness parameters: gait speed, maximum handgrip strength (Gripmax), forced expiratory volume in 1 second (FEV1), and maximal oxygen uptake (VO2max). Gait speed, also known as walking speed, is measured in meters per second [47]. Maximum hand grip strength is a measurement of force taken in kg [3]. FEV1 measures lung function; it is the amount of air forced from the lungs in one second, measured in liters [8]. VO2max is a measure of cardiovascular health and aerobic endurance [4, 47]. It measures the volume of oxygen the body processes during incremental exercise in milliliters used in one minute of exercise per kilogram of body weight (mL/kg/min). VO2max has been regarded as the best indicator of an athlete’s physical capacity and is the international standard of physical capacity [47].

Each fitness DNAm biomarker was developed using LASSO penalized regression with 10-fold cross validation in which the fitness parameters were dependent variables and independent variables were DNAm levels at cytosine-phosphate-guanines (CpG) sites and chronological age. The LASSO-regression method uses an l1 penalty that shrinks each coefficient towards zero. LASSO is more effective than Ridge (l2 penalty) and elastic net (mixture of l1 and l2 penalty) when handling many irrelevant predictors and yields smaller number of predictors in the final model. Models were fit separately for men and women in the case of gait speed, gripmax, and FEV1 to select for sex specific CpG loci that reflect gender variation in fitness. The selected covariates and estimated coefficients were then used to form a prediction algorithm for each fitness parameter. We refer to the predicted fitness parameters generated by these algorithms as DNAmGaitspeed, DNAmGripmax, DNAmFEV1, and DNAmVO2max. Correlation of each DNAm biomarker with measured fitness values in the training data are displayed in Supplementary Figure 3.

When it came to building the biomarker for VO2max, stratifying by sex was not feasible due to the smaller sample size. This forced us to choose between using sex as a covariate or omitting sex and trusting LASSO to select X chromosome markers that best signify differences between males and females. We chose the latter, and it did. Finally, we present two models for DNAmGaitspeed and DNAmGripmax; one with chronological age and one without chronological age as potential covariates. Removing age as a potential variable for selection in LASSO was performed to remedy high collinearity discovered among these DNAm biomarkers when constructing DNAmFitAge (scatterplot matrix in Supplementary Figure 4).

DNAm fitness parameter biomarker validation

We conducted two validation analyses of DNAm biomarkers of fitness parameters using up to five independent datasets. First, we correlated DNAm biomarker values with direct measurements of the fitness parameters. In cases where direct measurement of a fitness parameter was not included in a validation dataset, substitutions were selected. Briefly, gait speed was substituted with a composite leg strength measurement and a composite physical functioning score; FEV1 was substituted by forced expiratory volume (FEV) and VO2max; VO2max was substituted by FEV. Details are reported in Supplementary Note 2.

Second, we evaluated if using our DNAm biomarkers improve estimation of fitness parameters beyond variation explained through age and sex (null models) by evaluating the significance of the DNAm biomarker as a predictor. Pearson correlations of null models are presented in Supplementary Table 1. The reported p-values indicate the significance of the DNAm biomarker estimate as a predictor for the fitness parameters. The individual- dataset and fixed-effects meta-analysis p-values are calculated across validation datasets with the most relevant variables available in more than one dataset. Specifically, LBC1921 and LBC1936 were used for DNAmGaitSpeed and DNAmFEV1 meta-analysis p-value calculations. LBC1921, LBC1936, CALERIE, and WHI were used for DNAmGripmax. We did not calculate a meta-analysis p-value for DNAmVO2max because only one validation dataset had VO2max measurement.

DNAmFitAge: biological age estimation

DNAmFitAge development

We constructed DNAmFitAge as an indicator of biological age following the methods proposed by Klemera and Doubal [48]. In brief, the Klemera-Doubal model framework stipulates there exists an underlying trait which is unobserved (biological age) which relates to an observable trait (chronological age) and a set of additional variables. This framework posits biological age is centered on chronological age with additional noise. Weighted least squares is used to estimate the relationship of the additional variables with biological age where the weights are formed from correlations of each variable with chronological age.

DNAmFitAge is constructed separately for males and females using four DNAm variables: three of the DNAm fitness biomarkers: DNAmGripmax, DNAmGaitSpeed, and DNAmVO2max, and DNAmGrimAge, a biomarker of mortality risk [15]. We estimate biological age using the TrueTrait function from the WGCNA R package which carries out the Klemera Doubal method described above. Variable weights indicating each variable’s importance for estimating biological age are presented in Table 2A. Pearson’s correlation among original fitness parameters, DNAm biomarkers, and DNAmFitAge in the large training dataset (FHS + BLSA) are displayed in Supplementary Figure 2. Pearson’s correlation of DNAmFitAge to chronological age in training data are presented in panels A and B of Supplementary Figure 2. Models including DNAmFEV1 as a fifth variable were explored, however no improvement in association to physical activity or age-related outcomes were observed; the parsimonious DNAmFitAge model using a subset of the DNAm fitness biomarkers was therefore chosen.

Finally, we created FitAgeAcceleration, the age-adjusted estimate of DNAmFitAge formed from taking the residuals after regressing DNAmFitAge onto chronological age. As such, FitAgeAcceleration is uncorrelated with chronological age. FitAgeAcceleration provides an estimate of epigenetic age acceleration, i.e., how much older or younger a person’s estimated biological age is from expected chronological age. A positive FitAgeAcceleration means biological age is estimated to be older than chronological age. A negative FitAgeAcceleration means biological age is estimated to be younger than chronological age, which is the preferred outcome for a person.

DNAmFitAge validation

DNAmFitAge validation analysis consisted of three components: correlating DNAmFitAge to chronological age, testing FitAge Acceleration association with physical activity, and testing FitAge Acceleration association to aging-related variables in the validation datasets. First, the modeling framework posits biological age is centered on chronological age, therefore validation datasets should demonstrate good correlation and general centeredness between DNAmFitAge and chronological age. Both properties would indicate DNAmFitAge can quantify age. Second, DNAmFitAge incorporates fitness, therefore FitAgeAcceleration (age adjusted DNAmFitAge) should relate to physical activity and physical functioning. These relationships would indicate DNAmFitAge relates to fitness. Third, DNAmFitAge provides insight to the aging process through a fitness paradigm, therefore FitAgeAcceleration should relate to aging-related phenotypes.

We correlate DNAmFitAge with chronological age for males and females because (1) we cannot directly measure biological age, (2) chronological age is not used when forming DNAmFitAge estimates, and (3) the modeling framework posits biological age is centered on chronological age. In addition, because DNAmFitAge is built in males and females separately, we demonstrate what happens when the model is applied to the opposite sex (i.e., male model in females or female model in males). Median absolute deviation, mean deviation, and Pearson correlation are presented in Supplementary Table 2 and correlation is displayed in Figure 2.

We tested for associations between physical activity or physical functioning in low to intermediate physically fit individuals with FitAgeAcceleration, DNAm fitness parameter biomarkers, and other DNAm biomarkers known to relate to physical health. We restricted our analysis to people of low to intermediate fitness to determine if FitAgeAcceleration is more sensitive to small improvements in fitness compared to other current DNAm biomarkers. In addition, this separation captures low to average physically active individuals in each dataset. In short, LBC1921, LBC1936, and JHS measure physical activity, and WHI and InChianti measure physical functioning. Higher values of any variable indicate more activity or better physical functioning. Other DNAm biomarkers which relate to physical health include DNAmPhenoAge [14], DNAmGrimAge, DNAmPAI-1, and DNAmGDF-15 [15]. See Supplementary Note 2 for a thorough description of physical activity variables and inclusion criteria. We use p-values across models to compare DNAm biomarker performance, however other methods could be used that may be more valuable, like likelihood ratio tests or AIC. We chose not to use other methods because of high collinearity among the DNAm biomarkers and the succinctness of p-values.

We tested DNAmFitAge associations to multiple aging-related variables in validation datasets. Specifically, we conducted regression analysis of physical activity, time-to-death, time-to-coronary-heart-disease (CHD), the count of age-related conditions (arthritis, cataract, cancer, CHD, CHF, emphysema, glaucoma, lipid condition, osteoporosis, and type 2 diabetes), age at menopause, cancer, hypertension, type-2 diabetes, and disease-free status. Time-to-event outcomes were analyzed using Cox regression to estimate hazard ratios (HR); continuous outcomes were analyzed using linear regression to estimate slopes; dichotomous outcomes were analyzed using logistic regression to estimate odds ratios (OR); and ordinal outcomes were analyzed using multinomial regression to estimate OR. Some of our cohorts (InChianti, LBC1921, and LBC1936) involved longitudinal measures. In these cases, linear regression models with person-level random intercepts were implemented in R using the lmer function to adjust for correlation within the same individual. Logistic regression models were estimated using generalized estimating equations with the R function gee. Multinomial models were implemented using R function multinom.

FitAgeAcceleration was also explored for explaining information in time-to-death and number of comorbidities beyond what is captured through other epigenetic clocks. DNAmFitAge is built using DNAmGrimAge, and DNAmGrimAge and other epigenetic clocks are known to be associated with age-related conditions. Therefore, FitAgeAcceleration (the age-adjusted measure) is compared to other epigenetic biomarkers using a Likelihood Ratio Test (LRT) in two nested models stratified by sex; one includes age and one other epigenetic clock, and the other includes age, the other epigenetic clock, and FitAge Acceleration. LRTs and corresponding p-values are presented for validation datasets in Supplementary Table 4 for DNAmGrimAge, DNAmPAI1, DNAmGDF15, DNAmAgeHannum, and DNAmAgeSkinBloodClock. We excluded LRTs for other health related outcomes (like disease free status) because those models are constructed from generalized estimating equations (GEE) which are not based on likelihoods, and likelihoods are necessary to compare LRTs.


We combine results across validation studies using fixed effect models or Stouffer’s meta analysis method using the metafor R function. Fixed effect models use the inverse variance to weight estimates, and Stouffer’s method uses the square root of the sample size to weight estimates. The latter is used when harmonization across cohorts was challenging; such as with physical activity variables, the number of age-related conditions, disease free status, and age at menopause. Forest plots evaluating FitAgeAcceleration hazard ratios or coefficients in models adjusted for age and sex are displayed in Figure 3 and Supplementary Table 3. We perform a test of heterogeneity for coefficients across datasets using Cochran Q test for fixed effect models; p-values are displayed as Het. P.

DNAmFitAge evaluation in body builders

We evaluated whether our DNAm fitness biomarkers and DNAmFitAge were significantly different in an independent study of male body builders and controls. There was a total of 66 male body builders and 149 male controls with similar age distributions (p-value > 0.05). Both groups reported the number of years they regularly trained, the average number of intensity trainings they participated in per week, and 88 total participants reported supplements or drugs they are taking. We analyzed whether the DNAm fitness biomarkers, DNAmFitAge, or FitAge Acceleration were different between male controls and body builders using a Kruskal Wallis test (Table 4).

We evaluate whether the improvement in DNAmFitAge and DNAmVO2max in male body builders can be explained by the dietary supplements taken using a linear regression model with DNAmFitAge or DNAmVO2max as the outcome with age as a covariate and indicator variables for taking the supplement and being a body builder. We adjust for age in the model because age was significantly related to taking certain supplements, therefore if age was not included, the differences observed in DNAmFitAge or DNAmVO2max may actually represent differences in chronological ages between supplement usage groups. Linear model results are presented in Supplementary Table 5. To ensure adequate power, we evaluated supplements and drugs with at least 10 people reporting use across both body builders and controls. Only six supplements met this threshold: multivitamins (n = 19), protein (n = 17), energy (n = 17) (creatine, pre-workout, and energy gels), magnesium (n = 16), vitamin D (n = 14), and omega-3 (n = 12). We also evaluated if these supplements were disproportionately taken by male body builders compared to male controls using Fisher’s Exact test (Supplementary Table 6).

Functional CpG annotation

We provide biological insight to the 627 unique CpG loci used in constructing our DNAm fitness biomarkers by exploring genomic enrichment in the entire human genome, analyzing specific enrichment in chromatin states, and comparing CpG loci and coefficients to other epigenetic clocks. We use the GREAT enrichment analysis software tool for analyzing broad genomic enrichment [49]. GREAT analyzes the genes within and nearby the genomic region covered by the CpGs. To avoid confounding the enrichment analysis by gene size, the GREAT algorithm performs a binomial test (over genomic regions) using a whole genome background. We performed the enrichment based on default settings (Proximal: 5.0 kb upstream, 1.0 kb downstream, plus Distal: up to 1,000 kb) using the hg19 assembly. We report nominal, Bonferroni, and FDR p-values for gene, biological, cellular, and molecular function in Table 5A for the top results, and complete results are presented in Supplementary Table 7.

To annotate the CpGs used to construct the DNAm fitness biomarkers based on chromatin state, we assigned a state for the CpGs based on the detailed universal ChromHMM chromatin state annotation of the human genome in which chromatin structure and their associated characteristics are annotated [34]. This annotation generated 100 distinct states using 1,032 experiments into 16 major categories such as active and weak enhancers (EnhA, EnhW), bivalent states associated with promoters (BivProm), flanking promoter states (PromF), polycomb repressed states associated with H3K27me3 (ReprPC), and states associated with exons and transcription. We used one-sided hypergeometric tests to study both the enrichment (OR >1) and depletion (OR <1) patterns of CpGs across the chromatin states as detailed in [50]. Genomic CpG regions on the 450K array with chromatin state information were used as background (n = 483,090). The genomic regions of DNAm fitness biomarker CpG sites with chromatin state information were used as foreground (n = 626), which only excluded 1 CpG. This yielded one-sided hypergeometric p-values not confounded by the number of CpGs within a gene. We report the chromatin state, number of CpG loci enriched in each state, Odds Ratios, and hypergeometric p-values in Table 5B, and complete results are presented in Supplementary Table 10. Because the underlying chromatin states follow a multinomial distribution, we do not adjust our p-values for multiple comparisons.

Finally, we compared the CpG loci and model coefficients used in construction of the DNAm fitness biomarkers to the CpG loci used in DNAmPhenoAge, DNAmAge (Horvath 2013), DNAmAgeHannum, and DNAmAgeSkinBlood. For CpG loci conserved across other epigenetic clocks, we report the DNAm fitness biomarker coefficients, other clock coefficients, overlap with other clocks, and whether the coefficient direction is the same in Supplementary Table 8. We also compare the overlap of CpG loci used in our models and DNAmGrimAge but omit the comparison of coefficient direction to prevent disclosure of intellectual property. For a full list of coefficients and CpG loci used in DNAm fitness biomarker construction, see our GitHub repository at


DNAm: DNA methylation; FEV: forced expiratory volume; FEV1: forced expiratory volume in one second; VO2max: maximal oxygen uptake; Gripmax: maximum handgrip force; FHS: Framingham Heart Study Offspring Cohort; BLSA: Baltimore Longitudinal Study on Aging; LBC1921: Lothian Birth Cohort 1921; LBC1936: Lothian Birth Cohort 1936; CALERIE: Comprehensive Assessment of Long-term Effects of Reducing Intake of Energy; JHS: Jackson Heart Study; WHI: Women’s Health Initiative; CpG: cytosine-phosphate-guanine; CHD: time-to-coronary-heart-disease; OR: odds ratios; GREAT: Genomic Regions Enrichment of Annotations Tool; FDR: false discovery rate.

Author Contributions

Conceptualization: KMM, ZR, SH, FT; methodology: KMM, SH, ATL, AB; statistical analysis: KMM, ATL, FT; writing and editing: KMM, DWB, SH, ZR, EP, FT, REM, LMR, and others; data contribution: ZR, SH, EP, WB, DWB, LF, REM, MK, DLC, AO, AS, MS, LMR, APR. All authors helped with manuscript preparation and interpreted results.


The authors wish to thank the staff, investigators, and study participants from every study for making this research possible. We would also like to thank Robert (Bobby) Brooke for reviewing and making suggestions to this manuscript. The authors also thank Joanna Rudnicka and Aleksandra Pisarek (Jagiellonian University in Kraków) for their contribution to sample collection and sample and pre-processing; Piotr Kaczka, Małgorzata Michalczyk and Michał Krzysztofik (Academy of Physical Education in Katowice) for their contribution to sample collection; Dagmara Lisman (Pomeranian Medical University in Szczecin), Paulina Pruszkowska-Przybylska and Magdalena Kobus (University of Łódź) for sample, metric and demographic data collection; Bożena Wysocka, Aleksander Masny and Michał Boroń (Central Forensic Laboratory of the Police in Warsaw) for laboratory analyses. The authors would like to thank all sample donors for their contribution to this project.

Conflicts of Interest

S.H. is a founder of the non-profit Epigenetic Clock Development Foundation which plans to license several patents from his employer UC Regents. These patents list SH as inventor. R.E.M has received a speaker fee from Illumina and is an advisor to the Epigenetic Clock Development Foundation. The other authors declare no conflicts of interest.

Ethical Statement and Consent

The authors of this manuscript certify that they comply with the ethical guidelines for authorship and publishing in the Journal of Aging (Albany NY, USA). Consent to conduct this research has been approved by University of California Los Angeles IRB board under IRB#15-001479 for Epigenetic Analyses of Aging, IRB#16-002028 for building DNA methylation based biomarkers, and IRB#18-000315 for Validation of Epigenetic Clocks.


SH, KMM, REM, and ATL supported from 1U01AG060908. ZR, FT, and MJ supported from the National Excellence Program (126823), Scientific Excellence Program, TKP2020-NKA-17. REM: Alzheimer’s Society AS-PG-19b-010. SRC: Sir Henry Dale Fellowship by Wellcome Trust and Royal Society grant 221890/Z/20/Z.

FHS is funded by National Institute of Health (NIH) contract N01-HC-25195 and HHSN268201500001I. The laboratory and analytical work: the Division of Intramural Research, National Heart, Lung, and Blood Institute (NHLBI), NIH. The analytical component was also funded by the Center for Information Technology, NIH.

The Jackson Heart Study (JHS) is supported and conducted in collaboration with Jackson State University (HHSN268201800013I), Tougaloo College (HHSN268201800014I), the Mississippi State Department of Health (HHSN268201800015I) and the University of Mississippi Medical Center (HHSN268201800010I, HHSN268201800011I and HHSN268201800012I) contracts from the National Heart, Lung, and Blood Institute (NHLBI) and the National Institute on Minority Health and Health Disparities (NIMHD). The authors also wish to thank the staffs and participants of the JHS.

The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute; the National Institutes of Health; or the U.S. Department of Health and Human Services.

WHI program is funded by the NHLBI, and the U.S. Department of Health and Human Services through contracts HHSN268201600018C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C, and HHSN268201600004C.

The LBC1921 was supported by the UK’s Biotechnology and Biological Sciences Research Council. The LBC1936 is supported by Age UK, the Medical Research Council (G0701120, G1001245, MR/M013111/1, MR/R024065/1), and the University of Edinburgh. Methylation typing was supported by Centre for Cognitive Ageing and Cognitive Epidemiology (Pilot Fund award), Age UK, The Wellcome Trust Institutional Strategic Support Fund, The University of Edinburgh, and The University of Queensland.

CALERIE was conducted in collaboration with Columbia University Mailman School of Public Health with support from National Institute on Aging grants R01AG061378 and R33AG070455.

Research on the Polish Study is financed by the National Centre for Research and Development (NCBR) in Poland within the framework of call 10/2019 related to scientific research and studies for national defense and security [project no. DOB-BIO10/06/2019].


  • 1. Harridge SD, Lazarus NR. Physical Activity, Aging, and Physiological Function. Physiology (Bethesda). 2017; 32:152–61. [PubMed]
  • 2. Agusti A, Faner R. Lung function trajectories in health and disease. Lancet Respir Med. 2019; 7:358–64. [PubMed]
  • 3. Frederiksen H, Hjelmborg J, Mortensen J, McGue M, Vaupel JW, Christensen K. Age trajectories of grip strength: cross-sectional and longitudinal data among 8,342 Danes aged 46 to 102. Ann Epidemiol. 2006; 16:554–62. [PubMed]
  • 4. Rapp D, Scharhag J, Wagenpfeil S, Scholl J. Reference values for peak oxygen uptake: cross-sectional analysis of cycle ergometry-based cardiopulmonary exercise tests of 10 090 adult German volunteers from the Prevention First Registry. BMJ Open. 2018; 8:e018697. [PubMed]
  • 5. van Oostrom SH, Engelfriet PM, Verschuren WMM, Schipper M, Wouters IM, Boezen M, Smit HA, Kerstjens HAM, Picavet HSJ. Aging-related trajectories of lung function in the general population-The Doetinchem Cohort Study. PLoS One. 2018; 13:e0197250. [PubMed]
  • 6. Jackson AS, Sui X, Hébert JR, Church TS, Blair SN. Role of lifestyle and aging on the longitudinal change in cardiorespiratory fitness. Arch Intern Med. 2009; 169:1781–7. [PubMed]
  • 7. Radak Z, Torma F, Berkes I, Goto S, Mimura T, Posa A, Balogh L, Boldogh I, Suzuki K, Higuchi M, Koltai E. Exercise effects on physiological function during aging. Free Radic Biol Med. 2019; 132:33–41. [PubMed]
  • 8. Ching SM, Chia YC, Lentjes MAH, Luben R, Wareham N, Khaw KT. FEV1 and total Cardiovascular mortality and morbidity over an 18 years follow-up Population-Based Prospective EPIC-NORFOLK Study. BMC Public Health. 2019; 19:501. [PubMed]
  • 9. Rezwan FI, Imboden M, Amaral AFS, Wielscher M, Jeong A, Triebner K, Real FG, Jarvelin MR, Jarvis D, Probst-Hensch NM, Holloway JW. Association of adult lung function with accelerated biological aging. Aging (Albany NY). 2020; 12:518–42. [PubMed]
  • 10. Lazarus NR, Lord JM, Harridge SDR. The relationships and interactions between age, exercise and physiological function. J Physiol. 2019; 597:1299–309. [PubMed]
  • 11. Huggett DL, Connelly DM, Overend TJ. Maximal aerobic capacity testing of older adults: a critical review. J Gerontol A Biol Sci Med Sci. 2005; 60:57–66. [PubMed]
  • 12. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013; 14:R115. [PubMed]
  • 13. Hannum G, Guinney J, Zhao L, Zhang L, Hughes G, Sadda S, Klotzle B, Bibikova M, Fan JB, Gao Y, Deconde R, Chen M, Rajapakse I, et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol Cell. 2013; 49:359–67. [PubMed]
  • 14. Levine ME, Lu AT, Quach A, Chen BH, Assimes TL, Bandinelli S, Hou L, Baccarelli AA, Stewart JD, Li Y, Whitsel EA, Wilson JG, Reiner AP, et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging (Albany NY). 2018; 10:573–91. [PubMed]
  • 15. Lu AT, Quach A, Wilson JG, Reiner AP, Aviv A, Raj K, Hou L, Baccarelli AA, Li Y, Stewart JD, Whitsel EA, Assimes TL, Ferrucci L, Horvath S. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging (Albany NY). 2019; 11:303–27. [PubMed]
  • 16. Marioni RE, Shah S, McRae AF, Chen BH, Colicino E, Harris SE, Gibson J, Henders AK, Redmond P, Cox SR, Pattie A, Corley J, Murphy L, et al. DNA methylation age of blood predicts all-cause mortality in later life. Genome Biol. 2015; 16:25. [PubMed]
  • 17. Chen BH, Marioni RE, Colicino E, Peters MJ, Ward-Caviness CK, Tsai PC, Roetker NS, Just AC, Demerath EW, Guan W, Bressler J, Fornage M, Studenski S, et al. DNA methylation-based measures of biological age: meta-analysis predicting time to death. Aging (Albany NY). 2016; 8:1844–65. [PubMed]
  • 18. Belsky DW, Caspi A, Corcoran DL, Sugden K, Poulton R, Arseneault L, Baccarelli A, Chamarti K, Gao X, Hannon E, Harrington HL, Houts R, Kothari M, et al. DunedinPACE, a DNA methylation biomarker of the pace of aging. Elife. 2022; 11:e73420. [PubMed]
  • 19. Horvath S, Erhart W, Brosch M, Ammerpohl O, von Schönfels W, Ahrens M, Heits N, Bell JT, Tsai PC, Spector TD, Deloukas P, Siebert R, Sipos B, et al. Obesity accelerates epigenetic aging of human liver. Proc Natl Acad Sci U S A. 2014; 111:15538–43. [PubMed]
  • 20. Soerensen M, Li W, Debrabant B, Nygaard M, Mengel-From J, Frost M, Christensen K, Christiansen L, Tan Q. Epigenome-wide exploratory study of monozygotic twins suggests differentially methylated regions to associate with hand grip strength. Biogerontology. 2019; 20:627–47. [PubMed]
  • 21. Ferrari L, Vicenzi M, Tarantini L, Barretta F, Sironi S, Baccarelli AA, Guazzi M, Bollati V. Effects of Physical Exercise on Endothelial Function and DNA Methylation. Int J Environ Res Public Health. 2019; 16:2530. [PubMed]
  • 22. Spólnicka M, Pośpiech E, Adamczyk JG, Freire-Aradas A, Pepłońska B, Zbieć-Piekarska R, Makowska Ż, Pięta A, Lareu MV, Phillips C, Płoski R, Żekanowski C, Branicki W. Modified aging of elite athletes revealed by analysis of epigenetic age markers. Aging (Albany NY). 2018; 10:241–52. [PubMed]
  • 23. Hughes DC, Ellefsen S, Baar K. Adaptations to Endurance and Strength Training. Cold Spring Harb Perspect Med. 2018; 8:a029769. [PubMed]
  • 24. Studenski S, Perera S, Patel K, Rosano C, Faulkner K, Inzitari M, Brach J, Chandler J, Cawthon P, Connor EB, Nevitt M, Visser M, Kritchevsky S, et al. Gait speed and survival in older adults. JAMA. 2011; 305:50–8. [PubMed]
  • 25. Bohannon RW. Grip Strength: An Indispensable Biomarker For Older Adults. Clin Interv Aging. 2019; 14:1681–91. [PubMed]
  • 26. Clausen JSR, Marott JL, Holtermann A, Gyntelberg F, Jensen MT. Midlife Cardiorespiratory Fitness and the Long-Term Risk of Mortality: 46 Years of Follow-Up. J Am Coll Cardiol. 2018; 72:987–95. [PubMed]
  • 27. Babb TG, Long KA, Rodarte JR. The relationship between maximal expiratory flow and increases of maximal exercise capacity with exercise training. Am J Respir Crit Care Med. 1997; 156:116–21. [PubMed]
  • 28. Rahmania SK, Prabowo T, Tessa P. Correlation between Forced Expiratory Volume One Second and Vital Capacity with VO2 Maximum. Althea Med J. 2016; 3:430–3.
  • 29. Fatemi R, Ghanbarzadeh M. Relationship Between Airway Resistance indices and Maximal Oxygen Uptake in Young Adults. J Hum Kinet. 2009; 22:29–34.
  • 30. Ekelund U, Tarp J, Steene-Johannessen J, Hansen BH, Jefferis B, Fagerland MW, Whincup P, Diaz KM, Hooker SP, Chernofsky A, Larson MG, Spartano N, Vasan RS, et al. Dose-response associations between accelerometry measured physical activity and sedentary time and all cause mortality: systematic review and harmonised meta-analysis. BMJ. 2019; 366:l4570. [PubMed]
  • 31. Vigneron N, Peaper DR, Leonhardt RM, Cresswell P. Functional significance of tapasin membrane association and disulfide linkage to ERp57 in MHC class I presentation. Eur J Immunol. 2009; 39:2371–6. [PubMed]
  • 32. Silverman MN, Deuster PA. Biological mechanisms underlying the role of physical fitness in health and resilience. Interface Focus. 2014; 4:20140040. [PubMed]
  • 33. Munters LA, Loell I, Ossipova E, Raouf J, Dastmalchi M, Lindroos E, Chen YW, Esbjörnsson M, Korotkova M, Alexanderson H, Nagaraju K, Crofford LJ, Jakobsson PJ, Lundberg IE. Endurance Exercise Improves Molecular Pathways of Aerobic Metabolism in Patients With Myositis. Arthritis Rheumatol. 2016; 68:1738–50. [PubMed]
  • 34. Vu H, Ernst J. Universal annotation of the human genome through integration of over a thousand epigenomic datasets. Genome Biol. 2022; 23:9. [PubMed]
  • 35. Margueron R, Reinberg D. The Polycomb complex PRC2 and its mark in life. Nature. 2011; 469:343–9. [PubMed]
  • 36. Lu AT, Fei Z, Haghani A, Robeck TR, Zoller JA, Li CZ, Zhang J, Ablaeva J, Adams DM, Almunia J, Ardehali R, Arneson A, Baker CS, et al. Universal DNA methylation age across mammalian tissues. bioRxiv. 2021; 1–28.
  • 37. Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, Wiencke JK, Kelsey KT. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012; 13:86. [PubMed]
  • 38. Dawber TR, Meadors GF, Moore FE Jr. Epidemiological approaches to heart disease: the Framingham Study. Am J Public Health Nations Health. 1951; 41:279–81. [PubMed]
  • 39. Ferrucci L. The Baltimore Longitudinal Study of Aging (BLSA): a 50-year-long journey and plans for the future. J Gerontol A Biol Sci Med Sci. 2008; 63:1416–9. [PubMed]
  • 40. Key TJ, Appleby PN, Allen NE, Reeves GK. Pooling biomarker data from different studies of disease risk, with a focus on endogenous hormones. Cancer Epidemiol Biomarkers Prev. 2010; 19:960–5. [PubMed]
  • 41. Taylor AM, Pattie A, Deary IJ. Cohort Profile Update: The Lothian Birth Cohorts of 1921 and 1936. Int J Epidemiol. 2018; 47:1042–1042r. [PubMed]
  • 42. Stenholm S, Koster A, Valkeinen H, Patel KV, Bandinelli S, Guralnik JM, Ferrucci L. Association of Physical Activity History With Physical Function and Mortality in Old Age. J Gerontol A Biol Sci Med Sci. 2016; 71:496–501. [PubMed]
  • 43. Rickman AD, Williamson DA, Martin CK, Gilhooly CH, Stein RI, Bales CW, Roberts S, Das SK. The CALERIE Study: design and methods of an innovative 25% caloric restriction intervention. Contemp Clin Trials. 2011; 32:874–81. [PubMed]
  • 44. Waziry R, Corcoran DL, Huffman KM, Kobor MS, Kothari M, Kraus VB, Kraus WE, Lin DTS, Pieper CF, Ramaker ME, Bhapkar M, Das SK, Ferrucci L, et al. Effect of Long-Term Caloric Restriction on DNA Methylation Measures of Biological Aging in Healthy Adults: CALERIE™ Trial Analysis. medRxiv. 2021; 1–27.
  • 45. Taylor HA Jr, Wilson JG, Jones DW, Sarpong DF, Srinivasan A, Garrison RJ, Nelson C, Wyatt SB. Toward resolution of cardiovascular health disparities in African Americans: design and methods of the Jackson Heart Study. Ethn Dis. 2005; 15:S6-4-17. [PubMed]
  • 46. Manson JE, Chlebowski RT, Stefanick ML, Aragaki AK, Rossouw JE, Prentice RL, Anderson G, Howard BV, Thomson CA, LaCroix AZ, Wactawski-Wende J, Jackson RD, Limacher M, et al. Menopausal hormone therapy and health outcomes during the intervention and extended poststopping phases of the Women's Health Initiative randomized trials. JAMA. 2013; 310:1353–68. [PubMed]
  • 47. Rankovic G, Mutavdzic V, Toskic D, Preljevic A, Kocic M, Nedin Rankovic G, Damjanovic N. Aerobic capacity as an indicator in different kinds of sports. Bosn J Basic Med Sci. 2010; 10:44–8. [PubMed]
  • 48. Klemera P, Doubal S. A new approach to the concept and computation of biological age. Mech Ageing Dev. 2006; 127:240–8. [PubMed]
  • 49. McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, Lowe CB, Wenger AM, Bejerano G. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol. 2010; 28:495–501. [PubMed]
  • 50. Lu AT, Hannon E, Levine ME, Crimmins EM, Lunnon K, Mill J, Geschwind DH, Horvath S. Genetic architecture of epigenetic and neuronal ageing rates in human brain regions. Nat Commun. 2017; 8:15353. [PubMed]