DNA methylation-based measures of biological aging and cognitive decline over 16-years: preliminary longitudinal findings in midlife

DNA methylation-based (DNAm) measures of biological aging associate with increased risk of morbidity and mortality, but their links with cognitive decline are less established. This study examined changes over a 16-year interval in epigenetic clocks (the traditional and principal components [PC]-based Horvath, Hannum, PhenoAge, GrimAge) and pace of aging measures (Dunedin PoAm, Dunedin PACE) in 48 midlife adults enrolled in the longitudinal arm of the Adult Health and Behavior project (56% Female, baseline AgeM = 44.7 years), selected for discrepant cognitive trajectories. Cognitive Decliners (N = 24) were selected based on declines in a composite score derived from neuropsychological tests and matched with participants who did not show any decline, Maintainers (N = 24). Multilevel models with repeated DNAm measures within person tested the main effects of time, group, and group by time interactions. DNAm measures significantly increased over time generally consistent with elapsed time between study visits. There were also group differences: overall, Cognitive Decliners had an older PC-GrimAge and faster pace of aging (Dunedin PoAm, Dunedin PACE) than Cognitive Maintainers. There were no significant group by time interactions, suggesting accelerated epigenetic aging in Decliners remained constant over time. Older PC-GrimAge and faster pace of aging may be particularly sensitive to cognitive decline in midlife.

This preliminary study examined overall levels and changes in traditional and PC-based first- and second-generation epigenetic clocks and pace of aging measures in participants selected from a larger prospective cohort to represent extremes of maintained and declining cognitive function (termed Maintainers and Decliners, respectively) between a baseline visit when participants were in midlife and a second visit approximately 16 years later. We hypothesized that overall, cognitive Decliners would be biologically older compared to cognitive Maintainers. We also explored whether cognitive Decliners would show faster biological aging (i.e., steeper increases in DNAm over time) compared to cognitive Maintainers, and whether particular cognitive domains associated more strongly than others with measures of biological aging. We expected that PC-based clocks of enhanced reliability would outperform traditional clocks and that second-generation clocks and pace of aging measures trained to predict morbidity, mortality, and multi-system decline would outperform first-generation clocks optimized for age prediction. Notably, we tested several DNAm measures because a comparative analysis approach is recommended to simultaneously evaluate the utility of many DNAm measures and determine which ones are associated with aging outcomes of interest [17].

RESULTS
Neuropsychological tests were administered and biological age was estimated at both time 1 (T1) and time 2 (T2) for 24 people who declined in cognitive function (Decliners) and 24 who maintained cognitive function (Maintainers) from T1 to T2 (mean years between assessments = 15.9, range: 15.4 to 16.9), selected using an extreme groups approach (see Methods). Table 1 summarizes study participant characteristics. Decliners and Maintainers did not significantly differ on chronological age, sex, education, race, body mass index, smoking status, or T1 cognition (a composite score derived from neuropsychological tests for spatial reasoning, working memory, processing speed, executive function, and attention; see Methods). Decliners' cognitive composite decreased from T1 to T2 (T1M = 67.61; T2M = 53.89, p < 0.001) whereas Maintainers' cognitive composite did not change over time (T1M = 66.48; T2M = 67.56, p = .189). The observed decline exceeded one standard deviation, a clinically noticeable change in cognitive performance associated with risk for future cognitive impairments. Normative values on several neuropsychological tests were further examined to contextualize changes in the cognitive composite. As the sample performed above average at T1, the Decliners' change can be interpreted as moving from above average to average, whereas the Maintainers remained slightly above average at both time points (see Supplementary Results). All individuals in the Decliner and Maintainer groups denied being diagnosed with dementia. Adjudications were not performed, so clinical determinations regarding mild cognitive impairment (MCI) cannot be made. Table 2 DNAmAA measures were smaller within each time point, with the exception of Dunedin PoAm-AA and Dunedin PACE-AA, which were more strongly correlated with GrimAgeAA (r = .69-.77) and PC-GrimAgeAA (r = .68-.76), as well as with PhenoAgeAA (r = .46-.59) and PC-PhenoAgeAA (r = .37-.57) at T1 and T2.

Time and group main and interacting effects on DNAm
The traditional and PC-based epigenetic clocks and pace of aging measures significantly increased over time, generally consistent with or underestimating the time elapsed between study visits (Table 3 and Supplementary Table 1). With respect to group

Exploring specific cognitive components on DNAm
To further explore whether the several components of cognitive functioning associated differentially with PC-GrimAge and pace of aging measures, we conducted secondary analyses using the same adjusted multilevel model predicting T1 and T2 DNAm, but instead of the categorical Group predictor, we tested the continuous scaled version of each cognitive component at T2 to determine which cognition component(s) were significantly associated with DNAm-based measures of biological aging. We focused on T2 cognitive components because this was the time point that differentiated the two groups (see Supplementary Table 3).
Results are depicted in

DISCUSSION
This is the first report to explore changes over time in several of the latest DNAm biological aging measures (including traditional and PC-based epigenetic clocks and pace of aging measures) in an age-, race-, sex-, education-, cognition-, and body mass index-matched case-control comparison in which cases were selected for having cognitive performance declines on objective neuropsychological tests. There were no group differences in DNAm slopes over time, which may be due to low statistical power, but is in line with the few previous studies that have examined only first- and second-generation epigenetic clocks [6][7][8][9]. However, cognitive decline was related to an overall older PC-GrimAge and a faster pace of aging (Dunedin PoAm and Dunedin PACE) compared to those without cognitive decline over this 16-year time frame. These group differences remained statistically significant when corrected for multiple comparisons at a false discovery rate of .10.
There was no evidence of associations between the first-generation epigenetic clocks and cognitive decline. Rather, our findings point to the second-generation clock PC-GrimAge as being more sensitive to cognitive change, which aligns with others who report associations between GrimAge, but not Horvath or Hannum, and worse cognitive performance cross-sectionally [19], worse future cognitive performance [8], and cognitive decline from adolescence to age 45 [3] and from age 70 to 79 [20]. Notably, we did not observe associations between (PC)-PhenoAge and cognitive decline, which may be due to limited power, but is also consistent with other reports [3,8]. Although PhenoAge and GrimAge are both second-generation clocks, they differ in how they were trained: PhenoAge was created by identifying CpGs that predict a composite measure of mortality-related blood biomarkers (see Supplementary Materials for biomarker list) and chronological age [14]. Conversely, GrimAge was created by generating DNAm surrogates of morbidity- and mortality-related plasma proteins (see Supplementary Materials) and smoking pack-years; then time-to-death was regressed onto these DNAm surrogates, chronological age, and sex to identify the CpGs [12]. The blood-based biomarkers across both epigenetic clocks reflect the functioning of similar physiological systems (e.g., immune, kidney, metabolic), but GrimAge also explicitly includes the effects of smoking, which is an established risk factor for cognitive decline and dementia [21]. In addition, of the first- and second-generation clocks, GrimAge and PC-GrimAge tend to have the highest reliability due to their two-step DNAm calculation [3,15]; thus, this measurement property may also explain why GrimAge tends to outperform other clocks, including PhenoAge.
However, these reasons remain speculative and future studies with DNAm data should continue to evaluate and report associations across multiple DNAm measures (including the newest pace of aging measures, below) to facilitate comparison across studies, reconcile inconsistencies, and facilitate their inclusion in future meta-analyses and systematic reviews.
In addition to PC-GrimAge, faster pace of aging was associated with cognitive decline. This report is the first to replicate Belsky and colleagues' [2,3] findings of Dunedin PoAm and Dunedin PACE associating with cognitive decline. Our findings suggest that pace of aging measures, which were developed from Dunedin Study participants aged 26-45, can inform cognitive outcomes in middle-aged and older adults. Pace of aging measures may be particularly sensitive to preclinical cognitive changes because they are indexed by a longitudinal panel of biomarkers across multiple physiological systems, which may more closely reflect the mechanisms of cognitive decline, relative to first-generation epigenetic clocks that are optimized for age prediction. Interestingly, the epigenetic clocks that pace of aging was most strongly correlated with at T1 and T2 were GrimAge and PC-GrimAge (Figures 1, 2), suggesting that these DNAm measures may be detecting some shared biological aging signals. A limitation to the current DNAm measures is a lack of mechanistic understanding of their underlying biology. Current work is underway to deconstruct these DNAm composite measures into distinct "modules" that may reflect functionally related biological changes [22]. Each epigenetic clock is comprised of differing proportions of CpGs from a given module; however, in line with our findings, GrimAge and DunedinPoAm share a similar composition of modules and have higher quantities of modules that are stronger predictors of morbidity and mortality, as compared to PhenoAge, Horvath, and Hannum [22]. Continued efforts to examine the underlying mechanisms of DNAm measures will aid our understanding of why certain clocks outperform others in predicting health outcomes, including cognitive health.
All DNAm measures significantly increased over time; however, these estimates of biological aging did not increase between T1 and T2 more steeply in Decliners, compared to Maintainers, as evidenced by the absence of a significant group by time interaction. In other words, DNAm estimates of biological aging were associated with the 16-year change in cognitive functioning, but did not progress more rapidly in Decliners than among Maintainers, which may suggest that Decliners' accelerated profile of epigenetic aging was established prior to the initial assessment. However, we note that we had limited power to detect small and moderate effects (particularly interaction effects); therefore, we cannot confidently infer whether the non-significant group by time interactions are due to truly null effects and/or due to the smaller sample size.
In exploring whether particular cognitive domains may covary with PC-GrimAge and pace of aging measures more strongly than others, executive function showed the most consistent associations and withstood correction for multiple comparisons. One previous report links older epigenetic age estimated from other clocks, including Horvath's intrinsic and Hannum-derived extrinsic epigenetic age acceleration and PhenoAge, but not GrimAge, to poorer executive function in African Americans with HIV and a control group [23]; others report null associations between GrimAge and executive function composites [24,25], and between Dunedin PACE and one test of executive function, Trails B [26]. Therefore, evidence for associations between DNAm and specific cognitive domains remains inconclusive. Future studies will benefit from investigating separate cognitive domains (in addition to general composites, which is more commonly done), to shed light on which components of cognition may be more or less affected.
The current study focused on neuropsychologically-assessed cognitive decline, which can indicate future risk for dementia [27]. Indeed, in other studies, DNAm measures predicted MCI and clinical diagnosis of Alzheimer's Disease (e.g., [26,28]). No participants in our sample reported having a dementia diagnosis, but adjudications were not performed, so MCI status could not be assessed. However, descriptively, the group with cognitive performance decrements over time experienced greater than a standard deviation change in their average composite score, an indication they may be at future cognitive risk, with their T2 assessments falling slightly below normative values on several neuropsychological tests (see Supplementary Results). It remains unclear whether these individuals will manifest future cognitive impairments, but this magnitude of decline is considered clinically meaningful [29].
Strengths of this study include the longitudinal design with a relatively long follow-up of 16 years; the comprehensive assessment of cognition across several domains known to decline with age; and the recommended analysis of multiple DNAm measures [17] that allowed for comparisons across traditional and PC-based epigenetic clocks and pace of aging measures. However, this preliminary study had limited power to detect small and moderate effects (particularly interaction effects), although we maximized our ability to detect effects by selecting cognitive groups from the tails or extremes of the distribution of cognitive change. In addition, the cognition composite approach used to identify Cognitive Decliners vs. Maintainers assumed that the neuropsychological tests have the same meaning and factor structure across the 16-year time frame in both groups; our smaller, multi-group sample does not meet sample size recommendations for testing measurement invariance [30,31]. However, using a latent variable approach and testing measurement invariance is an important future direction for cognitive change research, and may yield stronger effects than a composite approach (e.g., [32]). Other limitations include only two time points for longitudinal analysis; limited generalizability in terms of education and race; and DNAm measured in blood but not the brain, although blood-brain global DNAm profiles are highly correlated (r = .86) [33].
In conclusion, these preliminary results suggest PC-GrimAge and DNAm-based pace of aging measures (Dunedin PoAm and PACE) associate with 16-year, neuropsychologically-validated cognitive decline in midlife. The results warrant a larger-scale study to better examine longitudinal associations between changes in DNAm measures and changes across multiple cognitive domains. Ultimately, establishing DNAm measures as biomarkers of cognitive function in midlife may offer pre-clinical markers of a molecular aging mechanism that can help identify individuals at increased risk for cognitive impairment and dementia in later life.

Participants
Participants were selected from a longitudinal arm of the Adult Health and Behavior (AHAB)-1 study, which comprises a registry of behavioral and biological measurements for the study of midlife individual differences [34]. AHAB-1 participants were first recruited at 30-54 years of age via mass-mail solicitation from southwestern Pennsylvania and were relatively healthy. Study exclusions at the time of initial recruitment (time 1) were a reported history of atherosclerotic cardiovascular disease, chronic kidney or liver disease, cancer treatment in the preceding year, and major neurological disorders, schizophrenia, or other psychotic illness. Other exclusions included pregnancy and reported use of insulin, glucocorticoid, antiarrhythmic, psychotropic, or prescription weight-loss medications. Baseline (T1) assessments occurred between 2001 and 2005 and follow-up (T2) assessments began in 2017 and are ongoing, with additional subjects being added at the time of writing.

Selection of participant groups
Using an extreme groups approach, a subset of AHAB-1 participants was selected for the current study: 24 Cognitive Decliners (i.e., those who showed the most decline in cognition from T1 to T2 based on changes in a cognitive composite score, described below) and 24 matched Cognitive Maintainers (i.e., those who maintained cognitive composite levels from T1 to T2, matched to Decliners on demographics and health). The selection was carried out in the following steps. First, from the 300 available AHAB-1 participants with both T1 and T2 data who were enrolled for follow-up (T2) evaluation between June 2017 and March 2020, we excluded those who reported medical conditions having potential cognitive sequelae, as might be associated with Alzheimer's disease, stroke, transient ischemic attack, multiple sclerosis, Parkinson's disease, epilepsy, brain cancer, or brain cyst, and people who endorsed having a head injury, concussion, or spinal cord injury. We also excluded people with diagnosed diabetes or HbA1c greater than or equal to 7%; individuals who reported exposure in the previous 12 months to any of the neurocognitive tests administered here; those who were missing more than 3 of 10 cognitive measurements used in the present analyses; and those for whom we lacked a stored T1 blood sample sufficient for DNA extraction and DNAm profiling. These exclusions resulted in 167 remaining participants. From the 167, we selected the 24 most extreme cognitive decliners, identified using the cognitive composite (described below). Next, we identified the 50 most extreme cognitive maintainers, and from those 50, matched on sex, race, T1 age, T1 education, T1 cognitive composite, and T1 body mass index to obtain the matched 24 cognitive maintainers. One-to-one multivariate matching based on Mahalanobis distance was performed using the Match function in R (Matching package) [35]. Matching was performed without replacement and by randomly breaking ties.
Groups (Decliners, Maintainers) were identified blind and prior to assessment of DNAm measures.
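The matching step can be illustrated with a minimal sketch. It is not the Match function from the Matching package: it assumes a simple greedy nearest-neighbor rule, uses only two covariates (hypothetically, age and years of education) rather than the six used in the study, and pools cases and controls to estimate the covariance matrix. All function names and data values are illustrative.

```python
import random


def covariance_2d(points):
    """Sample covariance terms (sxx, sxy, syy) for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_sq(a, b, cov):
    """Squared Mahalanobis distance between two 2-D points."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def match_one_to_one(cases, controls, seed=0):
    """Greedy 1:1 matching without replacement; ties are broken at random."""
    rng = random.Random(seed)
    cov = covariance_2d(cases + controls)
    available = set(range(len(controls)))
    matches = []
    for case in cases:
        dists = {j: mahalanobis_sq(case, controls[j], cov) for j in available}
        best = min(dists.values())
        tied = sorted(j for j, d in dists.items() if d == best)
        chosen = rng.choice(tied)
        available.remove(chosen)  # without replacement
        matches.append(chosen)
    return matches


# Hypothetical (age, education) pairs for two Decliners and four candidates.
decliners = [(45.0, 16.0), (50.0, 12.0)]
candidates = [(44.0, 16.0), (51.0, 12.0), (60.0, 20.0), (30.0, 10.0)]
matched_idx = match_one_to_one(decliners, candidates)
```

Here `matched_idx[i]` is the index of the candidate Maintainer paired with Decliner `i`. A greedy rule depends on the order in which cases are processed, which is one reason a dedicated matching routine is preferable in practice.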

Procedure
Sociodemographic, cognitive, psychosocial, and instrumented biological measurements were collected over multiple study visits at both T1 and T2. At T1, the neuropsychological tests used in the present analyses were administered at visit 1 and blood was drawn at visit 2. On average, there were 30.85 days between visits 1 and 2 for the sample analyzed (median = 25.5, range: 2 to 98). At T2, the neuropsychological tests used in the present analyses were administered at visits 2 and 3 and blood was drawn at visit 2. On average, there were 26.1 days between visits 2 and 3 for the sample analyzed (median = 16.5, range: 8 to 102). AHAB was approved by the University of Pittsburgh Institutional Review Board, and all participants provided written informed consent.

Demographic and health characteristics
Self-reported sex, race, years of education, and smoking status were assessed. Measures of height and weight were obtained to determine body mass index (in kg/m²).

Cognition
T1 and T2 neuropsychological tests used in the present analyses capture several domains of cognitive function: spatial reasoning, working memory, visuomotor processing speed, executive function, and attention. A cognition composite was used (described below).

Spatial reasoning
The Matrix Reasoning subtest from the Wechsler Abbreviated Scale of Intelligence [36,37] was used to assess spatial perception and reasoning. This test involves viewing an incomplete matrix and selecting the response option that completes the matrix. Higher scores correspond to better spatial reasoning.

Working memory
Working memory was assessed with the Digit Span subtest from the Wechsler Adult Intelligence Scale-III (WAIS-III) [37]. The participant is read sequences of numbers and is asked to recall the numbers in the same order (forward) or in reverse order (backward). Higher scores indicate better working memory.

Visuomotor processing speed
Participants completed the first parts of the Trail Making Test [38] and the Stroop Color-Word Test [39] to assess processing speed. Part A (in seconds) of the Trail Making Test requires participants to draw a line connecting circles numbered from 1 to 25 as quickly as possible. Higher scores correspond to poorer processing speed. The first two parts of the Stroop Color-Word Test require participants to (A) read aloud a list of color names (i.e., red, green, blue) printed in black ink and (B) name the colors of the inks (i.e., "XXXX" written in blue ink) as quickly as possible. Scores are the number of correct responses within a 45-second period, with higher scores indicating better performance.

Executive function
Participants were administered two tests of executive functioning: task switching on Part B of the Trail Making Test [38] and the interference score of the Stroop Color-Word Test [39]. The Trail Making Test Part B requires subjects to draw a line connecting numbered and lettered circles as quickly as possible, alternating between numbers and letters in ascending numerical and alphabetical order (e.g., 1-A-2-B-3-C…, etc.). To derive a measure of executive function relatively independent of psychomotor speed, time to completion of Part B is subtracted from Part A, such that higher scores indicate better performance. Assessing ability to resist cognitive interference, the Stroop Color-Word Test requires subjects to read aloud as quickly as possible from 3 pages of color word lists: pages 1 and 2 provide tests of processing speed, previously described. On Page 3 individuals are asked to report the color of the ink used to print the name of incongruent colors (e.g., "blue" for blue ink used to spell the color name "red"), thus requiring participants to inhibit a prepotent response (color word naming). Scores are the number of correct responses within a 45-second period, with higher scores indicating better performance.

Attention
Digit Vigilance pages 1 and 2 [40] were administered to assess vigilant visual tracking and capacity for sustained attention. This test requires participants to rapidly scan a page of numbers arrayed in rows and to cross out only digits designated as targets as quickly as possible. Time (in seconds) was recorded. Higher scores correspond to lower performance.

Cognition composite
A cognition composite was calculated using raw (not standardized or normed) test scores. First, the Trail Making Test Part A and Digit Vigilance times were multiplied by (-1) so that higher scores correspond to better performance; then the proportion of maximum scaling approach [41] was applied to the individual subtests. This approach transforms each score to a metric from 0 (minimum observed) to 1 (maximum observed) by first subtracting the minimum observed value, so that scores range from 0 to the highest observed value, and then dividing by that highest value. The resulting value between 0 and 1 was multiplied by 100. This approach does not change the multivariate distribution and covariance matrix of the transformed variables and is the recommended approach for longitudinal data [42]. The scaled individual tests (Matrix Reasoning, Digit Span forward and backward, Trail Making Test A and A-B, Stroop word, Stroop color, and Stroop color-word, and Digit Vigilance pages 1 and 2) were averaged together to create a cognition composite using all available data. At T1, no cognition data were missing. At T2, 1 participant was missing the Stroop test, 19 were missing Digit Vigilance pages 1 and 2, and 1 was missing just page 2. Higher composite scores indicate better cognition. Notably, this composite approach assumes that the individual neuropsychological tests have the same meaning and factor structure over time. The composite's multilevel reliability was calculated using coefficient omega (omegaSEM function in the multilevelTools package) and was adequate at both the between- (ω = .80, 95% CI [.62, .98]) and within-person levels (ω = .85, 95% CI [.79, .91]).
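The scaling and averaging steps can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function names are hypothetical, missing scores are represented by `None`, and the minimum and maximum are taken over whatever scores are passed in (for longitudinal data, presumably the scores pooled across T1 and T2).

```python
def poms(scores, reverse=False):
    """Proportion-of-maximum scaling to a 0-100 metric.

    reverse=True multiplies scores by -1 first, for timed tests where
    higher raw values mean worse performance. None marks a missing score.
    """
    observed = [(-s if reverse else s) for s in scores if s is not None]
    lo, hi = min(observed), max(observed)
    scaled = []
    for s in scores:
        if s is None:
            scaled.append(None)
        else:
            v = -s if reverse else s
            scaled.append(100.0 * (v - lo) / (hi - lo))
    return scaled


def composite(scaled_tests):
    """Average each person's available scaled tests (one list per test)."""
    n_people = len(scaled_tests[0])
    result = []
    for i in range(n_people):
        available = [test[i] for test in scaled_tests if test[i] is not None]
        result.append(sum(available) / len(available))
    return result


# Hypothetical raw scores for three people on two tests.
matrix_reasoning = [20, 30, 25]      # higher is better
trails_a_seconds = [40.0, 20.0, None]  # higher is worse; third person missing
scores = composite([poms(matrix_reasoning), poms(trails_a_seconds, reverse=True)])
```

Averaging over available tests means a person with a missing subtest still receives a composite, which matches the "using all available data" description above.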

Tissue acquisition and processing
Fasting blood was collected by a trained phlebotomist between 8:00am and 10:00am. Whole blood samples were frozen at −80°C until time of DNA extraction and analysis. DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen) at the UCLA Cousins Center for Psychoneuroimmunology. Purified DNA was concentrated using GeneJET PCR Purification Kit (Thermo Fisher) and suspended in the elution buffer to a minimum of 12.5 ng/µL before plating in a 96-well plate. DNA was quantified using the Quant-iT dsDNA Assay Kit, high sensitivity (Invitrogen).
Variability across assay chips was addressed by placing both samples from the same individual together on the same chip, with individuals randomly assigned to chips by ID. In addition, samples from Decliners and Maintainers were evenly distributed within each chip, and position within chip was randomized.

DNA methylation data pre-processing
Bisulfite conversion using the Zymo EZ DNA Methylation Kit (ZymoResearch, Orange, CA, USA) and subsequent hybridization of the Human Methylation 850 K EPIC chip (Illumina, San Diego, CA, USA) and scanning (iScan, Illumina) were performed by the UCLA Neuroscience Genomics Core facilities according to the manufacturer's protocols. DNA methylation image data were processed in R statistical software (version 4.1.1) using the minfi Bioconductor package (version 1.38.0) [43]. We checked for samples with >1% of sites with detection p-values >0.01 (n = 0) and for samples with DNA methylation predicted sex discordant with recorded sex (n = 0). The minfi preprocessNoob function was used to normalize dye bias and apply background correction before obtaining methylation beta-values.
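The sample-level detection p-value check can be sketched as below. This is an illustration of the stated rule (flag a sample when more than 1% of sites have a detection p-value above 0.01), not the minfi-based code used in the study; the function name and the dictionary layout are assumptions.

```python
def flag_failed_samples(detection_p, p_cutoff=0.01, max_fraction=0.01):
    """Return IDs of samples in which more than max_fraction of sites
    have a detection p-value greater than p_cutoff.

    detection_p maps a sample ID to a list of per-site detection p-values.
    """
    failed = []
    for sample_id, pvals in detection_p.items():
        n_bad = sum(1 for p in pvals if p > p_cutoff)
        if n_bad / len(pvals) > max_fraction:
            failed.append(sample_id)
    return failed


# Hypothetical data: 200 sites per sample.
example = {
    "sample_a": [0.001] * 199 + [0.50],        # 0.5% of sites fail
    "sample_b": [0.001] * 196 + [0.50] * 4,    # 2% of sites fail
}
failed_samples = flag_failed_samples(example)
```

In this toy input only `sample_b` exceeds the 1% threshold; in the study, no samples were flagged by this check.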

Covariates
Analyses were adjusted for participant age and sex. Additionally, because DNAm profiles may differ between cell subtypes [44] and cell composition changes with age, the percentages of six cell subtypes (CD8 total, CD4 total, NK cells, plasma blasts, monocytes, and granulocytes) were estimated from Horvath's website using the Houseman method [45] (and see [46] for validation) and further controlled for in sensitivity analyses. Some may consider controlling for cell subtypes to be unnecessary adjustment or overadjustment because cell subtypes may contribute to the observed differences in DNAm or be on a mediation pathway linking DNAm to aging outcomes; however, we present results both ways for interested readers.

Data analysis
All analyses were conducted using the traditional and PC-based epigenetic clocks and pace of aging measures.
Further mention of DNAm refers to all measures unless specified.
The DNAm measures were modeled individually in two multilevel models with repeated measures nested within person. Model 1 included the main effect of group (Maintainers, Decliners) and time (T1 and T2) on DNAm. Model 2 included the interaction between group and time to explore group differences in change in DNAm over time. All models controlled for baseline chronological age (grand mean centered at 44.79 years) and sex (0 = male, 1 = female, as a factor variable). Notably, because these statistical models control for level 2 (time-invariant) chronological age and include level 1 (time-varying) time as a predictor, our findings can be considered in terms of "age acceleration", which in cross-sectional studies is achieved by controlling for chronological age or outputting residuals from DNAm age regressed on chronological age. Sensitivity analyses further controlled for the percentages of six cell subtypes (CD8 T cells, CD4 T cells, NK cells, plasma blasts, monocytes, and granulocytes), treated as time-varying covariates.
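For readers who prefer equations, the two models can be written out as a sketch. The notation is ours, not taken from the original analysis code, and it assumes time is coded 0 (T1) and 1 (T2), group is coded 0 (Maintainer) and 1 (Decliner), and each person i has a random intercept.

```latex
% Model 1: main effects of time and group, with a random intercept u_{0i}
\mathrm{DNAm}_{ti} = \gamma_{00}
  + \gamma_{10}\,\mathrm{Time}_{ti}
  + \gamma_{01}\,\mathrm{Group}_{i}
  + \gamma_{02}\,(\mathrm{Age}_{i} - 44.79)
  + \gamma_{03}\,\mathrm{Female}_{i}
  + u_{0i} + e_{ti}

% Model 2: adds the group-by-time interaction
\mathrm{DNAm}_{ti} = \gamma_{00}
  + \gamma_{10}\,\mathrm{Time}_{ti}
  + \gamma_{01}\,\mathrm{Group}_{i}
  + \gamma_{11}\,(\mathrm{Group}_{i} \times \mathrm{Time}_{ti})
  + \gamma_{02}\,(\mathrm{Age}_{i} - 44.79)
  + \gamma_{03}\,\mathrm{Female}_{i}
  + u_{0i} + e_{ti}

% Assumed distributions of the random intercept and residual
u_{0i} \sim \mathcal{N}(0, \tau^{2}), \qquad e_{ti} \sim \mathcal{N}(0, \sigma^{2})
```

Under this coding, γ01 is the overall Decliner versus Maintainer difference in Model 1, and γ11 in Model 2 is the difference between groups in T1-to-T2 change.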
Statistical analyses were conducted in R version 4.1.1 using the nlme package (version 3.1.152). The variance-covariance structure was modeled as a random intercept in all models. Gamma weights (γ), analogous to unstandardized beta weights (i.e., a 1-unit change in the predictor [Decliner vs. Maintainer, or T1 vs. T2] is associated with γ-year change in the outcome), are reported with their 95% confidence intervals (CIs) in tables. We adjusted for multiple comparisons using the Benjamini-Hochberg (BH) correction (using the p.adjust function in R) [18]. To examine different levels of stringency, false discovery rates (FDRs) of .05 and .10 were calculated and chosen to ensure no true discoveries were missed while balancing the number of false positives. FDRs can be interpreted as the expected proportion of false positives among all statistically significant tests.
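The BH adjustment itself is simple enough to show in a few lines. The sketch below is a plain Python illustration of the standard procedure (the study used p.adjust in R); the function name and the p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest, tracking a running minimum
    # so that adjusted values are monotone in the raw p-values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four tests.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = bh_adjust(raw_p)
significant_at_05 = [p <= 0.05 for p in adj_p]
significant_at_10 = [p <= 0.10 for p in adj_p]
```

Comparing the adjusted p-values against .05 and .10 corresponds to the two FDR thresholds described above.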

Power considerations
We selected 24 participants per group to balance funding constraints with generating preliminary data. Although we maximized our ability to detect effects by selecting cognitive groups from extremes of the distribution of change in cognitive performance, the smaller sample size affects our power nonetheless. There is no conventional method for computing power in a multilevel model; however, for a parallel two-group independent t-test with 24 participants per group and alpha set to .05, power of 0.80 can detect approximately Cohen's d = 0.82 (see power curve plotted in Supplementary Figure 1). Therefore, the current study was powered to detect large effects for comparing DNAm measures between groups; we had low statistical power to explore group by time interactions on DNAm measures.
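As a rough check on the detectable effect size, the calculation can be approximated with the normal distribution. This sketch is an approximation, not the t-based calculation behind the reported value: it replaces the noncentral t distribution with standard normal quantiles, so it gives a slightly smaller d (about 0.81) than the approximately 0.82 stated above. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n_per_group, alpha=0.05, power=0.80):
    """Smallest Cohen's d detectable by a two-sided, two-group comparison
    with equal group sizes, using a normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = z.inv_cdf(power)          # quantile for the target power
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


d_24 = min_detectable_d(24)
```

The same function shows how quickly the detectable effect shrinks with sample size, for example by calling `min_detectable_d(100)`.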

Overview of DNAm clocks
The DNAm clock measures were developed using supervised machine learning techniques to derive algorithms that capture DNAm patterns that predict a dependent variable of interest, or a surrogate of "biological age". The dependent variables differ across the different types of clocks.

First-generation clocks
The first-generation clocks were trained to predict chronological age.
Hannum et al. [1] developed an epigenetic clock (71 CpGs) using whole blood samples from 656 individuals (426 Caucasian and 230 Hispanic) aged 19 to 101. The Hannum clock used in the current study does not include cell distribution data. However, for completeness, there is a version of the Hannum clock known as extrinsic epigenetic age acceleration (EEAA) that is a weighted average of Hannum's estimate with naïve and exhausted CD8 T cells and plasma blasts and adjusted for chronological age [2].
Horvath [3] developed a multi-tissue epigenetic clock (353 CpGs) from 8,000 samples (82 different datasets) representing people across the lifespan. The Horvath clock used in the current study does not include cell distribution data; there is a version of the Horvath clock defined as the residual resulting from regressing Horvath's DNAm age on chronological age and 7 blood cell types (naïve and exhausted CD8 T cells, plasma blasts, CD4 T cells, NK cells, monocytes, and granulocytes) and is known as intrinsic epigenetic age acceleration (IEAA) [4].

Second-generation clocks
The second-generation clocks were optimized for lifespan prediction. Levine et al. [5] proposed the "PhenoAge" clock, which was developed in two steps. First, using data from the National Health and Nutrition Examination Survey (9,926 people ages 20 and over), they developed a measure of "phenotypic age" by selecting from 42 blood-based clinical markers those that predicted mortality. Based on this analysis, 9 blood-based clinical markers (see table below) and chronological age were selected and combined into a phenotypic age estimate and validated in a new sample to predict all-cause mortality. In the second step, data from 465 participants aged 21-100 years in the Invecchiare in Chianti (InCHIANTI) study were used to regress phenotypic age on CpG sites. From this, the PhenoAge clock (513 CpGs) was developed, which strongly relates to all-cause mortality and aging-related morbidity [5].

Principal components (PC)-based clocks
Traditional epigenetic clocks use individual CpG sites as inputs to the epigenetic age algorithms, but individual CpGs are unreliable and noisy [9]. Therefore, Higgins-Chen et al. [10] proposed that principal components analysis (PCA) can be used to enhance the reliability of traditional epigenetic clocks by extracting shared systematic variation across CpG sites (principal components, PCs) and feeding those PCs into the elastic net regressions to predict chronological age or another health phenotype. Higgins-Chen et al. provide R code that has users project their own DNAm data onto the original PCA space, which then allows PC-based clock outcomes to be estimated from new data. PC-based clocks show agreement between technical replicates (the same sample measured twice) within 0 to 1.5 years and more stable trajectories in longitudinal studies [10]. PC-based clocks have been used in other published studies (e.g., [11]).
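The projection idea can be illustrated with a toy sketch. It is not the published R code: it assumes the usual PCA convention of centering new data by the training-set CpG means before multiplying by the loadings, and every name and number below (three CpGs, two PCs, the weights and intercept) is made up for illustration.

```python
def project_onto_pcs(beta_values, training_means, loadings):
    """Project one sample's CpG beta-values onto a previously fitted PCA space.

    loadings[i][k] is the loading of CpG i on principal component k.
    """
    centered = [b - m for b, m in zip(beta_values, training_means)]
    n_pcs = len(loadings[0])
    return [
        sum(centered[i] * loadings[i][k] for i in range(len(centered)))
        for k in range(n_pcs)
    ]


def pc_clock_estimate(pc_scores, weights, intercept):
    """Linear prediction from PC scores, as a fitted penalized regression would give."""
    return intercept + sum(s * w for s, w in zip(pc_scores, weights))


# Hypothetical inputs: 3 CpGs, 2 principal components.
new_sample = [0.6, 0.4, 0.9]
train_means = [0.5, 0.5, 0.8]
pc_loadings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pcs = project_onto_pcs(new_sample, train_means, pc_loadings)
estimate = pc_clock_estimate(pcs, weights=[10.0, 5.0], intercept=40.0)
```

Because each PC pools information across many CpGs, noise in any single site has less influence on the final estimate, which is the intuition behind the improved reliability described above.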