Research Paper Volume 9, Issue 3 pp 1055—1068

An epigenetic aging clock for dogs and wolves

Michael J. Thompson1,*, Bridgett vonHoldt2,*, Steve Horvath3,*, Matteo Pellegrini1,*

* Joint first or last authors

Received: February 6, 2016       Accepted: March 18, 2017       Published: March 28, 2017      

https://doi.org/10.18632/aging.101211

Copyright: © 2017 Thompson et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Several articles describe highly accurate age estimation methods based on human DNA-methylation data. It is not yet known whether similar epigenetic aging clocks can be developed based on blood methylation data from canids. Using Reduced Representation Bisulfite Sequencing, we assessed blood DNA-methylation data from 46 domesticated dogs (Canis familiaris) and 62 wild gray wolves (C. lupus). By regressing chronological age on the resulting CpGs, we defined highly accurate multivariate age estimators for dogs (based on 41 CpGs), wolves (67 CpGs), and both combined (115 CpGs). Age-related DNA methylation changes in canids implicate gene ontology categories similar to those observed in humans, suggesting an evolutionarily conserved mechanism underlying age-related DNA methylation in mammals.

Introduction

Technological breakthroughs surrounding genomic platforms have led to major insights about age related DNA methylation changes in humans [1-9]. In mammals, DNA methylation represents a form of genome modification that regulates gene expression by serving as a maintainable mark whose absence marks promoters and enhancers. During development, germline DNA methylation is erased but is established anew at the time of implantation [10]. Abnormal methylation changes that occur because of aging contribute to the functional decline of adult stem cells [11-13]. Even small changes of the epigenetic landscape can lead to robustly altered expression patterns, either directly by loss of regulatory control or indirectly, via additive effects, ultimately leading to transcriptional changes of the stem cells [14].

Several studies describe highly accurate age estimation methods based on combining the DNA methylation levels of multiple CpG dinucleotide markers [15-18]. We recently developed a multi-tissue epigenetic age estimation method (known as the epigenetic clock) that combines the DNA methylation levels of 353 epigenetic markers known as CpGs [17]. The weighted average of these 353 epigenetic markers gives rise to an estimate of tissue age (in units of years), which is referred to as "DNA methylation age" or as "epigenetic age". DNA methylation age is highly correlated (r=0.96) with chronological age across the entire lifespan [8,19,20]. We and others have shown that the human epigenetic clock relates to biological age (as opposed to simply being a correlate of chronological age), e.g. the DNA methylation age of blood is predictive of all-cause mortality even after adjusting for a variety of known risk factors [21-25]. Epigenetic age acceleration (i.e. the difference between epigenetic and chronological age) is associated with lung cancer [26], cognitive and physical functioning [27], Alzheimer's disease [28], centenarian status [25,29], Down syndrome [30], HIV infection [31], Huntington's disease [32], obesity [33], menopause [34], osteoarthritis [35], and Parkinson's disease [36]. Moreover, we have demonstrated the human epigenetic clock applies without change to chimpanzees [17] but it no longer applies to other animals due to lack of sequence conservation.

Many research questions and preclinical studies of anti-aging interventions will benefit from analogous epigenetic clocks in animals. To this end we sought to develop an accurate epigenetic clock for dogs and wolves. Dogs are increasingly recognized as a valuable model for aging studies [37,38]. Dogs are an attractive model in aging research because their lifespan (around 12 years) is intermediate between that of mice (2 years) and humans (80 years), thus serving as a more realistic model for human aging than most rodents. Dogs have already been adopted to model multiple human diseases in gene mapping studies (e.g. squamous cell carcinoma [39], bladder cancer [40]) and cancers are often the cause of age-related mortality in domestic dogs [41].

The maximum lifespan of dogs is known to correlate with the size of their breed [42-44]. Based on previous studies in humans [17], we expect that the age acceleration (difference between epigenetic age and chronological age) correlates with longevity. We hypothesize that dogs whose epigenetic age is larger than their chronological age are aging more quickly, while those with negative values are aging more slowly. Thus, we would expect to see a correlation between age acceleration and dog breed size.

We also sought to build an epigenetic clock for gray wolves because alternative age estimation methods have limitations. Gray wolf age estimates have traditionally been conducted through tooth wear patterns, cranial suture fusions, closure of the pulp cavity, and cementum annuli [45,46]. Based on tooth wear patterns, the age structure of a wolf pack is typically skewed towards younger animals (<1-4 years old), with few individuals >5 years of age [46,47]. Sexual maturity is reached between 10 months and 2 years of age [48,49]. In a wild social carnivore, group living often results in high mortality rates. Gray wolves live on average 6-8 years in natural populations, but can live up to 13+ years in captivity with increased reproductive success [45,46].

Results

Data set

We used Reduced Representation Bisulfite Sequencing to generate DNA methylation data of 46 domestic dogs (26 females, 20 males) and 62 gray wolves from Yellowstone National Park (26 females, 36 males). The age distribution of wolves is skewed towards younger animals (Dogs: mean=5 years, median=4, range=0.5-14; Wolves: mean=2.7, median=2, range=0.5-8), both because animals in natural populations die younger than domestic animals and because age estimates for wild specimens lack precision. Additionally, we included 729 humans (388 females, 341 males) with a large age range (mean=47.4, range=14-94).

Based on calculations and criteria described in the Methods section, we constructed a matrix of high confidence methylation levels across 108 canid blood samples. Previous work has shown that there are locus-specific significant methylation differences between dogs and wolves [50]. Here, however, we sought to identify a clock that correlated with age across both canid species; thus, we removed the methylation sites that showed species-specific divergence. This yielded a set of 252,240 CpG sites for our modeling efforts. Of these, 105,521 could be mapped to syntenic CpGs in the human genome (hg19) for functional annotation purposes. Further, a subset of 9,017 sites is measured by the human Illumina 450K array, which allowed us to test for conservation of age correlations between these evolutionarily divergent species (humans, dogs, and wolves).

From these input sets of tens to hundreds of thousands of CpGs, regression models were obtained using an algorithm (see Methods) that selects a much smaller number of CpGs by allowing regression coefficients to go to zero. As the space of possible models is combinatorially vast, there is no guarantee of global optimality of the resulting models, and there are likely a large number of models that would yield comparable results. Thus, we make no assertions of biological significance for the exact identity or number of CpGs in a given model used here.
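The way a penalized regression drives coefficients to exactly zero can be illustrated with a small coordinate-descent elastic net. This is only a sketch under stated assumptions: the objective convention, the penalty values, the function names and the synthetic "methylation" matrix are all hypothetical and are not taken from the Methods.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within +/- threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net(X, y, alpha=0.05, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for the (assumed) objective
    (1/2n)*||y - b0 - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered features
    resid = [yi - y_mean for yi in y]  # residual while all weights are zero
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Covariance of feature j with the residual that excludes its own fit.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            if delta:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                w[j] = new_w
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return intercept, w

# Synthetic example: only sites 0 and 1 carry signal, the rest are noise.
random.seed(0)
n_samples, n_sites = 60, 8
X = [[random.random() for _ in range(n_sites)] for _ in range(n_samples)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0, 0.1) for row in X]

intercept, weights = elastic_net(X, y, alpha=0.05)
kept = [j for j, wj in enumerate(weights) if wj != 0.0]
print("sites with non-zero coefficients:", kept)
```

The soft-threshold step is what sets weak coefficients to exactly zero, so the number of retained sites depends on the penalty rather than on any biological criterion, in line with the caution above.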

Conservation of age-correlated methylation between dogs and wolves

To initially gauge whether it might be possible to create a DNAm age clock for a multi-species group (i.e. canids), we looked at the conservation of age-correlated methylation in the two canid species. The global correlation between the age effects across the two species is small in magnitude (r=0.07, Fig. 1A) which could be due to the following reasons: i) it could reflect poor accuracy of the chronological age estimate in wolves, ii) it could reflect the relatively small sample size, iii) it could reflect that wolves tended to be younger than dogs in our study, i.e. the chronological age distributions differed.
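As a rough sketch of this comparison, one can compute a per-CpG correlation between age and methylation within each species and then correlate those values across the sites shared by both. The figure reports a normalized correlation (z) whose exact definition is not given in this section, so the sketch below uses raw Pearson correlations instead; the site names and methylation values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def age_correlations(ages, meth_by_site):
    """Map each CpG site to the correlation between age and its methylation."""
    return {site: pearson(ages, levels) for site, levels in meth_by_site.items()}

def conservation(ages_a, meth_a, ages_b, meth_b):
    """Correlate the per-site age effects of two species over their shared sites."""
    r_a = age_correlations(ages_a, meth_a)
    r_b = age_correlations(ages_b, meth_b)
    shared = sorted(set(r_a) & set(r_b))
    return pearson([r_a[s] for s in shared], [r_b[s] for s in shared])

# Hypothetical toy data: one list of methylation levels per site, one value per animal.
ages_a = [1, 2, 3, 4, 5]
meth_a = {
    "cpg1": [0.10, 0.15, 0.22, 0.30, 0.33],  # gains methylation with age
    "cpg2": [0.90, 0.85, 0.70, 0.66, 0.60],  # loses methylation with age
    "cpg3": [0.50, 0.52, 0.49, 0.51, 0.50],  # roughly flat
}
ages_b = [0.5, 1, 2, 4, 6]
meth_b = {
    "cpg1": [0.12, 0.14, 0.20, 0.27, 0.35],
    "cpg2": [0.88, 0.80, 0.79, 0.70, 0.61],
    "cpg3": [0.40, 0.45, 0.41, 0.39, 0.44],
    "cpg4": [0.30, 0.31, 0.29, 0.33, 0.30],  # not measured in species A, so ignored
}
print("conservation of age effects:", round(conservation(ages_a, meth_a, ages_b, meth_b), 2))
```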

Figure 1. Conservation of epigenetic aging. Normalized correlation (z) between age and DNA methylation for CpG sites in one species versus the same correlation computed at syntenic CpG sites in another species. The species comparisons are shown, as follows: (A) Wolves versus Dogs, (B) Human versus Canid (pooled dogs and wolves), (C) Human versus Dogs, and (D) Human versus Wolves.

Conservation of age-correlated methylation between canid species and human

To test for more distant evolutionary conservation of age effects on DNA methylation between canids and humans, we computed age correlations over a set of 729 human blood methylation array samples [6] and examined syntenic locations between the canine (canFam3) and human (hg19) genomes as described in Methods. While the subset of measured DNA-methylation sites common to all 3 species is relatively small (~9000 CpGs), we see that the conservation of age-correlation between “canids” (pooled samples of dogs and wolves) and human is statistically significant, though small in magnitude (r=0.20, p=1×10^-81, Fig. 1B). This conservation holds for dogs alone (r=0.20, p=6×10^-85) but is weaker for wolves alone (r=0.11, p=1×10^-25, Fig. 1C, 1D).

The statistically significant correlation between dogs and humans is remarkable because the two data sets were generated on different platforms (RRBS versus the Illumina 450K array).

Leave one out estimate of the accuracy of the canid epigenetic clock

DNAm age (also referred to as epigenetic age) was calculated for each sample by fitting an elastic net regression to the methylation profiles of all other samples and predicting the age of the sample of interest. In the course of our work, we found that pre-selecting subsets of CpGs was helpful and computationally expedient. This was done by computing correlations between methylation and age and taking only those with absolute correlation above 0.3. These pre-selection steps were also performed in a leave-one-out manner for all cross-validated results presented here. Because ages were log-transformed prior to regression, predictions (in years) were obtained by taking the exponential of the output of the epigenetic aging model. We see a strong linear relationship between DNAm age and true age for our 108 canid samples (Fig. 2A). The correlation between predicted and actual ages using leave-one-out cross-validation was 0.8 and the median absolute error was 0.8 (years). The average number of CpGs in the 108 individual regression models was 122.3.
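The structure of this procedure (hold out one sample, pre-select CpGs on the remaining samples only, fit on log-transformed age, exponentiate the prediction) can be sketched as follows. To keep the example dependency-free, the elastic net is swapped for a deliberately simple stand-in that averages single-CpG least-squares fits; this stand-in, the function names and the synthetic data are assumptions for illustration, not the authors' implementation.

```python
import math
import random
import statistics

def pearson(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_and_predict(train_X, train_log_age, test_x, threshold=0.3):
    """Stand-in model (not an elastic net): pre-select CpGs with |r| > threshold
    on the training samples only, then average their univariate predictions."""
    my = statistics.mean(train_log_age)
    predictions = []
    for j in range(len(test_x)):
        column = [row[j] for row in train_X]
        if max(column) == min(column):
            continue  # constant site, correlation undefined
        if abs(pearson(column, train_log_age)) <= threshold:
            continue  # the held-out sample never influences this selection
        mx = statistics.mean(column)
        slope = (sum((c - mx) * (t - my) for c, t in zip(column, train_log_age))
                 / sum((c - mx) ** 2 for c in column))
        predictions.append(my + slope * (test_x[j] - mx))
    return statistics.mean(predictions) if predictions else my

def leave_one_out_ages(X, ages):
    """Predict each sample's age (in years) from a model fit to all other samples."""
    log_ages = [math.log(a) for a in ages]
    predicted = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = log_ages[:i] + log_ages[i + 1:]
        # The model works on log(age), so exponentiate to get back to years.
        predicted.append(math.exp(fit_and_predict(train_X, train_y, X[i])))
    return predicted

# Synthetic data: two age-related sites and two noise sites.
random.seed(1)
ages = [random.uniform(0.5, 14.0) for _ in range(20)]
X = []
for age in ages:
    la = math.log(age)
    X.append([
        0.40 + 0.10 * la + random.gauss(0, 0.01),  # gains methylation with age
        0.60 - 0.08 * la + random.gauss(0, 0.01),  # loses methylation with age
        random.uniform(0.2, 0.8),
        random.uniform(0.2, 0.8),
    ])

predicted = leave_one_out_ages(X, ages)
print("correlation:", round(pearson(predicted, ages), 2))
print("median absolute error (years):",
      round(statistics.median(abs(p - a) for p, a in zip(predicted, ages)), 2))
```

Running the pre-selection inside the loop matters: selecting CpGs once on all samples would let the held-out animal influence its own prediction and inflate the apparent accuracy.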

Figure 2. Accuracy of canid age clock. DNA methylation age (y-axis) versus chronological age (x-axis) for all canid samples (green = dog, blue = wolf). (A) Results obtained using a leave-one-out cross validation over all 108 samples. (B) Results obtained in each species separately using a leave-one-out cross validation. (C) Results obtained by regressing on all samples in one species and predicting age on samples from the other species. (D) Final models for each grouping of samples.

To examine the effects of pooling two species of canids, we performed the same prediction (DNAm age calculation) procedure on dogs and wolves, separately. We find that the performance of these models is lower than the canid model, with dogs showing a correlation of r=0.65 and wolves r=0.54 (Fig. 2B). The average numbers of CpGs in the dog-only and wolf-only models were 58.5 and 62.9, respectively. These models, on average, contain fewer CpGs than the combined canid models as the smaller number of samples in each subset provides less statistical support for the regression algorithm.

As another means of assessing the robustness of a multi-species clock, we built one clock for each species using all samples in that species and then applied it to all samples in the other species. These clocks have similar correlation to the dog-only or wolf-only clocks, close to 0.6, utilizing a single regression model with 41 and 67 CpGs for the dog and wolf models, respectively (Fig. 2C).

Final epigenetic aging clocks based on all animals

To determine the accuracy of our final models, we fit the penalized elastic net regression to the set of dogs (41 CpGs), wolves (67 CpGs), and then both combined (115 CpGs) (Fig. 2D). The penalized regression routine (“elastic net”) utilizes an internal cross-validation to select the optimal penalty parameter. While the entire set of canids and the subset of domesticated dogs could be fit exactly (r=1.0), the wolf data alone were slightly less amenable.

Age acceleration as a function of dog size

With the largest variation in size among terrestrial vertebrates, the domestic dog not only spends most of its life in an environment and lifestyle like its human companions, but also displays a high similarity of analogues to human disease [51,52]. Though dog breeds are diverse in nearly every aspect, smaller breeds are known to live longer than larger breeds [42-44]. Recent genomic surveys have identified nine loci linked to canine size determination, with seven of these loci supporting growth, cellular proliferation, and metabolism [53]. Of these, the growth hormone IGF1 has not only been of historic interest as a causative locus controlling body size in mice [54-56], but also has the most significant association with body size [57,58].

We found a correlation of 0.25 between age acceleration and breed weight (Fig. 3). Given the limited sample size for dogs (n = 46) we did not reach a significance below the standard threshold of 0.05. However, we expect that a study with a larger cohort might have sufficient power to show that these trends are in fact significant.
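The significance statement can be checked with a back-of-the-envelope calculation. The sketch below assumes a Pearson correlation and a two-sided test using the Fisher z approximation; the text does not state which test was actually used, so this is an illustration rather than a reproduction.

```python
import math

def fisher_z_pvalue(r, n):
    """Approximate two-sided p-value for H0: rho = 0, via the Fisher z-transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # 2 * (1 - Phi(|z|)) expressed with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))

def smallest_significant_n(r, alpha=0.05):
    """Smallest sample size at which an observed correlation r would reach p < alpha."""
    n = 4
    while fisher_z_pvalue(r, n) >= alpha:
        n += 1
    return n

p = fisher_z_pvalue(0.25, 46)
print(f"r=0.25, n=46: p = {p:.3f}")
print("n needed for the same r to reach p < 0.05:", smallest_significant_n(0.25))
```

Under these assumptions the p-value for r=0.25 with 46 dogs is roughly 0.09, consistent with the result not reaching the 0.05 threshold. The second function only asks how many dogs would be needed if the observed correlation stayed at 0.25; it is not a formal power analysis.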

Figure 3. Age acceleration and dog breed. Age acceleration (difference between predicted epigenetic age and actual chronological age) is plotted against the maximum weight for the breed of each dog sample.

Functional significance of DNAm age sites

As described in Methods, mapping of canid CpGs to the human genome yielded 105,521 sites. We utilized this entire set as “background” and selected subsets of CpGs based on the statistical significance of their correlation with age as “foreground”. These subsets are not meant to correspond exactly to any of the particular regression models, but to capture the general association of age-related CpGs (from which the regression models are drawn) and biological function inferred via proximity of the CpGs to known genes.

We also partitioned the CpGs into groups with positive (gain of methylation) or negative (loss of methylation) correlation with age, as these two groups have been noted to correspond to separate classes of biomolecular function in previous work [17,59]. As negatively correlated sites generally partition to distal parts of gene bodies or inter-genic regions, they tend to have limited annotation. Conversely, positively correlated sites localize to promoter regions of genes for which there is generally more detailed annotation. To ensure the selection of statistically significant age-related CpGs, we performed a multiple-testing correction [60] on the p-values and selected only those with adjusted values <= 0.05. The annotation tool (GREAT) accesses a large and diverse number of databases and function ontologies. Here, we report those results edited down to non-redundant highlights. We found that a subset of 91 negatively-correlated CpGs (0.1% of total) localized to 125 genes that function in cellular organization and the Notch pathway, an evolutionarily conserved cell-to-cell signaling pathway important for cell proliferation and differentiation (Table 1A). The subset of 90 positively-correlated CpGs (0.1% of total) localized to 71 genes with vital roles in embryonic organismal development and chromatin states (Table 1B). In summary, the canid genes whose DNA-methylation changes are most strongly correlated to age (both negatively and positively) are critical developmental genes, namely those that determine cell fate and organ development in the embryonic stage of life, as has been noted in previous work with DNA-methylation in humans [17,59].
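The selection step can be sketched as below. The section does not name the correction cited as [60]; the sketch assumes a Benjamini-Hochberg false discovery rate adjustment, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical p-values for the age correlation of five CpGs.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
print("adjusted:", [round(q, 4) for q in adjusted])
print("CpGs kept as foreground:", significant)
```

Note that two of the made-up p-values are below 0.05 before adjustment but not after it, which is the situation the correction is meant to guard against when many thousands of CpGs are tested.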

Table 1. Functional enrichment studies of age related CpGs in canids.

| Functional annotation | Hypergeometric FDR Q-value |
|---|---|
| A. Functional roles of CpGs that lose methylation with age | |
| compartment pattern specification | 2.8×10^-4 |
| proximal tubule development | 4.3×10^-4 |
| carbohydrate derivative transport | 8.8×10^-4 |
| Notch signaling pathway | 1.3×10^-3 |
| B. Functional roles of CpGs that gain methylation with age | |
| regulation of transcription, DNA-dependent | 1.6×10^-11 |
| regulation of RNA biosynthetic process | 8.8×10^-12 |
| organ development | 1.0×10^-11 |
| embryonic organ morphogenesis | 1.8×10^-10 |
| anatomical structure development | 8.3×10^-10 |
| Set 'Suz12 targets': genes identified by ChIP on chip as targets of the Polycomb protein SUZ12 in human embryonic stem cells. | 2.6×10^-10 |
| Genes with high-CpG-density promoters (HCP) bearing the H3K27 tri-methylation (H3K27me3) mark in brain. | 6.9×10^-9 |
| Genes with high-CpG-density promoters (HCP) bearing histone H3 trimethylation mark at K27 (H3K27me3) in neural progenitor cells (NPC). | 1.5×10^-8 |

Discussion

More broadly, our study demonstrates that DNA-methylation correlates with age in dogs and wolves as it does in humans and related species. This age-dependence of DNA-methylation is conserved at syntenic sites in the respective genomes of these canid species as well as in more distantly related mammalian genomes such as human. Strikingly, the age associations of syntenic CpGs are significantly conserved (r=0.20) even though the data were generated on different platforms (RRBS vs Illumina methylation array). Overall, our study demonstrates that dogs age in a similar fashion to humans when it comes to DNA methylation changes.

Race/ethnicity and sex have a significant effect on the epigenetic age of blood in humans [61]. Further, genetic loci have been found that affect epigenetic aging rates in humans [62]. It will be interesting to determine whether sex effects can also be observed in dogs and whether genetic background relates to the ticking rate of the canid clock. Based on our preliminary blood samples of 108 canid specimens, including both dogs and wolves, we accurately measured the methylation status of several hundred thousand CpGs. We demonstrate that these data can produce highly accurate age estimation methods (epigenetic clocks) for dogs and wolves separately. By first removing sites that were variable between dogs and wolves, we could also establish a highly accurate epigenetic clock for all canids (i.e. dogs and wolves combined). This clock allows us to estimate the age of half the canids to within a year.

Our study has several limitations, including the following. First, the sample size was relatively low (n=108). There is no doubt that more accurate clocks could be built based on larger sample sizes. Second, we only focused on blood tissue. Future studies could explore other sources of DNA such as buccal swabs. Third, the chronological ages of the wolves are probably not very accurate since they were estimated by the investigators.

In human studies, we have found that lifestyle factors (e.g. diet) have at best a weak effect on cell-intrinsic epigenetic aging rates measured by the 353 CpG based clock [63]. By contrast, extrinsic measures of epigenetic age acceleration, which also capture age related changes in blood cell composition, relate to lifestyle factors that are known to be protective in humans (e.g. consumption of fish and vegetables, moderate alcohol intake, and higher levels of education). Biomarkers of metabolic syndrome were associated with increased DNAm age but we could not detect a protective effect of metformin in this observational study [63]. The presented canid aging clocks open up the possibility of assessing dietary and pharmacological intervention on canid aging. The genome coordinates for the CpGs and corresponding regression coefficients of our final canid age estimator and of our dog age estimator can be found in Table 2 and Table 3, respectively.

Table 2. Multivariate model of canid age.

| Canine coordinate (canFam3) | Coef | Mean meth | Corr(age,meth) | Human coordinate (hg19) | Proximal genes |
|---|---|---|---|---|---|
| Intercept term | 4.382 | | | | |
| chr1: 815007 | -0.405 | 0.95 | -0.27 | chr18: 77637184 | KCNG2 (+13517), PQLC1 (+74479) |
| chr1: 48720985 | 0.5191 | 0.95 | 0.31 | | |
| chr1: 49472858 | 0.2594 | 0.64 | 0.28 | | |
| chr1: 90590933 | -0.0041 | 0.94 | -0.32 | chr9: 1872401 | Intergenic |
| chr1: 98573761 | -0.0837 | 0.6 | -0.42 | | |
| chr1: 98573781 | -0.1991 | 0.2 | -0.37 | | |
| chr1: 101051499 | -0.2786 | 0.79 | -0.4 | chr19: 57398441 | ZIM2 (-46345) |
| chr1: 108136920 | 0.933 | 0.98 | 0.22 | chr19: 48626542 | PLA2G4C (-12469), LIG1 (+47317) |
| chr1: 117122008 | -0.2091 | 0.82 | -0.33 | chr19: 36035408 | TMEM147 (-1088) |
| chr1: 117495962 | -0.1664 | 0.86 | -0.41 | chr19: 35540744 | FXYD3 (-66421), HPN (+9335) |
| chr1: 121791246 | -0.3188 | 0.67 | -0.38 | chr19: 30153492 | PLEKHF1 (-2470) |
| chr1: 121796139 | 0.3134 | 0.96 | 0.28 | | |
| chr1: 121864927 | 0.3367 | 0.83 | 0.3 | chr19: 30042558 | POP4 (-52365), VSTM2B (+25153) |
| chr2: 10101121 | -0.0266 | 0.65 | -0.29 | | |
| chr2: 30853505 | -0.0048 | 0.93 | -0.28 | chr10: 4714389 | Intergenic |
| chr2: 36347652 | 0.0248 | 0.37 | 0.36 | chr5: 140749805 | PCDHGA6 (-3845), PCDHGB3 (-25) |
| chr2: 71080824 | 0.0288 | 0.86 | 0.33 | chr1: 30051475 | Intergenic |
| chr2: 82210243 | -0.387 | 0.94 | -0.22 | chr1: 15602565 | Intergenic |
| chr2: 84377388 | -0.4829 | 0.97 | -0.36 | chr1: 11951757 | NPPB (-32770), KIAA2013 (+34722) |
| chr2: 84445018 | 0.0924 | 0.26 | 0.33 | chr1: 11864680 | CLCN6 (-1587), MTHFR (-1379) |
| chr3: 1128258 | -0.0993 | 0.94 | -0.37 | | |
| chr3: 51442070 | -0.0641 | 0.8 | -0.3 | chr15: 88733456 | NTRK3^d (+66204) |
| chr3: 60468935 | -0.2449 | 0.82 | -0.43 | chr4: 8834358 | HMX1 (+39184) |
| chr3: 62880832 | 0.4931 | 0.88 | 0.25 | chr4: 17638199 | MED28 (+21946) |
| chr3: 84450199 | -0.0696 | 0.89 | -0.49 | chr4: 25978965 | SMIM20 (+63140) |
| chr4: 28034141 | -0.6356 | 0.14 | -0.33 | chr10: 79971431 | Intergenic |
| chr4: 28162022 | -0.1129 | 0.89 | -0.32 | chr10: 80116134 | Intergenic |
| chr4: 28489863 | 0.0289 | 0.7 | 0.24 | chr10: 80479452 | Intergenic |
| chr4: 79153238 | 0.1058 | 0.09 | 0.42 | chr5: 27038840 | CDH9 (-148) |
| chr5: 4750111 | -0.089 | 0.35 | -0.5 | chr11: 129969307 | ST14 (-60149), APLP2 (+29507) |
| chr5: 13918996 | -0.3157 | 0.27 | -0.34 | chr11: 119933419 | TRIM29 (+75741) |
| chr5: 19204758 | -0.4101 | 0.33 | -0.57 | chr11: 114000061 | ZBTB16^a,b,c,d (+69747) |
| chr5: 19204778 | -0.0691 | 0.25 | -0.37 | chr11: 114000041 | ZBTB16^a,b,c,d (+69727) |
| chr5: 32474202 | -0.0084 | 0.09 | -0.31 | chr17: 7453106 | TNFSF12 (+636), TNFSF12-TNFSF13 (+691) |
| chr5: 32946701 | 0.0874 | 0.62 | 0.61 | chr17: 8027247 | ALOXE3^b,d (-4883), HES7^a,b,c,d (+154) |
| chr5: 32947926 | 0.172 | 0.71 | 0.6 | chr17: 8028384 | HES7^a,b,c,d (-983) |
| chr5: 48883359 | 0.3308 | 0.5 | 0.53 | chr1: 61517619 | NFIA^b,d (-29914) |
| chr5: 57976813 | -0.1742 | 0.8 | -0.35 | chr1: 3310290 | ARHGEF16 (-60699) |
| chr6: 24353251 | -0.3036 | 0.94 | -0.31 | | |
| chr7: 752676 | -0.5779 | 0.9 | -0.4 | chr1: 202263534 | UBE2T (+47573) |
| chr7: 3691174 | -0.0377 | 0.87 | -0.39 | chr1: 199252083 | Intergenic |
| chr7: 16651501 | -0.2448 | 0.9 | -0.46 | chr1: 183251643 | LAMC2 (+96221) |
| chr7: 43318848 | -0.3665 | 0.46 | -0.27 | chr1: 153754738 | SLC27A3 (+6971) |
| chr7: 79793670 | -0.0914 | 0.72 | -0.31 | chr18: 46369695 | Intergenic |
| chr8: 49880796 | -0.0078 | 0.62 | -0.25 | chr14: 77512098 | IRF2BPL (-17065), ZDHHC22 (+96035) |
| chr8: 62064006 | -0.1501 | 0.37 | -0.49 | chr14: 91654259 | GPR68 (+66009), C14orf159 (+73546) |
| chr8: 68929216 | -0.0404 | 0.91 | -0.36 | chr14: 101158024 | DLK1^a,b (-35139) |
| chr9: 2907494 | -0.2042 | 0.89 | -0.41 | | |
| chr9: 20510949 | -0.2444 | 0.2 | -0.46 | chr17: 40575423 | PTRF (+111) |
| chr9: 24869027 | 0.0726 | 0.68 | 0.53 | chr17: 46683610 | HOXB6^a,b,c,d (-1257) |
| chr9: 46993125 | 0.8863 | 0.95 | 0.38 | | |
| chr9: 58841193 | 0.0605 | 0.33 | 0.43 | chr9: 126779234 | LHX2^a,b,c,d (+5346) |
| chr10: 1318284 | -0.5572 | 0.88 | -0.31 | chr12: 57582542 | NXPH4^a,b (-28035), LRP1 (+60261) |
| chr10: 47398772 | 0.1821 | 0.25 | 0.3 | chr2: 45231989 | SIX2^a,b,c,d (+4579), SIX3^a,b,c,d (+63088) |
| chr10: 55453590 | -0.3219 | 0.95 | -0.52 | chr2: 54776879 | SPTBN1 (+93458) |
| chr10: 69070602 | -0.0518 | 0.91 | -0.41 | chr2: 70910505 | ADD2 (+84803) |
| chr11: 54591790 | -0.1652 | 0.78 | -0.35 | chr9: 38412041 | IGFBPL1 (+12402), ALDH1B1 (+19381) |
| chr11: 56812470 | 0.2722 | 0.84 | 0.49 | chr9: 102590004 | NR4A3^a,b,c,d (+996) |
| chr12: 6649423 | -0.0003 | 0.87 | -0.36 | chr6: 37591955 | MDGA1^a (+73810) |
| chr13: 30162882 | -0.1239 | 0.92 | -0.33 | chr8: 134871127 | Intergenic |
| chr13: 44072687 | -0.095 | 0.81 | -0.31 | chr4: 48239824 | TEC (+32056) |
| chr14: 13905355 | 0.1845 | 0.45 | 0.38 | | |
| chr14: 41413362 | -0.2829 | 0.57 | -0.33 | chr7: 28355716 | CREB5 (-96427) |
| chr14: 59995975 | -0.0142 | 0.73 | -0.34 | chr7: 121776852 | AASS (-2977) |
| chr15: 17780647 | 1.3988 | 0.97 | 0.29 | chr14: 20915434 | TEP1 (-33855), OSGEP (+7829) |
| chr15: 17785631 | -0.3897 | 0.94 | -0.43 | chr14: 20921454 | APEX1 (-1899), OSGEP (+1809) |
| chr16: 131577 | -0.0629 | 0.73 | -0.31 | | |
| chr16: 247019 | 0.8011 | 0.83 | 0.34 | | |
| chr17: 18033866 | 0.2802 | 0.82 | 0.32 | chr2: 23704553 | Intergenic |
| chr18: 1791242 | -0.0118 | 0.94 | -0.35 | chr7: 50515762 | FIGNL1 (+1659) |
| chr18: 25850449 | -0.2607 | 0.92 | -0.34 | | |
| chr18: 33813035 | -0.1549 | 0.78 | -0.41 | chr11: 33962891 | LMO2 (-49056) |
| chr18: 43740411 | -0.1646 | 0.94 | -0.34 | chr11: 45669463 | CHST1 (+17708) |
| chr18: 48905778 | 0.0765 | 0.72 | 0.41 | chr11: 68925723 | Intergenic |
| chr18: 49633631 | -0.0336 | 0.49 | -0.22 | chr11: 67984189 | SUV420H1 (-3308) |
| chr18: 53920336 | -0.0957 | 0.8 | -0.4 | | |
| chr20: 44455198 | 0.3338 | 0.72 | 0.46 | | |
| chr20: 49366316 | -0.0004 | 0.71 | -0.28 | chr19: 12895268 | HOOK2 (-8932), JUNB (-7041) |
| chr21: 23088752 | -0.4109 | 0.88 | -0.28 | chr11: 75219103 | GDPD5 (+17844) |
| chr21: 47917499 | -0.0235 | 0.84 | -0.29 | chr11: 27349807 | Intergenic |
| chr22: 56299850 | -0.2509 | 0.89 | -0.32 | chr13: 108022629 | Intergenic |
| chr23: 24782165 | -0.1049 | 0.89 | -0.33 | chr3: 18277027 | Intergenic |
| chr23: 24782179 | -0.658 | 0.94 | -0.31 | chr3: 18277013 | Intergenic |
| chr24: 42551119 | -0.2892 | 0.93 | -0.35 | chr20: 56148739 | PCK1 (+12604), ZBP1 (+46789) |
| chr24: 45589901 | 0.2135 | 0.97 | 0.27 | chr20: 59877087 | CDH4^b (+49606) |
| chr26: 220859 | 0.0037 | 0.98 | 0.27 | | |
| chr26: 5991914 | -0.2154 | 0.93 | -0.37 | chr12: 124138408 | TCTN2 (-17251), GTF2H3 (+20033) |
| chr26: 11457252 | -0.3679 | 0.94 | -0.28 | chr12: 114784708 | TBX5^a,b,c,d (+61538) |
| chr26: 37645878 | -0.4125 | 0.93 | -0.3 | | |
| chr27: 1189935 | -0.0497 | 0.78 | -0.39 | chr12: 54471815 | HOXC4^a,b,c,d (+24155) |
| chr27: 2886690 | -0.0004 | 0.86 | -0.44 | chr12: 52559286 | KRT80 (+26497), C12orf44 (+95532) |
| chr27: 45394279 | -0.4689 | 0.93 | -0.45 | | |
| chr28: 23823079 | -0.2956 | 0.91 | -0.31 | | |
| chr28: 40564054 | -0.0603 | 0.96 | -0.33 | chr10: 134593678 | NKX6-2^a,b,c,d (+5877) |
| chr30: 15275091 | -0.0557 | 0.89 | -0.31 | | |
| chr30: 27934524 | -0.0484 | 0.93 | -0.26 | chr15: 63648005 | CA12 (+26354), APH1B (+78253) |
| chr30: 38620897 | 0.8567 | 0.94 | 0.33 | chr15: 78043186 | Intergenic |
| chr31: 27720671 | 0.0332 | 0.16 | 0.46 | chr21: 34444104 | OLIG1^a,d (+1655) |
| chr31: 36955453 | -0.5267 | 0.91 | -0.36 | chr21: 44079991 | PDE9A (+6127) |
| chr31: 37492782 | -0.4942 | 0.49 | -0.56 | | |
| chr32: 1431916 | -0.2835 | 0.07 | -0.3 | chr4: 77752402 | Intergenic |
| chr32: 38110814 | -0.076 | 0.87 | -0.35 | | |
| chr33: 22992599 | -0.0003 | 0.93 | -0.34 | chr3: 119042586 | ARHGAP31 (+29367) |
| chr33: 25783582 | 0.3819 | 0.98 | 0.34 | chr3: 122422615 | PARP14 (+23151), HSPBAP1 (+90055) |
| chr33: 31142995 | -0.8311 | 0.95 | -0.3 | chr3: 194291430 | ATP13A3 (-72338), TMEM44 (+62719) |
| chr34: 40858941 | -0.1582 | 0.93 | -0.38 | chr3: 177096996 | Intergenic |
| chr35: 2307155 | -0.1171 | 0.94 | -0.32 | chr6: 1886203 | Intergenic |
| chr36: 2545193 | 0.2403 | 0.84 | 0.47 | chr2: 157179898 | NR4A2 (+9329) |
| chr36: 19969591 | 0.0515 | 0.27 | 0.38 | chr2: 177025691 | HOXD1^a,b,c,d (-27615), HOXD4^a,b,c,d (+9742) |
| chr37: 6301 | -0.1058 | 0.69 | -0.43 | | |
| chr37: 25454687 | 0.4555 | 0.32 | 0.33 | chr2: 219736500 | WNT10A^a,b,c,d (-8584), WNT6^a,b,c,d (+11957) |
| chr38: 16230281 | 0.1055 | 0.92 | 0.23 | chr1: 221912099 | DUSP10 (+3418) |
| chr38: 22365525 | -0.1451 | 0.89 | -0.36 | chr1: 159724037 | CRP (-39659), DUSP23 (-26755) |
| chr38: 22792877 | 0.7599 | 0.72 | 0.43 | chr1: 159145579 | DARC (-29621), CADM3^d (+4181) |
| chrX: 80013740 | -0.133 | 0.85 | -0.29 | | |
Genome coordinates and coefficient values for predicting a log (base e) transformed version of chronological age. These coefficients were found by regressing a log-transformed version of age on the RRBS DNA-methylation measured from 108 canid blood samples. Since chronological age was log-transformed prior to regression, it is important to exponentiate the age estimate from this model to arrive at age estimates in units of years. We provide the mean methylation and Pearson correlation with age for each individual CpG. Where possible, we identify, via synteny to the human genome, genes that are proximal to the CpGs in our models. Numbers in parentheses are the distance in bases to the Transcription Start Site of the gene. Additionally, we note those genes with experimentally inferred relevance to cellular identity (pluripotency).
Genes experimentally identified as targets of pluripotency factors and the Polycomb repressor complex [69,70]
a genes identified by ChIP on chip as targets of the Polycomb protein EED in human embryonic stem cells.
b genes possessing the trimethylated H3K27 (H3K27me3) mark in their promoters in human embryonic stem cells, as identified by ChIP on chip
c Polycomb Repression Complex 2 (PRC) targets; identified by ChIP on chip on human embryonic stem cells as genes that: possess the trimethylated H3K27 mark in their promoters and are bound by SUZ12 and EED Polycomb proteins
d genes identified by ChIP on chip as targets of the Polycomb protein SUZ12 in human embryonic stem cells

Table 3. Multivariate model of domesticated dog age.

| Canine coordinate (canFam3) | Coef | Mean meth | Corr(age,meth) | Human coordinate (hg19) | Proximal genes |
|---|---|---|---|---|---|
| Intercept term | -6.9009 | | | | |
| chr1: 23851533 | -0.1843 | 0.95 | -0.47 | chr7: 158164956 | Intergenic |
| chr1: 98804509 | -0.1771 | 0.92 | -0.56 | chr9: 95371248 | ECM2 (-72912), IPPK (+61298) |
| chr2: 34467253 | -0.1762 | 0.96 | -0.52 | chr10: 323319 | Intergenic |
| chr2: 50165769 | 1.8403 | 0.97 | 0.56 | chr5: 63460330 | RNF180 (-1378) |
| chr3: 54128482 | 0.1444 | 0.77 | 0.51 | chr15: 85429683 | PDE8A^a,b (-93987), SLC28A1 (+1771) |
| chr4: 28320267 | -0.4038 | 0.94 | -0.43 | chr10: 80292869 | Intergenic |
| chr5: 19204758 | -0.5393 | 0.31 | -0.62 | chr11: 114000061 | ZBTB16^a,b,c,d (+69747) |
| chr5: 32946701 | 0.1414 | 0.63 | 0.67 | chr17: 8027247 | ALOXE3^b,d (-4883), HES7^a,b,c,d (+154) |
| chr5: 57889544 | 0.1104 | 0.88 | 0.45 | chr1: 3202081 | Intergenic |
| chr6: 31347568 | -0.0245 | 0.64 | -0.5 | chr16: 11536754 | ENSG00000188897 (+80689), RMI2 (+97467) |
| chr6: 77030251 | 0.5807 | 0.82 | 0.6 | chr1: 68732333 | WLS^a,d (-34106) |
| chr7: 54060517 | 0.3047 | 0.06 | 0.48 | chr18: 33708261 | ELP2 (-1599), SLC39A6 (+1019) |
| chr8: 50434546 | 0.0351 | 0.91 | 0.42 | chr14: 78126349 | SPTLC2 (-43234), ALKBH1 (+48013) |
| chr9: 58841067 | 0.318 | 0.15 | 0.61 | chr9: 126779366 | LHX2^a,b,c,d (+5478) |
| chr10: 21355951 | 0.3876 | 0.86 | 0.52 | | |
| chr10: 55453590 | -0.4182 | 0.95 | -0.75 | chr2: 54776879 | SPTBN1 (+93458) |
| chr10: 56694481 | 0.4854 | 0.09 | 0.46 | chr2: 56151248 | EFEMP1 (+25) |
| chr10: 62832468 | 0.2077 | 0.77 | 0.6 | chr2: 63279783 | OTX1^a,b,c,d (+1847) |
| chr10: 62832512 | 0.2347 | 0.58 | 0.67 | chr2: 63279827 | OTX1^a,b,c,d (+1891) |
| chr11: 4422863 | 0.7296 | 0.91 | 0.44 | chr5: 113831979 | Intergenic |
| chr11: 56812470 | 0.0579 | 0.87 | 0.47 | chr9: 102590004 | NR4A3^a,b,c,d (+996) |
| chr11: 68812130 | 1.204 | 0.95 | 0.5 | chr9: 117441733 | C9orf91 (+68248) |
| chr12: 67228192 | 1.4284 | 0.98 | 0.39 | chr6: 110931985 | Intergenic |
| chr12: 67482078 | -0.2241 | 0.19 | -0.53 | chr6: 111267687 | GTF3C6 (-12075), AMD1 (+71715) |
| chr12: 67482081 | -0.0314 | 0.26 | -0.48 | chr6: 111267690 | GTF3C6 (-12072), AMD1 (+71718) |
| chr12: 71716109 | 0.0065 | 0.82 | 0.34 | | |
| chr14: 8324605 | 0.3087 | 0.65 | 0.59 | chr7: 127670876 | LRRC4 (+246) |
| chr14: 34373167 | 0.2091 | 0.76 | 0.64 | chr7: 20371740 | ITGB8 (+995) |
| chr18: 36757482 | -0.403 | 0.83 | -0.63 | chr11: 30565405 | MPPED2 (+36637) |
| chr19: 22964392 | -0.0157 | 0.89 | -0.54 | chr2: 128408298 | GPR17 (+4860), LIMS2 (+13821) |
| chr20: 43010636 | 0.4045 | 0.97 | 0.5 | | |
| chr20: 44455198 | 0.0664 | 0.72 | 0.47 | | |
| chr20: 56821233 | 1.2199 | 0.98 | 0.42 | chr19: 2210771 | SF3A2^a (-25748), DOT1L (+46624) |
| chr22: 49950553 | 0.1423 | 0.54 | 0.63 | chr13: 100636088 | ZIC2^a (+2063) |
| chr24: 37978456 | -0.0979 | 0.77 | -0.59 | | |
| chr28: 39644410 | 0.0856 | 0.84 | 0.37 | | |
| chr30: 38620897 | 1.4441 | 0.94 | 0.51 | chr15: 78043186 | Intergenic |
| chr31: 37492782 | -0.0028 | 0.47 | -0.51 | | |
| chr33: 23073877 | -0.0296 | 0.96 | -0.52 | chr3: 119134135 | TMEM39A (+48393) |
| chr34: 40999085 | -0.1395 | 0.97 | -0.46 | chr3: 177284378 | Intergenic |
| chr36: 2545142 | 0.432 | 0.5 | 0.62 | chr2: 157179848 | NR4A2 (+9379) |
Genome coordinates and coefficient values for predicting a log (base e) transformed version of chronological age. These coefficients were found by regressing a log-transformed version of age on the RRBS DNA-methylation measured from 46 domesticated dog blood samples. Since chronological age was log-transformed prior to regression, it is important to exponentiate the age estimate from this model to arrive at age estimates in units of years. We provide the mean methylation and Pearson correlation with age for each individual CpG. Where possible we identify, via synteny to the human genome, genes that are proximal to the CpGs in our models. Numbers in parentheses are the distance in bases to the Transcription Start Site of the gene. Additionally, we note those genes with experimentally inferred relevance to cellular identity (pluripotency).
Genes experimentally identified as targets of pluripotency factors and the Polycomb repressor complex [69,70]
a genes identified by ChIP on chip as targets of the Polycomb protein EED in human embryonic stem cells.
b genes possessing the trimethylated H3K27 (H3K27me3) mark in their promoters in human embryonic stem cells, as identified by ChIP on chip
c Polycomb Repression Complex 2 (PRC2) targets, identified by ChIP on chip in human embryonic stem cells as genes that possess the trimethylated H3K27 mark in their promoters and are bound by the SUZ12 and EED Polycomb proteins
d genes identified by ChIP on chip as targets of the Polycomb protein SUZ12 in human embryonic stem cells
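
As the legend notes, the model predicts log-transformed age, so the linear predictor has to be exponentiated to give years. The following is a minimal sketch of that step, not the authors' code. It assumes a plain natural-log transform, uses three coefficients copied from the table for illustration only, and sets a hypothetical intercept of zero because the fitted intercept is not listed in this section. The function name `predict_age_years` is likewise an assumption.

```python
import math

# Three coefficients copied from the dog-clock table (canFam3 coordinates).
# A real prediction would use every CpG in the model.
COEFFICIENTS = {
    "chr2:50165769": 1.8403,
    "chr5:19204758": -0.5393,
    "chr10:55453590": -0.4182,
}

# Hypothetical placeholder: the fitted intercept is not given in this section.
INTERCEPT = 0.0


def predict_age_years(methylation, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Return the age estimate in years from per-CpG methylation frequencies.

    `methylation` maps a CpG coordinate to a methylation frequency in [0, 1].
    The linear predictor is on the natural-log scale (assumed), so it is
    exponentiated before being returned.
    """
    log_age = intercept + sum(
        coef * methylation[site] for site, coef in coefficients.items()
    )
    return math.exp(log_age)
```

With all methylation values at zero the linear predictor equals the intercept, so the sketch returns `exp(INTERCEPT)`.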

Methods

Reduced representation bisulfite sequencing (RRBS)

We obtained previously published canine RRBS methylation data as CGmap files (see Janowitz, Koch, et al. 2016) [50]. Both wolf and dog data were aligned to the canine genome (canFam3).

Data processing

For each CpG site in each sample we estimated the methylation frequency as the number of methylated mapped read counts over the total mapped read counts and computed a corresponding 95% confidence interval from the binomial distribution [64]. For inclusion in our analysis, we required that each CpG site had confident methylation frequencies in at least 95% of samples. Confidence was defined as having a confidence interval smaller than 0.63 (roughly equivalent to requiring a minimum of 15 mapped reads at that site). For the remaining elements in the data matrix, we used the frequencies calculated regardless of confidence or imputed missing values using R package “softImpute” with type option “ALS” [65].
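
The confidence filter described above can be sketched as follows. The text states only that the interval was computed from the binomial distribution. This sketch assumes the exact (Clopper-Pearson) interval, which may differ from the method actually used, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=60):
    """Root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(methylated, total, level=0.95):
    """Clopper-Pearson interval for the methylation frequency (an assumption)."""
    tail = (1 - level) / 2
    k, n = methylated, total
    # Lower bound: the p at which P(X >= k) equals the tail probability.
    lower = 0.0 if k == 0 else _bisect(lambda p: tail - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals the tail probability.
    upper = 1.0 if k == n else _bisect(lambda p: binom_cdf(k, n, p) - tail)
    return lower, upper


def is_confident(methylated, total, max_width=0.63):
    """True when the interval is narrower than the 0.63 cutoff from the text."""
    lower, upper = exact_interval(methylated, total)
    return (upper - lower) < max_width
```

A site would then be retained when `is_confident` holds in at least 95% of samples.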

Culling species-specific differential methylation

To exclude species-specific differential methylation as a confounder, we first constructed a methylation matrix that omitted dog samples older than the maximum observed wolf age (8 years). For each CpG we then computed a t-test comparing the dog methylation values with the wolf methylation values, and excluded CpGs with t ≥ 2 from regression modelling.
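
A minimal sketch of this filter is given below. The text does not state which form of the t-test was used or whether the threshold applies to the absolute value. This sketch assumes Welch's unequal-variance statistic and a cutoff on |t|, both labeled as assumptions in the code, and the function names are illustrative.

```python
from math import inf, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic (assumed; the text only says "t-test")."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if se == 0:
        # Degenerate case: no within-group variation in either group.
        return 0.0 if diff == 0 else (inf if diff > 0 else -inf)
    return diff / se


def keep_shared_cpgs(dog, wolf, threshold=2.0):
    """Return CpG ids whose dog-versus-wolf |t| is below the threshold.

    `dog` and `wolf` map a CpG id to a list of methylation frequencies.
    Using the absolute value of t is an assumption.
    """
    return [
        site for site in dog
        if abs(welch_t(dog[site], wolf[site])) < threshold
    ]
```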

Computing age correlations for DNA methylation

When comparing age correlations computed in datasets of different sizes, we use a z-score instead of the Pearson correlation coefficient. A Student t-test statistic for testing whether a Pearson correlation ($r_s$) is different from zero is given by

$$Z_s = \sqrt{m_s - 2}\,\frac{r_s}{\sqrt{1 - r_s^2}}$$

where $m_s$ denotes the number of observations (i.e. samples) in the s-th data set.

Regression

Penalized regression models were built using glmnet [66]. Because the input comprises potentially hundreds of thousands of CpGs and we sought a much smaller set of predictors, we used the “elastic net” version of glmnet, corresponding to an alpha parameter of 0.5. For all results reported here, the internally cross-validated version (cv.glmnet) was used to automatically select the optimal penalty parameter.
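
To make the role of the penalty concrete, the following is a minimal coordinate-descent sketch of the elastic net objective for a fixed penalty `lam`. It is not glmnet and not the authors' code. It skips predictor standardization and the cross-validated search over `lam` that cv.glmnet performs, and the function names are illustrative.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within gamma of zero become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam, alpha=0.5, n_iter=500):
    """Minimize (1/2n)*RSS + lam*(alpha*|b|_1 + (1 - alpha)/2*|b|_2^2).

    X is a list of rows and y a list of responses. The intercept is not
    penalized. Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]  # residuals under the current fit
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # Refit the intercept to the current residuals.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            denom = col_sq[j] + lam * (1 - alpha)
            if denom == 0:
                continue  # all-zero column with no ridge term: leave at 0
            # Covariance of column j with the residual that excludes beta[j].
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / denom
            delta = new - beta[j]
            if delta != 0.0:
                resid = [resid[i] - delta * X[i][j] for i in range(n)]
                beta[j] = new
    return intercept, beta
```

The lasso part of the penalty (weight `alpha`) sets many coefficients exactly to zero, which is how a model with few CpGs can emerge from a very large input set. The ridge part (weight `1 - alpha`) stabilizes the fit when CpGs are correlated.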

Functional Annotation and multi-species synteny

Canid methylation sites (using coordinates from the CanFam3 draft genome) were first mapped to the human genome (hg19) where possible so that functional analysis tools with access to the most complete and detailed annotations could be utilized. This mapping was made using the "liftOver" tool and associated human to canine chain files available at the UCSC Genome Browser [67]. The human genome coordinates were then used as input to the Genomic Regions Enrichment of Annotations Tool (GREAT) [68].
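
liftOver takes its input as BED intervals, so single-base CpG coordinates need converting first. The helper below is an illustrative sketch rather than part of the published pipeline. It assumes that the CpG positions are 1-based and converts them to BED's 0-based, half-open convention. The function name is hypothetical.

```python
def to_bed_line(site):
    """Convert a coordinate such as 'chr1: 98804509' to a one-base BED line.

    Assumes the input position is 1-based. BED intervals are 0-based and
    half-open, so position p becomes the interval [p - 1, p).
    The fourth column keeps the original coordinate as a name so that
    lifted sites can be matched back to their canine source.
    """
    chrom, pos = site.replace(" ", "").split(":")
    position = int(pos)
    return f"{chrom}\t{position - 1}\t{position}\t{chrom}:{position}"
```

The resulting file can be passed to liftOver with a canFam3-to-hg19 chain file, and the lifted hg19 intervals submitted to GREAT.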

Conflicts of Interest

The Regents of the University of California is the owner of a provisional patent application directed at this invention for which the authors are named inventors.

Funding

SH, MP, and MT were supported by 1R21AG049400 – 01A1. SH was funded by NIH/NIA U34AG051425-01. MJT acknowledges support from a QCB Collaboratory Postdoctoral Fellowship, and the QCB Collaboratory community directed by Matteo Pellegrini. BvH acknowledges support from National Science Foundation (Grant Numbers: DEB-0613730, DEB-1245373, DMS-1264153), Yellowstone National Park, the Yellowstone Park Foundation, the Intramural Program of the National Human Genome Research Institute, AKC OAK (Grant Number: 1822), and the NIH (Grant Numbers: T32 HG002536, GM053275).

References
