The goal of our study was to capture the conditional dependence relationships between telomere length (TL) and demographic, lifestyle and dietary factors by applying Copula Graphical Models (CGMs) to NHANES 1999-2002 data.
Although a data-driven exploratory study does not, by definition, test any hypothesis, the root idea behind our study was that human nutrition could exert a direct influence on leukocyte telomere length (LTL). The results of the analyses strongly contradict this idea. Only 5 dietary variables out of 50 were directly related to TL with a certainty ≥ 0.95 after correcting for the effect of all the other variables in the dataset: caffeine and dietary fibers, related to shorter telomeres; and eicosenoic acid, docosapentaenoic acid and sodium, related to longer telomeres. Moreover, these relationships reach the certainty threshold of 0.95 only in specific age strata.
Apart from age, the factors most closely linked to TL are blood variables. At least six of these could be influenced by diet: the blood level of C-reactive protein (CRP) and the serum levels of retinyl stearate and γ-tocopherol, which all have a negative connection to TL; and the serum levels of retinyl palmitate, folate and vitamin A, which have a positive connection to TL.
Even though only a small part of the links found in the main analyses reach the certainty threshold throughout the sensitivity analyses, almost all the connections maintain the same direction, suggesting that the results of our main analyses are consistent. Therefore, for the sake of simplicity, we will not further highlight the differences and commonalities between the main and the sensitivity analyses unless strictly necessary, and when referring to the analyses we will implicitly refer to the three main analyses.
Unless otherwise specified, the trends and connections reported are all taken from Table 3 for the three main analyses.
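To make the distinction between a direct (partial) link and a marginal correlation concrete, the following minimal sketch derives partial correlations from the inverse of a correlation matrix, which is the quantity a Gaussian graphical model reads its edges from. The three variables and the numbers in `CORR` are purely illustrative assumptions and are not taken from NHANES; the helper names `invert` and `partial_correlations` are hypothetical and do not reproduce the software used in our analyses.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] -> [I | A^-1]
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr):
    """Partial correlation of each pair given all other variables:
    rho_ij = -omega_ij / sqrt(omega_ii * omega_jj), with omega the precision matrix."""
    prec = invert(corr)
    n = len(corr)
    return [[1.0 if i == j else -prec[i][j] / (prec[i][i] * prec[j][j]) ** 0.5
             for j in range(n)]
            for i in range(n)]


# Illustrative (made-up) marginal correlations for variables X, Z, Y, in that order.
# The X-Y correlation (0.30) equals the product of X-Z (0.50) and Z-Y (0.60).
CORR = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.6],
    [0.3, 0.6, 1.0],
]

pc = partial_correlations(CORR)
print("X-Y marginal:", CORR[0][2], "partial:", round(pc[0][2], 6))
print("X-Z marginal:", CORR[0][1], "partial:", round(pc[0][1], 6))
```

In this toy case X and Y are marginally correlated, yet their partial correlation is essentially zero once Z is accounted for, so a graphical model would draw no direct X-Y edge. This is the sense in which a marginal association can be present without a direct link.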
Interpretation and comparison with literature
We first discuss the dietary variables, then the blood variables, and finally the sociodemographic and lifestyle variables.
The negative relation we have found between caffeine and TL is present with a certainty ≥ 0.95 in the Young and Middle groups, and therefore it is the most consistent among the diet-TL links. This negative relation is not totally unexpected: Tucker in 2017 [35] performed a linear regression analysis and showed that in NHANES there is a negative correlation between TL and caffeine, but a positive one between TL and coffee consumption. However, a study based on the Nurses’ Health Study and published by Liu et al. in 2016 [36] found that coffee consumption is positively related to the length of telomeres, and that caffeine is positively, but not significantly, related to TL after adjusting for coffee consumption. A possible explanation for these differences is a sampling bias in NHANES. In fact, caffeine intake is marginally related to higher energy intake, higher serum cotinine levels, and active and passive smoking (Supplementary Table 4). These relations suggest that in NHANES 1999-2002 high caffeine consumption goes hand in hand with some poor health choices; the negative correlation between caffeine and TL might therefore be caused by the specific sample selected in the study.
We have also found two fatty acids that are positively related to TL: eicosenoic acid and docosapentaenoic acid. However, each of these two names covers different types of fatty acids, from ω-3 to ω-11, which makes it impossible to identify a possible mechanism of action on leukocyte telomere length. Moreover, this outcome is not consistent across the analyses, and it does not reach the certainty threshold in the sensitivity analysis with the entire study population, which is the one with the most statistical power. It therefore seems more likely that this result is an artifact due to the inclusion of 19 different types of fatty acids in our study.
We can apply a similar reasoning to the positive partial correlation between sodium intake and TL found only in the Young group: while the link is positive and above the certainty threshold in the Young, it does not reach the threshold in the Middle, and it is negative, although again below the threshold, in the Old. Therefore, it is quite inconsistent.
The last unexpected diet-TL relation is the negative correlation between dietary fibers and TL. In this case, however, the link is more consistent than for the fatty acids and sodium: the negative link we have found in the Young remains negative in the other groups, even if it does not reach the certainty threshold. In contrast to our result, Tucker (2018) found that, in NHANES, the higher the consumption of fibers, the longer the telomeres [37]. Moreover, an abundant number of studies show the positive effects of a diet rich in fibers on the length of telomeres [20, 25–27, 38]. Given these previous findings, we cannot explain the negative fibers-TL association as reverse causation.
To elucidate the negative correlation between fibers and TL, it might be important to consider the source of the dietary fiber and the related quantity of food consumed. In Supplementary Table 5, higher consumption of fibers shows a strong marginal correlation with higher energy intake. This finding suggests that a large overall quantity of food may be the main source of fibers for the participants with high fiber intake. This hypothesis can explain our results, but it does not clarify the divergence from other research in NHANES. To address this last problem, we can look at the exclusion criteria for the participants: we set 10000 kcal as the upper limit of daily caloric intake, while other studies are much more restrictive [20, 37]. This difference in the exclusion criteria might affect the ultimate result. In fact, high intake of dietary fibers might be tightly connected to overeating and poor health in our population, but not in the populations of studies where the maximum energy intake is lower than ours.
Concerning the serum nutrient levels, we first notice that there seem to be connections between vitamin A-related compounds and TL. However, these connections are mixed: vitamin A is associated with longer telomeres in the Middle and Old groups, but it reaches the certainty threshold only in the Middle. Retinyl stearate has a positive partial correlation with TL, but the link is not robust. Finally, retinyl palmitate is linked to shorter telomeres in the Young and Middle groups, but the link reaches the certainty threshold only in the Young. Therefore, the connection between vitamin A and TL is unclear and might not be present. The silver lining is that there is a lack of studies concerning vitamin A, retinyl stearate, retinyl palmitate and their effects on telomeres, and this is an opportunity for new research.
The serum levels of folate show the same consistency problem that we have found for the dietary intake of fatty acids: the relation is present and above the certainty threshold only in the Middle group, and it disappears almost entirely in the other two groups. This association is therefore too weak for us to be confident that it is real.
With regard to serum levels of vitamin E (α-tocopherol) and γ-tocopherol, the former shows no link to TL, while the latter has a robust negative association with TL. It is peculiar to see such different behavior in two nutrients that are so similar. Nevertheless, this is not totally unexpected, because Tucker (2017) obtained the same results in NHANES [39]. It therefore seems clear that serum vitamin E does not influence LTL, while γ-tocopherol is associated with decreased TL.
Serum γ-tocopherol has been linked to mixed health outcomes, so it is not fully understood how it exerts its effects in the human body [40, 41]. As other researchers have proposed [42], we believe that the serum levels of γ-tocopherol might be related to poor health. Indeed, complex mechanisms regulate lipid-soluble vitamins, and the ones underlying γ-tocopherol are not fully understood yet [42, 43]. What we can see from our data is that γ-tocopherol has a positive marginal correlation with BMI, waist circumference, and blood levels of total cholesterol and glycated hemoglobin (Supplementary Table 6), all indicators of poor diet and lifestyle. These marginal correlations suggest that γ-tocopherol might be maintained at higher levels in the bloodstream especially in people in poor health.
One last remark to conclude the discussion on serum α- and γ-tocopherol: the absence of a link between α-tocopherol and TL, and the negative link between γ-tocopherol and TL, suggest that these two potent antioxidants do not preserve telomere length. Therefore, oxidative stress might not be a factor in telomere shortening in vivo.
Last, and maybe most important, we consider the negative direct association between blood levels of C-reactive protein (CRP) and TL. C-reactive protein is a well-known marker of inflammation [44], which in turn might be deleterious for the telomeres. Indeed, the direct CRP-TL association suggests that inflammation can accelerate the process of telomere attrition, as previously reported by other authors [29, 30]. Most importantly, it hints at how diet and other factors can influence TL: CRP might mediate the effects that some factors have on TL, as we will show further in this discussion. So, CRP and inflammation might act as mediators between human behaviors and human telomere length.
We were expecting to find some strong connections between telomere length and the race, sex, education level and poverty to income ratio (PIR) of the participants, as other researchers did before us [18, 45, 46]. However, it appears that, after correcting for other factors, these associations fade. We have found no connection between race and TL, despite the fact that African Americans have longer telomeres than Caucasians in the raw data (Table 1) and in the literature [45, 46]. So, the differences in LTL among races are probably linked to other factors, like leukocyte composition [31]. We can apply an analogous logic to sex: in our analysis females have longer telomeres than males only in the Old group. Therefore, the results are not robust. In this specific case, however, we believe that this finding supports an idea expressed by Okuda et al. in 2002: the differences in LTL between sexes arise slowly over time due to a lower attrition in females compared to males [3]. This same “slow effect” might explain why PIR has a negative correlation with telomeres only in the Old group, and not in the others.
Conversely, the higher the level of education, the longer the telomeres in the Young and Middle groups, although the certainty is ≥ 0.95 only in the Middle. Higher education is generally related to higher wealth and a better lifestyle, and this might be the reason why we see this connection in the two younger groups. As for the Old group, since we are looking at a sample of people selected between 1999 and 2002, it is important to keep in mind that only a few of the older participants could have had the privilege of being highly educated; therefore this index might simply lose power in the Old group.
We also expected some clear relations between physical activity, BMI, waist circumference, smoking status and TL, since they are present in the literature [19, 22, 23]. But in this case too we were surprised by the absence of links. Of these four factors, only physical activity (PA) has a positive link with TL; more specifically, only the level of PA assessed on a scale from 1 to 4 shows this relation, and only in the Young group. Therefore, we cannot conclude that being physically active is directly related to TL.
Once again, Tucker in 2017 [22] studied the relationship between TL and PA in NHANES, and found that being more physically active is protective for the telomeres. So, this effect is probably mediated by other factors that were not considered in previous studies. In fact, physical activity shows a negative partial correlation with CRP, and it is also inversely related to BMI and waist circumference with certainty ≥ 0.95, which in turn have a positive partial correlation with CRP with certainty ≥ 0.95. The same positive partial correlation is found between active smoking and CRP, again with a certainty above the threshold.
Therefore, we hypothesize that C-reactive protein plays an important role as a mediator between these lifestyle factors and telomere length, and that it should always be included in the analysis as a potential confounder.
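The mediation hypothesis can be illustrated with a small simulation. The sketch below generates synthetic data in which physical activity lowers CRP and CRP shortens TL, with no direct activity-TL effect. The effect sizes (-0.5 and -0.4), the sample size and the variable names are arbitrary assumptions chosen for illustration; nothing here is estimated from NHANES, and the first-order partial correlation formula stands in for the full graphical model.

```python
import random


def pearson(x, y):
    """Sample Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def partial_given(x, y, z):
    """First-order partial correlation of x and y after adjusting for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000

# Assumed causal chain: activity -> CRP -> TL (no direct activity -> TL arrow).
activity = [rng.gauss(0, 1) for _ in range(n)]
crp = [-0.5 * a + rng.gauss(0, 1) for a in activity]
tl = [-0.4 * c + rng.gauss(0, 1) for c in crp]

marginal = pearson(activity, tl)           # activity looks "protective"
direct = partial_given(activity, tl, crp)  # close to zero once CRP is adjusted for

print(f"marginal activity-TL correlation: {marginal:.3f}")
print(f"partial correlation given CRP:    {direct:.3f}")
```

Under these assumptions the marginal activity-TL correlation is clearly positive, while the partial correlation given CRP stays close to zero. This matches the pattern of a protective effect in unadjusted analyses and no direct link in the graphical model. Note that cross-sectional data alone cannot distinguish this chain from a scenario in which CRP is a common cause of both variables, since both produce a vanishing partial correlation.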
Strengths and weaknesses
The first strength of our study is that we conducted the analysis in a data-driven way, so almost no assumptions were made a priori. This meant that we could add a relatively large number of variables to the model (p = 102) without a strict selection, and so we included a considerably larger number of variables than studies that use other methods on NHANES data [18, 28, 30]. On the other hand, we still had to make a loose selection of the variables. We decided to include only the variables that could be related to LTL and that were potentially present for all the participants (so, for example, we excluded pregnancy status). Therefore, even if our study is data-driven, it still has a certain degree of subjectivity.
Another strength is that we could include missing values in our analyses. While in other studies a non-trivial number of participants is excluded because of missing values [22, 24, 28], we included the vast majority of NHANES participants with TL data (7096 out of 7839). On the one hand, including more participants increased the statistical power of the analyses; on the other hand, it might have had a negative impact on some of the outcomes of the CGMs because of the method we used to identify the copula: the nonparanormal shrinkage (npn). Indeed, the npn is not optimal in the presence of missing values, and alternative methods for copula estimation, like Gibbs sampling, perform better in such situations [33, 34].
A third strength is that we divided the sample into three age groups in order to obtain more specific and accurate results. In this way we were able to clearly see the effects of some factors on TL only in specific strata, and we could formulate hypotheses accordingly.
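For readers unfamiliar with the transformation, the sketch below shows one common rank-based form of a nonparanormal-style transform: each observed value is replaced by the standard normal quantile of its rank divided by n + 1, where n is the number of observed values. It is an illustration under stated assumptions, not the implementation used in our analyses. The function name `normal_scores` is hypothetical, ties receive their average rank, and leaving missing entries (`None`) untransformed is only one simple option, which may not match how a given software package treats them.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map observed values to standard-normal quantiles of rank / (n_observed + 1).

    Missing entries (None) are left as None; tied values share their average rank.
    """
    observed = sorted(
        ((v, i) for i, v in enumerate(values) if v is not None),
        key=lambda pair: pair[0],
    )
    n = len(observed)
    scores = [None] * len(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1

    pos = 0
    while pos < n:
        # Find the end of the current block of tied values.
        end = pos
        while end + 1 < n and observed[end + 1][0] == observed[pos][0]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the tie block
        z = std_normal.inv_cdf(avg_rank / (n + 1))
        for _, idx in observed[pos:end + 1]:
            scores[idx] = z
        pos = end + 1
    return scores


# Illustrative (made-up) measurements with one missing value and one tie.
example = [3.2, None, 1.0, 10.5, 1.0]
scores = normal_scores(example)
print(scores)
```

Because the ranks are computed only among the observed values, a variable with many missing entries is transformed using less information than the others, which gives an intuition for why rank-based copula estimation can be less reliable when missingness is substantial.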
Obviously, this study shares the limitations of all cross-sectional studies that involve LTL assessment via qPCR. The inability to infer causality typical of cross-sectional studies can be delicate in our case: one would not expect people with shorter telomeres to choose a specific type of food only because they have short telomeres, but reverse causation might nevertheless be possible in some cases. Moreover, it is not possible to know the direction of the relations between TL and hemoglobin, total iron binding capacity (TIBC), percentage of basophils and erythrocyte count, or whether these are just characteristics that accompany longer (or shorter) telomeres. As for the qPCR method developed by Cawthon in 2002, it is a ground-breaking instrument for measuring the telomere length of a vast number of people, because it is cheap and fast and does not need highly specialized personnel. However, it is not as accurate and precise as the gold standard, which is electrophoresis in agarose gel, and this leads to higher uncertainty in the measure of telomere length itself, which is the most valuable variable of the present study [47].
Finally, in the past decades researchers discovered that LTL can reflect good health and healthy ageing or poor health and premature ageing; however, as early as 1994, Slagboom et al. showed that TL has a high heritability [15]. This implies that future research would do better to use leukocyte telomere length in studies that plan a series of measurements over time on the same participants, such as longitudinal studies or clinical trials. In this way it is possible to use the relative telomere attrition as the main outcome.
Conversely, these last shortcomings of our study imply a hidden strength: since a large portion of the variation in telomere length is explained by genetics, and a further part is attributable to the use of qPCR, the few links that we have obtained must be strong enough to overcome the effect of genetics and measurement-related uncertainty.