Research Perspective Volume 15, Issue 17 pp 8537—8551

A Poisson distribution-based general model of cancer rates and a cancer risk-dependent theory of aging

Wenbo Yu1,2,3, Tessa Gargett1,2,3, Zhenglong Du4

  • 1 Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia
  • 2 Cancer Clinical Trials Unit, Royal Adelaide Hospital, Adelaide, SA, Australia
  • 3 School of Medicine, The University of Adelaide, Adelaide, SA, Australia
  • 4 Department of Molecular and Biomedical Science, School of Biological Sciences, The University of Adelaide, Adelaide, SA, Australia

Received: March 29, 2023       Accepted: August 20, 2023       Published: September 1, 2023

Copyright: © 2023 Yu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


This article presents a formula for modeling the lifetime incidence of cancer in humans. The formula utilizes a Poisson distribution-based “np” model to predict cancer incidence, with “n” representing the effective number of cell turnover and “p” representing the probability of single-cell transformation. The model accurately predicts the observed incidence of cancer in humans when a reduction in cell turnover due to aging is taken into account. The model also suggests that cancer development is ultimately inevitable. The article proposes a theory of aging based on this concept, called the “np” theory. According to this theory, an organism maintains its order by balancing cellular entropy through continuous proliferation. However, cellular “information entropy” in the form of accumulated DNA mutations increases irreversibly over time, restricting the total number of cells an organism can generate throughout its lifetime. When cell division slows down and fails to compensate for the increased entropy in the system, aging occurs. Essentially, aging is the phenomenon of running out of predetermined cell resources. Different species have evolved separate strategies to utilize their limited cell resources throughout their life cycle.


It has been theorized since the early 1900s that cancer arises from genetic mutations in cells [1–3]. These pioneering works formed the basis of the modern clonal selection theory, which proposes that cancer develops from a single-cell event triggered by a sequence of mutations that transform normal cells into malignant cells [4]. Early theorists hypothesized that mutations accumulate at a constant rate throughout the lifespan, and recent evidence has further supported this [4–6]. A mathematical model in which cancer rates rise with the sixth power of age “t” was proposed [7], and several variations of this model have been suggested for general or specific cancers [8, 9].

Given that cancer initiation is basically a discrete event, it may be possible to model cancer incidence using a discrete probability distribution, such as the Poisson distribution. There are two levels of discrete events involved in cancerization. At the first level, cancer arises from a single cellular event among the multicellular host, and this probability of the cancerization of a single cell out of the cell pool can be modeled using the Poisson distribution. At the second level, cumulative mutations are required for cancerization to occur and the probability of the number of mutations needed for cancerization to occur within a single cell can be modeled using the cumulative Poisson distribution function.

This study used a Poisson function model, named the “np” model, to simulate cancer incidence across the human lifespan. The “n” value represents the effective number of cell turnovers [10]. The “p” value represents the probability of a single cell undergoing transformation. By adjusting the cell turnover number, we fitted the model to match the observed data. This finding led to the hypothesis that a reduction in cell turnover has evolved to promote longevity. As a result, the study proposed an “np” theory of aging.

Current theories of aging can be divided into two main categories: the “programmed” theory and the “wear and tear” theory. The programmed theory proposes that the aging of a species is genetically programmed to adapt its lifespan to its life history within the context of evolution [11, 12]. The existence of telomeres provides the best micro-evidence for this theory [13]. On the other hand, the “wear and tear” theory suggests that systems wear out at genetic, cellular, or tissue levels, resulting in aging. There are several sub-theories within this theory, including the somatic mutation theory, which suggests that aging is caused by the gradual accumulation of mutated cells with decreased function [14]. At the non-genetic level, there are various others, including the cross-link theory [15], autoimmune theory [16], glycation theory [17], oxidative damage theory [18], and molecular inflammatory theory [19]. These theories focus on the micro-mechanisms or micro-phenomena of aging rather than on the fundamental question of why aging is inevitable. The “disposable soma theory” of aging attempts to bridge the gap between the two categories above [20]. It suggests that as cells experience increasing wear and tear, maintaining the organism becomes increasingly expensive. At the same time, selective force wanes after the reproductive stage, resulting in the eventual abandonment of cellular maintenance for the organism.

Although each theory above explains one or more aspects of aging, none of them can fully explain all the phenomena of aging. In this study, the “np” theory of aging postulates that the risk of cancer is the ultimate restriction to an organism’s lifespan and uses this perspective to unite most preceding theories of aging.


A simple model

In a simple model of exponentially growing cell aggregates, the number of cells doubles with each generation “x”. In each division, there is a probability “p” that a cell experiences a cancerous mutation. The probability that “m” cells out of all the cells undergo cancerization simultaneously follows a Poisson distribution (Figure 1A):


Figure 1. A simple model of cancerization. (A) The model of exponentially expanding cell aggregates; (B) Probability of cancerization (y) vs. number of divisions (x).

$$P(m) = \frac{\lambda^{m}}{m!}\, e^{-\lambda}; \quad \lambda = np; \quad n = 2^{x}. \tag{1}$$

In the healthy group, m = 0. So,

$$P(0) = e^{-\lambda}; \quad P(\mathrm{cancer}) = 1 - P(0) = 1 - e^{-np}. \tag{2}$$



The function produces an S-shaped curve for P(cancer). If we set p = 1×10⁻¹⁵, the curve jumps to 1 around the 50th generation of division (Figure 1B). Under this model, every cellular organism will eventually develop cancer. The likelihood of cancer increases as generations proliferate, with a more rapid increase occurring after a certain age.
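As a rough numerical check of the S-shaped curve described here, the following sketch evaluates P(cancer) = 1 − exp(−np) with n = 2^x and p = 1×10⁻¹⁵. The function name `p_cancer_simple` and the chosen generations are illustrative assumptions, not part of the original study.

```python
import math


def p_cancer_simple(generation, p=1e-15):
    """Probability that at least one cell is transformed after `generation` doublings.

    Assumes n = 2**generation cells and a Poisson mean of n * p.
    """
    n = 2.0 ** generation
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is very small.
    return -math.expm1(-n * p)


# Print the curve around the transition near the 50th generation.
for x in (30, 40, 45, 50, 55, 60):
    print(f"generation {x}: P(cancer) = {p_cancer_simple(x):.6f}")
```

With these inputs the probability is still close to zero at generation 40, passes about two thirds at generation 50, and is effectively 1 by generation 55, which is the jump the text refers to.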

An adapted model

Cancer incidence cannot be modeled simply with the formula above, because multicellular organisms are not simple cell aggregates that proliferate exponentially without limit, and the “p” value of cancerous mutation is more complex than a constant. In this study we hypothesize that mutations accumulated in proliferating cells are the primary contributors to cancer [21]. Hence, in the updated model, “n” equals the cell turnover number during a certain period, which is not constant but rather a function of age “t”, which is correlated with cell generation. “p” is also a function of “t”. The new formula is now expressed as follows:

$$P(\mathrm{cancer})_{t} = 1 - e^{-n_{t} p_{t}}. \tag{3}$$


Based on a recent study, the average daily cell turnover in a standard reference person is 0.33 trillion cells. Of these cells, 65% are red blood cells, which lack a nucleus [10], leaving a turnover of cells with active DNA replication of 0.116 trillion per day. For the purposes of this study, a yearly turnover of 42 trillion cells is used for calculations (Table 1). The parameter “p(t)” from formula (3) is further split into two terms: p_constant (“pc”) and p_accumulate (“pa”). “pc” represents the background probability of a single cell becoming cancerous with each division, while “pa” represents the probability of cancerization of a cell that has accumulated mutations over multiple divisions. “pa” is a function of the number of division generations and is determined using a “raining beads” model. In this model, cells are envisioned as infinite bowls into which mutations rain down like beads with each replication. Once the number of mutations in a bowl reaches a certain threshold, the cell becomes cancerous. The number of mutations in each bowl follows a Poisson distribution (Figure 2). The probability of reaching the threshold “q” is calculated as the cumulative Poisson distribution in formula (4):

$$p_{a} = P(X \ge q) = 1 - \sum_{k=0}^{q-1} \frac{\lambda^{k}}{k!}\, e^{-\lambda}. \tag{4}$$
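The “raining beads” tail probability can be sketched numerically as follows. This is a minimal illustration that assumes the inclusive convention P(X ≥ q) for “reaching the threshold”; the function name `poisson_upper_tail` is hypothetical. The tail is summed directly rather than computed as one minus the cumulative distribution, because values near 1E-19 are far below double-precision resolution around 1.

```python
import math


def poisson_upper_tail(lam, q, extra_terms=500):
    """P(X >= q) for X ~ Poisson(lam), summed term by term starting at k = q.

    Each term is built in log space so that the tiny probabilities of the
    far tail do not overflow or lose precision.
    """
    total = 0.0
    for k in range(q, q + extra_terms):
        log_pmf = -lam + k * math.log(lam) - math.lgamma(k + 1)
        total += math.exp(log_pmf)
    return total


# Assumed inputs taken from the text: mean 45 (first age group), threshold q = 118.
pa_first_group = poisson_upper_tail(45, 118)
print(f"p(a) for lambda = 45, q = 118: {pa_first_group:.2e}")
```

Under these assumptions the result is on the order of 1E-19, the same magnitude as the first p(a) entry in Table 1; the remaining rows are not checked here.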

Table 1. The calculation of “np” model.

Column groups: hypothetic data (Generation, n, p(c), p(a)); intermediate results (p, λ, e^λ, P(0)); final results (P(cancer)/year, P(cancer)/5 years); observed data (Cancerstats).

| Age group | Generation (λPa*) | n (turnover/year) | p(c) | p(a) | p = p(c) + p(a) | λ = np* | e^λ | P(0) | P(cancer)/year | P(cancer)/5 years (%) | Cancerstats P(cancer)/5 years (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 ~ 5 | 45 | 4.2E+13 | 2.38E-18 | 1.18E-19 | 2.50E-18 | 0.000104939 | 1.000105 | 0.999895 | 0.000105 | 0.0525 | 0.1028 |
| ~ 10 | 46 | 4.2E+13 | 2.38E-18 | 4.69E-19 | 2.85E-18 | 0.000119685 | 1.00012 | 0.99988 | 0.000120 | 0.0598 | 0.0553 |
| ~ 15 | 47 | 4.2E+13 | 2.38E-18 | 2.18E-18 | 4.56E-18 | 0.000191529 | 1.000192 | 0.999808 | 0.000192 | 0.0957 | 0.0633 |
| ~ 20 | 47.5 | 4.2E+13 | 2.38E-18 | 3.12E-18 | 5.50E-18 | 0.000231103 | 1.000231 | 0.999769 | 0.000231 | 0.1155 | 0.1023 |
| ~ 25 | 48 | 4.2E+13 | 2.38E-18 | 6.48E-18 | 8.86E-18 | 0.00037227 | 1.000372 | 0.999628 | 0.000372 | 0.1860 | 0.1643 |
| ~ 30 | 48.5 | 4.2E+13 | 2.38E-18 | 1.33E-17 | 1.57E-17 | 0.000658251 | 1.000658 | 0.999342 | 0.000658 | 0.3286 | 0.3003 |
| ~ 35 | 49 | 4.2E+13 | 2.38E-18 | 2.69E-17 | 2.93E-17 | 0.001230357 | 1.001231 | 0.99877 | 0.001230 | 0.6133 | 0.4533 |
| ~ 40 | 49.5 | 3.15E+13 | 2.38E-18 | 5.38E-17 | 5.62E-17 | 0.001770624 | 1.001772 | 0.998231 | 0.001769 | 0.8814 | 0.6380 |
| ~ 45 | 50 | 2.36E+13 | 2.38E-18 | 1.06E-16 | 1.09E-16 | 0.002569392 | 1.002573 | 0.997434 | 0.002566 | 1.2765 | 0.9550 |
| ~ 50 | 50.5 | 1.77E+13 | 2.38E-18 | 2.08E-16 | 2.10E-16 | 0.003723329 | 1.00373 | 0.996284 | 0.003716 | 1.8444 | 1.5588 |
| ~ 55 | 51 | 1.33E+13 | 2.38E-18 | 4.01E-16 | 4.03E-16 | 0.005361639 | 1.005376 | 0.994653 | 0.005347 | 2.6452 | 2.3953 |
| ~ 60 | 51.5 | 9.97E+12 | 2.38E-18 | 7.66E-16 | 7.68E-16 | 0.00765416 | 1.007684 | 0.992375 | 0.007625 | 3.7548 | 3.5565 |
| ~ 65 | 52 | 7.48E+12 | 2.38E-18 | 1.45E-15 | 1.45E-15 | 0.010820778 | 1.01088 | 0.989238 | 0.010762 | 5.2666 | 5.3138 |
| ~ 70 | 52.5 | 5.61E+12 | 2.38E-18 | 2.70E-15 | 2.70E-15 | 0.015142111 | 1.015257 | 0.984972 | 0.015028 | 7.2915 | 7.5760 |
| ~ 75 | 53 | 4.2E+12 | 2.38E-18 | 4.99E-15 | 4.99E-15 | 0.020971326 | 1.021193 | 0.979247 | 0.020753 | 9.9546 | 9.5098 |
| ~ 80 | 53.5 | 2.73E+12 | 2.38E-18 | 9.11E-15 | 9.12E-15 | 0.024913911 | 1.025227 | 0.975394 | 0.024606 | 11.7123 | 11.8208 |
| ~ 85 | 54 | 1.78E+12 | 2.38E-18 | 1.65E-14 | 1.65E-14 | 0.029297383 | 1.029731 | 0.971128 | 0.028872 | 13.6263 | 13.0510 |
| ~ 90 | 54.5 | 1.07E+12 | 2.38E-18 | 2.95E-14 | 2.95E-14 | 0.031483795 | 1.031985 | 0.969007 | 0.030993 | 14.5654 | 14.2038 |
| ~ 95 | 55 | 5.33E+11 | 2.38E-18 | 5.24E-14 | 5.24E-14 | 0.027916904 | 1.02831 | 0.972469 | 0.027531 | 13.0280 | 13.3100 |
* λPa is the parameter used in formula (4) to calculate p(a) (see also Figure 2). λ = np is used to calculate the final probability. The two are different.

Figure 2. Illustration of modelling P(accumulate) by cumulative poisson distribution.


“λ” represents the mean number of accumulated mutations per cell. “q” represents the threshold at which a cell becomes cancerous (q > λ). Multiple studies have suggested that somatic mutations increase linearly over the course of an individual’s life [5, 6]. Thus, it is reasonable to assume that the number of mutations also increases proportionally with each round of replication, so that “pa” increases as a function of cell division generation or time (Figure 2). From formula (3), a new formula can be derived as follows:

$$P(\mathrm{cancer})_{t} = 1 - e^{-n_{t}\,\left(p_{c} + p_{a}(t)\right)}. \tag{5}$$


Here we set pa(t) as an internal parameter that does not need to have a specific biological meaning. This internal parameter pa(t) is used to demonstrate that the overall cancer incidence follows the cumulative Poisson distribution.
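To make the chain from “n”, “pc” and “pa” to a five-year risk concrete, the sketch below recomputes two rows of Table 1. The helper name `risk_from_np` is hypothetical, and the five-year conversion 1 − P(0)^5 is an assumption that appears consistent with the tabulated values; inputs are the rounded table entries, so small differences from the table are expected.

```python
import math


def risk_from_np(n, pc, pa):
    """Return (lambda, yearly risk, five-year risk) for one age group.

    lambda = n * (pc + pa); the yearly risk is 1 - exp(-lambda).
    Five-year risk is taken as 1 - P(0)**5 = 1 - exp(-5 * lambda) (assumed).
    """
    lam = n * (pc + pa)
    p_year = -math.expm1(-lam)
    p_five = -math.expm1(-5.0 * lam)
    return lam, p_year, p_five


# First row of Table 1 (age group 0 ~ 5), using the rounded table inputs.
lam_young, year_young, five_young = risk_from_np(4.2e13, 2.38e-18, 1.18e-19)
# Last row of Table 1 (age group ~ 95).
lam_old, year_old, five_old = risk_from_np(5.33e11, 2.38e-18, 5.24e-14)

print(f"0 ~ 5: lambda = {lam_young:.3e}, 5-year risk = {100 * five_young:.4f}%")
print(f"~ 95 : lambda = {lam_old:.3e}, 5-year risk = {100 * five_old:.2f}%")
```

The youngest group comes out near 0.05% per five years and the oldest near 13%, in line with the final-results columns of Table 1.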

Fitting the model to observed cancer incidence

We retrieved the average number of new cases per year and the age-specific incidence rates per 100,000 population in the UK from Cancerstats [22], and used these data to fit formula (5) of the proposed model.

n: Since the turnover number “n” was obtained from the Reference Man aged 20–30 years, we apply this “n” to all groups up to age 35 (Table 1). We have no data on cell turnover in children. Since “p” is very low in the early stage of life, the impact of “n” is limited. Furthermore, considering the higher metabolic rate and smaller body mass of young children, we keep “n” at the same value for all ages before 35.

pc: In the early stages of life, “pa” is insignificant, and we estimated P(cancer) as 0.05% per five years (about 0.01% per year) based on Cancerstats data. Based on formula (5) (Supplementary Table 2), “pc” can be deduced as 2.38E-18.
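The deduction of “pc” can be retraced with a few lines of arithmetic. Table 1 lists a model five-year risk of about 0.05% for the youngest group, which corresponds to roughly 0.01% per year; taking that yearly figure as the input is an assumption about how the fit was done, and the variable names are illustrative. With “pa” neglected, formula (5) reduces to P = 1 − exp(−n·pc), which can be inverted for “pc”.

```python
import math

# Turnover of nucleated cells: 0.33 trillion/day, 65% of which are red blood cells.
cells_per_day = 0.33e12
nucleated_per_day = cells_per_day * (1 - 0.65)   # about 0.116 trillion per day
nucleated_per_year = nucleated_per_day * 365     # about 42 trillion per year

n = 4.2e13            # rounded yearly turnover used in Table 1
p_year_young = 1e-4   # assumed yearly risk in early life (0.05% per five years)

# Invert P = 1 - exp(-n * pc); log1p keeps precision for a small P.
pc = -math.log1p(-p_year_young) / n

print(f"nucleated turnover per year: {nucleated_per_year:.3e}")
print(f"pc = {pc:.3e}")
```

This gives a value of about 2.38E-18, matching the p(c) column of Table 1.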

pa: As previously discussed, the exact biological meaning of “pa” cannot be provided at this stage. It is an internal parameter that reflects the increasing probability of cancer incidence, based on the assumption that, on average, each generation of cell division randomly deposits an equal amount of cancer-related mutations following the Poisson distribution [23, 24]. “λPa” represents the mean number of mutations per cell, and a cell becomes cancerous when the number of deposited mutations reaches the threshold “q”. To calculate “pa” using formula (4), we set “λ” equal to the cells’ generation (Table 1: λPa) and tried different thresholds “q” until the model best matched real cancer rates (with the highest R²). Ultimately, we set “q = 118”, which means that a cell requires 118 mutations to become cancerous, assuming it receives one mutation from each division (Figure 3A). However, we cannot define “q” as the count of physical mutations, since it remains an internal parameter. In our current framework, we propose to define “q = 118” as a count of effective mutations, which may be related to driver mutations but encompass more than that, while still being fewer than the entire spectrum of somatic mutations, since many somatic mutations may not be effective for tumorigenesis.


Figure 3. Comparison of model predicted data with real data of cancer incidence vs age. (A) The real data and predicted data were compared in the year group 0-35, considering different values of “q.” (B) The real data and predicted data were compared across all age groups using q=118, with or without considering cell turnover reduction. (C) The predicted data were plotted under log(probability) vs Log(age). Data points from age group 25 (20-25) to group 75 (70-75) (red dots) exhibit a linear trend with a slope of 5.8. (D) The real data and predicted data of cumulative cancer incidence were compared throughout the entire lifespan, with or without considering cell turnover reduction.

Determining the generation of cells at different ages presents a challenge, as cells from various tissues may have different developmental histories. Additionally, differentiated and stem cells may have distinct division cycles. We provide an average estimate of cell generation in different age brackets to help build the model and to show that cancer incidence follows our mathematical hypothesis. From the fertilized egg to the newborn infant, cells proliferate exponentially, and the newborn has a total of two trillion cells [25], meaning it has undergone 41 generations of divisions (Supplementary Table 2). The first five years of life require at least another four generations, so we set λPa = 45 for this age group. For the 5-10 and 10-15 age groups, we add one generation per stage. Above this age, we add 0.5 generation per stage until the value reaches the Hayflick limit of 55 [26].
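The generation estimates in this paragraph can be reproduced with a short calculation. This is only a sketch of the stated schedule; the variable names are illustrative, and labelling each age group by its upper bound is an assumption.

```python
import math

# Two trillion cells at birth implies about log2(2e12) doublings from the zygote.
newborn_cells = 2e12
doublings_at_birth = math.log2(newborn_cells)   # a little under 41

# Schedule from the text: 45 for ages 0-5, then +1 per group up to age 15,
# then +0.5 per five-year group until the Hayflick limit of 55 is reached.
HAYFLICK_LIMIT = 55.0
generations = [45.0, 46.0, 47.0]
while generations[-1] < HAYFLICK_LIMIT:
    generations.append(generations[-1] + 0.5)

age_upper_bounds = list(range(5, 5 * len(generations) + 1, 5))

print(f"doublings at birth: {doublings_at_birth:.2f}")
for age, gen in zip(age_upper_bounds, generations):
    print(f"up to age {age}: generation {gen}")
```

The schedule yields 19 five-year groups and reaches generation 55 in the group ending at age 95, which matches the Generation column of Table 1.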

By setting n, pc, pa, and using formulas (4) and (5), we can model the five-year cancer incidence (Figure 3A and Table 1). We used q=118 for further analysis.

Final adaptation of the model to account for reduced cell turnover

The predicted incidence of cancer exceeded the observed data beyond the age of 35 (Figure 3B). This occurred because formula (5) cannot use the same “n” at every age: as people age, cell division and turnover rates decrease [5]. As no measured data on cell turnover in aging people are available, we determined the rate of turnover decrease by assuming the validity of our model. We found a 25% decrease per five years in the 35-75 age group, a 35% decrease per five years in the 75-85 age group, a 40% decrease per five years in the 85-90 age group, and a 50% decrease per five years in the group aged over 90 years (Table 1: n turnover/year). Because the reduction was back-calculated from the observed data, the model necessarily fits those data closely (Figure 3B). The result nevertheless shows that it is feasible to use a general theory-based model to match cancer incidence, and the model reproduces the decreased cancer incidence in the very aged group [5].
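The stated reduction percentages can be turned back into the “n” column of Table 1 with the following sketch. The function name `turnover_by_age_group` and the use of each group's upper age bound as its key are assumptions made for illustration.

```python
BASE_TURNOVER = 4.2e13  # yearly turnover of nucleated cells up to age 35


def turnover_by_age_group():
    """Yearly cell turnover per five-year group, keyed by the group's upper age."""
    schedule = {}
    n = BASE_TURNOVER
    for upper in range(5, 100, 5):
        if upper <= 35:
            factor = 1.00   # no reduction before age 35
        elif upper <= 75:
            factor = 0.75   # 25% decrease per five years
        elif upper <= 85:
            factor = 0.65   # 35% decrease per five years
        elif upper <= 90:
            factor = 0.60   # 40% decrease per five years
        else:
            factor = 0.50   # 50% decrease per five years
        n *= factor
        schedule[upper] = n
    return schedule


turnover = turnover_by_age_group()
for upper, n in turnover.items():
    print(f"up to age {upper}: n = {n:.3g}")
```

Compounding the reductions in this way gives roughly 4.2E+12 at age 75 and 5.33E+11 at age 95, a fall of almost two orders of magnitude from the young-adult value.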

Although we reduce cell turnover to fit the overall cancer incidence, it is important to acknowledge that different tissues may exhibit different “np” values due to differences in cell turnover rates or developmental asymmetries in cell lineage trees [27]. Several studies have reported that cancer rates rise with approximately the sixth power of age “t” [3, 7]. Fisher and Hollomon’s pioneering study of stomach cancer found that ΔLog(p)/ΔLog(age) has a slope of 5.7 between the ages of 20 and 75 [2]. It is worth noting that the “np” model, without considering cell turnover reduction, also yielded a straight line with a slope of 5.8 from Group 25 (20-25) to Group 75 (70-75), which closely matches Fisher’s case (Figure 3C and Supplementary Table 2). This implies that stomach tissue may not experience an apparent reduction in cell turnover during this age period.
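The log-log slope can be estimated from the tabulated “p” values by holding “n” at its young-adult value. The sketch below is one plausible way to do this, not necessarily the procedure used for Figure 3C: it assumes yearly risk, the upper bound of each age group as the age, and an ordinary least-squares fit.

```python
import math

# Upper age bounds and p = p(c) + p(a) for Group 25 through Group 75 (Table 1).
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
p_values = [8.86e-18, 1.57e-17, 2.93e-17, 5.62e-17, 1.09e-16, 2.10e-16,
            4.03e-16, 7.68e-16, 1.45e-15, 2.70e-15, 4.99e-15]
N_CONSTANT = 4.2e13  # turnover held constant, i.e. no age-related reduction


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


log_age = [math.log10(a) for a in ages]
# Yearly risk 1 - exp(-n * p), computed with expm1 for accuracy at small values.
log_risk = [math.log10(-math.expm1(-N_CONSTANT * p)) for p in p_values]

slope = least_squares_slope(log_age, log_risk)
print(f"log-log slope: {slope:.2f}")
```

Under these assumptions the fitted slope lands close to the reported value of 5.8.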

A theory of aging based on the cancer model

If we convert the cancer incidence shown in Figure 3B into cumulative incidence, we get Figure 3D. From this figure, we can see that reduced cell turnover offers an advantage in terms of survival. The model indicates that without cell turnover reduction, humans would reach a 50% cancerization rate at age 66, but with cell turnover reduction, the 50% cancerization rate is delayed by two decades to age 87-89 (Figure 3D). This hints at the ultimate cause of aging, which is based on the unavoidable increase of cancer risk.
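The conversion to cumulative incidence can be illustrated with the model's five-year risks from Table 1 (the case with turnover reduction). The sketch assumes that cumulative incidence is one minus the probability of staying cancer-free through every successive five-year group; the names are illustrative.

```python
# Model five-year risks in percent, Table 1, age groups ending at 5, 10, ..., 95.
model_five_year_percent = [
    0.0525, 0.0598, 0.0957, 0.1155, 0.1860, 0.3286, 0.6133, 0.8814, 1.2765,
    1.8444, 2.6452, 3.7548, 5.2666, 7.2915, 9.9546, 11.7123, 13.6263,
    14.5654, 13.0280,
]
age_upper_bounds = list(range(5, 100, 5))


def cumulative_incidence(five_year_percent):
    """Cumulative probability of cancer by the end of each five-year group."""
    cancer_free = 1.0
    cumulative = []
    for pct in five_year_percent:
        cancer_free *= 1.0 - pct / 100.0
        cumulative.append(1.0 - cancer_free)
    return cumulative


cumulative = cumulative_incidence(model_five_year_percent)
# First age group by whose end at least half the population has had cancer.
first_age_over_half = next(
    age for age, c in zip(age_upper_bounds, cumulative) if c >= 0.5
)
print(f"50% cumulative incidence is crossed in the group ending at age {first_age_over_half}")
```

With these inputs the 50% level is crossed between ages 85 and 90, consistent with the 87-89 range quoted for the model with turnover reduction.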

Here, we propose an “np” theory of aging. Cells are highly ordered systems, and to maintain cell fitness (youth), this order needs to be maintained, which can be described as a problem of entropy balance [28]. A cell continually gains positive entropy, which must be offset for the cell to resist the second law of thermodynamics. Three levels of entropy are postulated here: (1) metabolic entropy; (2) structural entropy; and (3) information entropy. (1) For any living cell, metabolism is the function that maintains energy/matter intake and output. The entropy at this level is balanced biochemically. (2) With time, the microstructure of the cell or its organelles experiences “wear and tear”. The generation of new cells through division is the last resort for repairing this “wear and tear” and reducing structural entropy. (3) However, irreversible random changes that accumulate in the genetic material and cannot be repaired are passed to the progeny cells, leading to an increase in information entropy. The accumulated information entropy ultimately succumbs to the second law of thermodynamics. The increase in information entropy finally destabilizes the regulation of the cell and leads to unregulated proliferation, resulting in cancer [29, 30]. From another perspective, we can categorize cellular information into two arms: pro-proliferation and pro-regulation. Genetic mutations randomly impact either arm, but only disruption of the pro-regulation arm will be selected for. With the increase of information entropy, highly regulated eukaryotic cells return to a more primitive, prokaryote-like status [29]. This theory of information entropy predicts that any multicellular system will eventually develop cancer. As a result, the total number of cells that can be usefully generated from a single zygote is finite. To minimize the risk of cancer, cell turnover is reduced or stopped at the later stage of a species’ lifespan. The negative entropy introduced into the cells via division then cannot balance the positive entropy produced by the system, leading to increased disorder in cellular structure and metabolism. When this happens, the entropy of the whole system increases, the fitness of the organism decreases, and aging occurs.

This theory of aging predicts that the total number of cells a given individual can use is a finite number “N”, which is restricted by “p”. The predetermined number “N” can be plotted as an enclosed area on the “n” and “t” graph (Figure 4A). For the same reason, the quality of reproductive cells is restricted by the same law [31]. Hence, all species have a limited period of reproduction. Species develop different ways to use this cell resource strategically, which forms the basis of an organism’s lifespan and aging process. We list three models of survival strategies for species with three typical lifespans.


Figure 4. “np” theory of aging among different species. (A) Theoretical “nt” plot of model I, II, III species; (B) Post-reproduction life vs. expected lifespan of 51 mammal species; (C) The percentage of post-reproduction life relative to whole life: humans against the other mammals. A t-test was used to calculate statistical significance.

Model I: Species with short lifespan and short post-fecundity life. Low fitness is not acceptable for these species. Model I species have a very short half-life of survival in the natural environment, so there is not much evolutionary pressure for longevity. Their natural lifespan is compatible with their survival rate, and their “nt” curve encloses a small area on the plot. The model species are rodents.

Model II: Species with medium to long lifespan and short post-fecundity life. Low fitness is not acceptable. If a species adopts a strategy in which longevity is favored, it is allowed more “N”, which enlarges the enclosed area on the “nt” plot (Figure 4A). This process can continue under evolutionary pressure until the advantage of longevity is canceled out by the cancer risk. These species are stronger and have a higher chance of survival for a longer period, so evolution gives them more predetermined cells in their lifespan. However, lifespan is still restricted by the risk of cancer. Eventually, the organism shuts down cell proliferation quickly and no longer competes for survival. The model species are large carnivores.

For Model I and Model II species, the organism undergoes aging after the reproductive period, leading to a rapid decline in fitness, which typically results in death in the wild. Their lifespan matches the disposable soma theory [20].

Model III: Species with long lifespan and a long post-fecundity life. Low fitness is acceptable. Few species are strongly favored by longevity; however, a longevity strategy may be evolutionarily favored by the “grandma effect” [32–34], where longevity may provide community benefit. We hypothesize that “N” reaches an evolutionary limit, but Model III species develop another strategy for using the available “N”: reducing cell turnover at the cost of lower fitness. This type of species has an extended senescence period compared with other species. All of them are social and intelligent species, where survival with low independent fitness is possible in the context of a community. This also offers an explanation for the brain weight theory, which found that lifespan was positively related to species’ brain weight [35].

To support this theory, we re-explored the data from Samuel Ellis and Darren P. Croft about the reproductive lifespan and post-reproduction lifespan of 51 mammals [36]. The post-reproduction lifespan vs. total expected lifespan was plotted (Figure 4B). If we divide the species into three groups based on their expected lifespan on the x-axis and two groups based on post-reproduction life on the y-axis, 49 out of 51 species fall into three groups (Supplementary Table 1). These three groups represent aging strategy models I, II, and III, respectively. We note that humans have the highest post-reproductive lifespan and the highest percentage of post-reproductive time (Figure 4C), suggesting that humans have a unique position in evolution and that longevity is highly favored in this species.


This study describes a model of cancer incidence that gives rise to a wider theory of aging. It’s important to note that “p” should not be simply interpreted as the rate of DNA mutation. Instead, it represents the overall likelihood of a cell to escape regulation or suppression and develop into a cancerous colony. The development of cancer is influenced by complex factors, including genetic predisposition, accumulated mutations, self-protective mechanisms like the immune system, and environmental influences. Although growing evidence supports random mutation as the major contributor [37], these factors eventually converge at the genetic level, which is represented as “p” in the proposed model. The aim of this mathematical model is to demonstrate that there is a unifying law behind these diverse factors that drives the average pace of cancerization.

When considering “np” in different tissues, it is important to view an organism as a developing tree, where the branches may not all develop at the same pace. As mentioned earlier, this model provides an example that matches Fisher’s stomach cancer case [2]. This presents an opportunity to further adapt the model for tissue-specific cancers such as breast or prostate cancer. The model can explain the high incidence of some cancers in children. For example, during early development the nervous system branch undergoes more divisions than other tissues and accumulates a higher “p”, and this proliferation slows down after adulthood. The model can also help explain the increased risk of lymphoma observed in AIDS patients and the positive relationship between chronic inflammation and cancer [38, 39], as these diseases lead to increased cell turnover.

While many studies on cancer origin focus on stem cells, it is crucial to note that all transit-amplifying cells can potentially transform into cancerous cells by dedifferentiation [40]. Therefore, in this study, we establish a connection between the cellular turnover rate and the mutation rate. However, this may not be the whole picture. DNA, being a macromolecule, sustains lesions not only from replication errors but also from environmental factors and spontaneous decay [41]. Consequently, mutations can occur and accumulate over time in non-dividing or terminally differentiated cells [42]. To account for the possibility of cancer originating from non-dividing cells, such as neurons, background parameters could be incorporated into the formulas once reliable data become available.

The objective of this study lies in establishing a simplified model, and we acknowledge that a limitation of our approach is that it does not yet encompass the full complexity of tumorigenesis, as robust quantitative data for these parameters is not yet available. However, these formulas will serve as a platform for future development, and we can incorporate additional factors as coefficients into our original formulas.

Over the last decade, DeGregori et al. developed a theory of cancer development based on the fitness of cancer progenitor cells, which was in effect an attempt to apply the disposable soma theory to tumorigenesis [43–48]. According to this theory, genetic mutation is not the primary driver of tumor development. Instead, the mutated cells are suppressed by the host until the post-reproduction period, when the host relaxes tumor repression. The theory suggests that normal stem cells have a higher fitness in young tissue environments, which makes it difficult for mutant progenitor cells to compete with healthy stem cells. However, as the system ages, the microenvironment changes, and the healthy stem cells lose their competitive advantage. Mutated cells then gain higher fitness than normal stem cells, leading to tumorigenesis. One problem with the theory is the lack of evidence to support the micro-mechanism. There is evidence to support either a gain or a loss of fitness in mutant cells, and there could be many mutations with little phenotypic or fitness change [49]. The disagreement here is clear: the “np” theory postulates that cancer is the ultimate restrictor of lifespan and that aging is a strategy to avoid cancer, while DeGregori’s theory postulates that aging relaxes somatic regulation, thereby allowing cancer development.

There may be ways to resolve this argument. If we can identify the “molecular clock” that regulates a particular tissue, we could slow down the turnover of stem cells in that tissue [50, 51]. For example, if we slow down stem-cell turnover in mouse breast tissue, the “np” theory predicts that the tissue would display signs of aging but maintain genetic youthfulness, so that promoting aging in this way could delay the onset of breast cancer. However, if DeGregori’s theory is correct, this practice would have no impact or could even promote cancer, since aged tissue relaxes its control of tumorigenesis (Figure 5).


Figure 5. A proposed experiment that could resolve the disagreement between the “np” theory and DeGregori’s theory.

As a metaphor for the “Nuts Poisoned (np)” model, we can imagine a tree of life that produces “Nuts” (fresh cells with low entropy) that support life. A creature feeds on these nuts, which help maintain its fitness. However, some nuts may be poisoned, and over time, more nuts will get poisoned. To increase the chances of survival, the creature must reduce its nut consumption to minimize the risk of poisoning. However, this reduction in nut consumption causes the creature’s fitness to decline, and it begins to age. Eventually, the creature must abandon the tree of life because it has become too poisonous.

We propose that aging is a manifestation of entropy increase. The accumulation of system entropy can be observed as aging [52]. A study of bacterial aging has shown that cells can balance their entropy by proliferating [53]. However, the mechanism of how proliferation can restore negative entropy is not fully understood. Some studies have suggested that division can reduce entropy by altering the cells’ surface-to-volume ratio or through compartmentalization [54, 55]. Our very existence from the first cell on earth demonstrates that cells can renew themselves indefinitely. Information entropy measures the quality of genetic material, which cannot be perfectly maintained forever. Therefore, the ultimate limitation on life is information entropy. The only way to overcome this limitation is through single colony selection, and the process of reproduction is just such a form of single colony selection. Natural elimination of imperfect seeds maintains the stability of information entropy from generation to generation.

Many scientists believe that biological systems have the inherent ability to repair damage and replace defective cells, which suggests that they are not necessarily destined to die [12]. However, the accumulation of genetic mutations is an inevitable process that affects every living organism, leading to mortality. Although stem cell therapies hold promise, they have also been associated with the side effects of tumorigenesis, which can be explained by our theory [56, 57]. Our theory also offers an explanation for Peto’s Paradox, which observes that cancer incidence is not significantly different between small, short-lived animals and large, long-lived animals [58]. The “np” theory states that all species have evolved to adapt their lifespan to their available resources and so balance cellular fitness with the risk of tumorigenesis: hence their cancer incidence should be similar.

Finally, we have further advanced our theory by introducing the concept of the impossible trilemma (Figure 6), which states that it is impossible to keep all three of the following system components constant at the same time: (1) structure, (2) information and (3) metabolism. These three components support each other. However, at least one of them must be compromised when the other two are to be sustained. This offers insight into why metabolic interventions such as calorie restriction [59], antioxidant supplementation [60], and rapamycin or sirtuin treatment have demonstrated anti-aging effects in animal models, and why insulin-IGF signaling and the mTORC pathway have been identified as longevity signatures [61]. Long-lived, cold-blooded animals with slower metabolisms, such as turtles and the Greenland shark, further illustrate this trilemma [62, 63].

Figure 6. The impossible trilemma in an organism. Three phenomena support each other. “Metabolism” provides the material and energy to sustain “structure” and support “information” replication. “Information” guides and directs the “metabolism” and the “structure” of the system. The “structure” provides the framework for the existence of “information” and “metabolism”. Compromising at least one of these aspects becomes inevitable when the other two need to be sustained. For “metabolism” and “structure” to be sustained, the entropy of “information” ultimately increases as a result. For “metabolism” and “information” to be sustained, the system “structure” has to be disrupted. During the process of reproduction, germ cells abandon the soma, much like an escape pod separating from the mothership. For “structure” and “information” to be sustained, “metabolism” must be compromised.

Modern anti-aging practices have yielded positive observations in animal models, but we offer a pessimistic speculation: as a Model III species, humans have likely approached the upper limit of lifespan, implying that these practices will not extend life much beyond the current limit [64, 65]. Despite the challenges, there is still hope. If the “np” theory is correct, it could provide new insights into cancer prevention and human longevity. According to the formula, the strategy would be to reduce “p” and “n”. To prevent specific cancers, one approach could be to slow down the stem cell clock in the tissue (low “n”). Alternatively, if low fitness is unacceptable, specific tissues could be replaced with fresh stem cells. Achieving this would require techniques for screening stem cell colonies ex vivo to ensure that they have a perfect genome (low “p”). Anti-aging technologies could be developed on the same principle. However, ethical issues must be carefully considered.

In conclusion, we formulate the first general model for cancer incidence across all lifespans based on the Poisson distribution. Our model provides a simple but compelling explanation for the observation that aging is fundamentally entwined with the inherent risk of cancer. We name this new theory of aging the “Nuts Poisoned” theory, which aims to address gaps in existing aging theories, with implications for new avenues of cancer prevention and anti-aging strategies. Currently, this theory is applied only to mammals, but it has the potential to be extended to other vertebrates, and we present this model as a foundational framework that can be refined and further developed in the future.

Materials and Methods

Images and graphs

Figures 1A and 2 were created with BioRender. Figure 1B was graphed using the Desmos Graphing Calculator.


The Keisan online calculator was used to calculate the cumulative value of the Poisson distribution p(a) for Table 1 and Supplementary Table 2. The coefficient of determination was calculated as $R^2 = 1 - \mathrm{RSS}/\mathrm{TSS}$, where RSS is the sum of squares of the residuals and TSS is the sum of squares of the theoretical incidence: $\mathrm{RSS} = \sum_{i=1}^{n} (P_{\mathrm{observed},i} - P_{\mathrm{model},i})^2$; $\mathrm{TSS} = \sum_{i=1}^{n} (P_{\mathrm{model},i})^2$.
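The coefficient of determination described above can be sketched in a few lines of Python. This is only an illustration of the stated definition, not the authors' own script: the function name `r_squared` and the sample numbers are hypothetical, and the inputs would in practice be the observed and modeled incidences from the tables. Note that TSS here follows the definition in the text (sum of squared model values), which differs from the more common mean-centered total sum of squares.

```python
def r_squared(observed, model):
    """R^2 = 1 - RSS/TSS, with TSS defined as in the text above.

    RSS: sum of squared residuals (observed - model).
    TSS: sum of squared model (theoretical) values, not mean-centered.
    """
    if len(observed) != len(model):
        raise ValueError("observed and model must have the same length")
    rss = sum((o - m) ** 2 for o, m in zip(observed, model))
    tss = sum(m ** 2 for m in model)
    return 1 - rss / tss


# Made-up numbers for illustration only (not data from the paper).
example_observed = [1.0, 2.0]
example_model = [2.0, 2.0]
print(r_squared(example_observed, example_model))  # RSS = 1, TSS = 8
```

A perfect fit gives a value of 1, and larger residuals relative to the model values reduce it.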

Calculation method for Table 1: In the Poisson distribution calculator, the percentile was set to x = 118 and the mean to λ = λPa (data from Table 1). The p(a) of a specific generation was calculated from the difference between neighboring “upper cumulative Q” values. For the “0-5” group, “percentile x” is set to 118 (q), the constant threshold, and generation 45 is used as the “mean λ”, which gives an upper cumulative Q(45) of 1.18E-19. For the next group (5-10), a mean λ of 46 gives Q(46) = 5.86E-19. The cancerization probability in each age group is $(Q_{n+1} - Q_n)/(1 - Q_n)$. Since Q is very small, $(Q_{n+1} - Q_n)/(1 - Q_n) \approx Q_{n+1} - Q_n$ = 5.86E-19 - 1.18E-19 = 4.69E-19, which is the p(a) for the “5-10” group; p(a) for every other group is calculated in the same way. Substituting “p(a)”, “p(c)” and “n” into formula (5) gives P(0). P(cancer)/year = 1 - P(0). P(cancer)/5 years (%) = $[1 - (1 - P(\mathrm{cancer})/\mathrm{year})^5] \times 100$.
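The steps above can be reproduced without the online calculator. The following Python sketch uses only the standard library and assumes that “upper cumulative Q” means P(X ≥ x) for a Poisson variable with mean λ; the helper names (`poisson_upper_tail`, `five_year_percent`) are hypothetical. Formula (5) is not reproduced here, so the sketch stops at p(a) and shows the yearly to 5-year conversion separately.

```python
import math


def poisson_upper_tail(x, lam, extra_terms=400):
    """Upper cumulative probability P(X >= x) for X ~ Poisson(lam).

    Each term is computed in log space to avoid overflow in k!;
    terms far beyond x underflow harmlessly to 0.0.
    """
    total = 0.0
    for k in range(x, x + extra_terms):
        log_pmf = -lam + k * math.log(lam) - math.lgamma(k + 1)
        total += math.exp(log_pmf)
    return total


def five_year_percent(p_per_year):
    """Convert a yearly probability into a 5-year probability in percent."""
    return (1 - (1 - p_per_year) ** 5) * 100


THRESHOLD = 118  # "percentile x" (q), the constant threshold from the text

q45 = poisson_upper_tail(THRESHOLD, 45)  # "0-5" group, generation 45
q46 = poisson_upper_tail(THRESHOLD, 46)  # "5-10" group, generation 46

p_a_exact = (q46 - q45) / (1 - q45)  # cancerization probability
p_a_approx = q46 - q45               # valid because Q is very small

print(f"Q(45) = {q45:.2E}")
print(f"Q(46) = {q46:.2E}")
print(f"p(a) for the 5-10 group = {p_a_approx:.2E}")
```

Working from unrounded Q values is why the difference is quoted as 4.69E-19 rather than the 4.68E-19 obtained by subtracting the rounded figures.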

Author Contributions

Wenbo Yu contributed to conceptualization, methodology, investigation, visualization and writing. Tessa Gargett contributed to data validation and manuscript editing. Zhenglong Du contributed to the validation of mathematical methods, manuscript review, and a portion of the discussion.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


No funding was received for this paper, as it is a theory paper.