Coevolution of cooperative lifestyles and reduced cancer prevalence in mammals
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.xgxd254vh
Resource description:
Why cancer is so prevalent among mammals, despite the fact that some species evolved resistance mechanisms, remains an open question. We hypothesized that cancer prevalence and mortality risk might have been fine-tuned by evolution. Using public databases, we show that species with cooperative habits have lower cancer prevalence and mortality risk. By developing a mathematical model, we provide a mechanistic explanation: an oncogenic variant that elicits higher cancer mortality in older and less reproductive individuals is detrimental to cooperative mammalian societies but can lead to a counterintuitive overcompensation in population size and fitness within competitive contexts. The phenomenon of a population increasing in response to a decrease in its per capita survival rate is called the hydra effect, a process never explored in the field of cancer before. Therefore, cancer can be considered as a selected mechanism of biological obsolescence in competitive species.
Methods
See the following DOI: 10.1126/sciadv.adw0685 (available on November 12, 2025)
CMR, neoplasia and malignancy prevalence in mammalian species
First dataset: Cancer Mortality Risk (CMR) was calculated for each species as the proportion of cancer-related deaths out of the total number of records, based on post-mortem pathological records (n=11,840). This information was sourced from Species360 and the Zoological Information Management System (ZIMS). The dataset initially included 191 species, but D. byrnei was removed because its extremely high CMR was considered an outlier. These CMR data were gathered from mammals in zoos worldwide, providing high-resolution cause-of-death data. CMR was estimated from neoplastic samples that substantially contributed to the animal's death, as confirmed by necropsy. The CMR estimated for every species in this dataset is based on more than 20 necropsies per species (mean = 62).
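As a minimal sketch of the CMR definition above, the proportion can be computed per species from necropsy tallies. The record layout, the example counts and the function name are hypothetical; only the ratio itself and the 20-necropsy threshold come from the text.

```python
# Hypothetical necropsy tallies: species -> (cancer-related deaths, total records).
# The numbers are invented for illustration and are not taken from the dataset.
records = {
    "Species A": (6, 60),
    "Species B": (0, 25),
    "Species C": (3, 12),  # too few necropsies, excluded below
}

MIN_NECROPSIES = 20  # the text states that each species has more than 20 necropsies


def cancer_mortality_risk(cancer_deaths, total_records):
    """CMR = cancer-related deaths / total post-mortem records."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return cancer_deaths / total_records


# Keep only species that pass the necropsy threshold.
cmr = {
    species: cancer_mortality_risk(deaths, total)
    for species, (deaths, total) in records.items()
    if total > MIN_NECROPSIES
}
print(cmr)
```

Species with zero cancer-related deaths are kept with CMR = 0, in line with the choice described in the statistical analysis to include both zero and non-zero CMR values.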
Second dataset: Prevalence of neoplasia was estimated as the prevalence of any neoplasm in mammalian species from San Diego's zoos. The dataset initially included 37 species, but L. africana was removed because its value was inconsistent with other publications reporting lower cancer rates. The prevalence of neoplasia estimated for the species included in this dataset is based on an average of 23 necropsies per species. Vulpes zerda, Puma concolor, Canis mesomelas, Lama glama, Lycaon pictus, Tarsius syrichta, Macropus rufus and Equus asinus are the only species with fewer than 10 necropsies analyzed.
Third dataset: We used a recently curated and standardized dataset of malignancy prevalence across mammalian species that is based on more than 20 necropsies per species. This resource includes additional species not considered in the other datasets. In this analysis, a list of archetypal species with very high or very low malignancy prevalence was constructed: all species were ranked according to their malignancy prevalence, and three subsets were defined using different cut-offs: Rank10, Rank15, and Rank20, each including the 10, 15, or 20 species with the highest and lowest malignancy prevalence, respectively. These ranked groups consisting of 20, 30 and 40 species, respectively, were then used for downstream comparative analyses. The total dataset comprised 102 mammalian species.
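The construction of the Rank10, Rank15 and Rank20 subsets can be sketched as follows. The species names and prevalence values are synthetic placeholders, and `rank_subsets` is a hypothetical helper, not part of the published analysis.

```python
def rank_subsets(prevalence, cutoffs=(10, 15, 20)):
    """Return {k: (lowest_k, highest_k)} species lists for each cut-off k."""
    ordered = sorted(prevalence, key=prevalence.get)  # ascending prevalence
    subsets = {}
    for k in cutoffs:
        if 2 * k > len(ordered):
            raise ValueError(f"need at least {2 * k} species for cut-off {k}")
        subsets[k] = (ordered[:k], ordered[-k:])
    return subsets


# Synthetic stand-in for the 102-species dataset (values are not real data).
prevalence = {f"sp{i:03d}": i / 101 for i in range(102)}

subsets = rank_subsets(prevalence)
for k, (low, high) in subsets.items():
    print(f"Rank{k}: {len(low) + len(high)} species")
```

Each cut-off k yields 2k species in total, which matches the group sizes of 20, 30 and 40 stated above.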
Morpho-physiological, life history and lifestyle traits
Data on Body Mass (kg) and Life Expectancy (days) used for the first dataset were extracted from Vincze et al. (n=190 species). Data on Adult Mass (kg) and Maximum Lifespan (days) used for the second (n=32 and n=36 species, respectively) and third datasets (n=94 species, in both cases) were obtained from the COMBINE database. Data on Metabolic Rate (n=52 for the first dataset, n=31 for the second dataset and n=52 for the third dataset) were obtained from the AnAge database and expressed in Watts (W). For the third dataset, the variables Adult Mass, Metabolic Rate and Maximum Lifespan were each dichotomized, with the threshold chosen so that the two resulting groups contained a comparable number of species.
We defined life history traits (Litter Size, Litters, Gestation Length, Life Expectancy and Maximum Lifespan) as those that depend on the history of the individual but are not clearly behavioral like lifestyle traits (Group Living, Breeding System). We chose litter size, gestation time and life expectancy as three classic life history traits. In particular, life expectancy is a well-determined variable in many species, which helps to have a larger sample size.
Data on Litter Size (mean number of descendants per female, n=190 species for the first dataset, n=32 for the second dataset and n=94 for the third dataset) and Gestation Length (days, n=190 for the first dataset and n=32 for the second dataset) were obtained from the COMBINE database. The variable "Litters" was used to classify species as either monotocous or polytocous, using a litter size of 1.5 as threshold. Transforming Litter Size into a dichotomous variable allowed us to statistically test its interaction with body mass, as we did with other dichotomous variables such as Group Living. Total Litters was calculated as the number of litters per year multiplied by litter size and by the difference between maximum longevity and female sexual maturity for each species. All the data for these calculations were obtained from the COMBINE database. Group Living (n=144 species for the first dataset, n=24 for the second dataset and n=77 for the third dataset) was determined by integrating data from two sources: Pérez-Barbería et al. and Lukas & Clutton-Brock. The variable is dichotomous, indicating whether a species engages in group living based on regular associations among individuals. A species was classified as Group Living if it showed sociality or was listed as group living by either source. Conversely, it was classified as not having Group Living if it exhibited no sociality or was listed as solitary or socially monogamous by either source. When data from both sources were available, a species was included only if both sources agreed; otherwise it was either excluded or classified based on the available literature. Data on Breeding System (singular breeders or plural breeders, n=147 species for the first dataset, n=28 for the second dataset and n=79 for the third dataset) were gathered from Lukas & Clutton-Brock. 
The category of singular or plural breeders was assigned if the females occupy a separate or common territory or range during the breeding season, respectively. Data on the dichotomous variable Paternal Care (n=157 species for first dataset and n=29 for second dataset) was also obtained from Lukas & Clutton-Brock.
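The derived trait variables described above can be sketched in a few lines. Several details are assumptions made only for this illustration: the function names are hypothetical, a litter size of exactly 1.5 is treated as monotocous (the text does not say how ties were handled), longevity and maturity are taken in years so that they combine with litters per year, and disagreements between sources are returned as `None` rather than resolved from the literature.

```python
def classify_litters(litter_size, threshold=1.5):
    """Dichotomize Litter Size (assumption: values at the threshold count as monotocous)."""
    return "polytocous" if litter_size > threshold else "monotocous"


def total_litters(litters_per_year, litter_size, max_longevity_years, maturity_years):
    """Litters per year x litter size x reproductive span (assumed to be in years)."""
    return litters_per_year * litter_size * (max_longevity_years - maturity_years)


def combine_group_living(source_a, source_b):
    """Merge two sources coded True (group living), False (not) or None (no data).

    Returns None when both sources are missing or when they disagree.
    """
    if source_a is None:
        return source_b
    if source_b is None:
        return source_a
    return source_a if source_a == source_b else None


print(classify_litters(1.0), classify_litters(2.3))
print(total_litters(2, 3, 10, 1))
print(combine_group_living(True, None), combine_group_living(True, False))
```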
Data on Animal Diet (consumption of animals, including vertebrates and invertebrates) was sourced from Vincze et al., who compiled the information from a global mammalian diet database. This dataset categorizes dietary components into four hierarchical levels: never consumed, occasionally consumed, secondary food item, and primary food item. For our analysis, we focused solely on whether animal matter was present in the diet, without differentiating between specific types. Since the intermediate categories (occasional and secondary consumption) included relatively few species, Vincze et al. consolidated the dietary classifications into two broader levels: rarely/never consumed and regularly consumed (i.e., as a primary or secondary food source). Diet information was included only for the first dataset, due to the strength of the analysis and the sample size available.
Statistical analysis
Correlations of CMR and neoplasia with the different traits were assessed with phylogeny-corrected generalized linear mixed models (phylGLMMs) using the phyr package in the R Statistical and Programming Environment, version 4.2.3. Previous investigations with the species included in these datasets showed that there is a phylogenetic signal for CMR and neoplasia among mammal species. To control for phylogenetic relatedness among species, we fitted the phylGLMMs using the original robust phylogeny by Vincze et al. The phylGLMMs used a binomial error distribution and a logit link function, with a random variable added at the level of observations to avoid overdispersion problems. This random variable, called “Species”, was constructed with the identity of each species analyzed. Not all analyses using CMR data were performed with the full set of species, since information on some of the traits was not available for all species. All models were evaluated for overdispersion and zero-inflation using the DHARMa package. All model tests showed p-values > 0.05, which indicates that no fit problems were detected; therefore, unlike previous investigations, we chose to perform the analyses using species with both zero and non-zero CMR.
Models used:
(1) An additive phylGLMM was performed with CMR as response variable and log transformed continuous variables of covariate traits Body Mass, Litter Size, Life Expectancy and Gestation Length. Log transformed variables were used as fixed effects, and Species as a random variable at the observation level. The physiological trait Metabolic Rate was also log transformed and analyzed in a separate model to avoid collinearity problems with Log Body Mass.
(2) A simple model for dichotomous variable Litters was performed to test for CMR differences in monotocous or polytocous species. This dichotomous variable was tested in a model with continuous variables Log Life Expectancy and Log Body Mass to evaluate interaction.
(3) For lifestyle dichotomous variables Group Living, Breeding System and Paternal Care we performed separate analyses to avoid collinearity problems, in all cases with CMR as the response variable, and Species as a random variable at the level of observations. For Group Living and Breeding System variables we also performed models with the continuous variables (Log Body Mass, Log Litter Size, and Log Life Expectancy) and tested the interaction with Log Body Mass.
(4) Animal Diet as a dichotomous variable was analyzed using CMR as the response variable, and Species as a random variable at the level of observations. The association between Animal Diet and CMR was also assessed in relation to the other life history and lifestyle traits using four different models that include species of Animal Diet and Group Living (gregarious/solitary) and Breeding System (singular/plural).
(5) To perform an order level analysis, mean CMR for all the species belonging to each order with at least 15 species (i.e. Artiodactyla, Carnivora, Primates, Rodentia) was calculated. We also built indexes for each trait of interest: (a) Litters Index: ratio between monotocous and polytocous species within each order, (b) Group Living Index: ratio between the species with and without Group Living within each order, and (c) Breeding System Index: ratio between plural and singular breeding species within each order. The analysis was performed with GLMs employing a binomial error distribution and a logit link function, using the glmmTMB package (table S4). The total set of p-values derived from analysis using CMR data was corrected for multiple testing using FDR correction.
(6) Neoplasia data on 36 species from the second dataset were analyzed using the same simple phylGLMMs with one variable per model as before, but with a different phylogeny of the 36 mammal species constructed from the updated mammalian super-tree. The same data for the different morpho-physiological, life history and lifestyle traits as before were used, with the exception of Log Body Mass and Log Maximum Lifespan, where the analyses were performed with Adult Mass (kg) and Maximum Lifespan (days) from Boddy et al. The total set of p-values derived from analysis using this dataset was corrected for multiple testing using FDR correction. Statistical analyses for dichotomous variables were not performed on this dataset because statistical power is insufficient with such small samples.
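The FDR correction mentioned in models (5) and (6) cites the Benjamini and Hochberg procedure. The authors ran their statistics in R; the following is only a plain-Python sketch of the standard step-up adjustment, with a hypothetical function name and made-up p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for illustration only.
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

Each raw p-value is scaled by m/rank and then capped by the adjusted value of the next larger p-value, so the adjusted values never decrease with rank.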
The analyses of the archetypal species with the highest or lowest levels of malignancy prevalence from the third database were performed qualitatively. For each dichotomous variable, a group was judged to be more enriched in species with a high prevalence of malignancies if we observed differences greater than 50% in each and every one of the three ranks (cut-offs 10, 15, and 20) and only if these differences became larger as we narrowed the rank (which is expected to occur if there is a direct relationship between both variables).
Mathematical modeling and simulation
We developed a system of ordinary differential equations (ODEs) representing a consumer population of any mammal species depending on its resources for subsistence.
The population is stage-structured based on age: pre-reproductive juveniles (J), reproductive adults (A), and senior post-reproductive adults (S). This emphasizes the reproductive capacity of individuals depending on age. Resources (R) are kept unstructured, and their dynamics are governed by a production function p(R) and by consumer foraging f. The production function does not increase with the resource density R (in the simulations it is held constant and is therefore independent of R). This allows the system to reach a steady state of non-negative values. Furthermore, consumer foraging is organized in stage-specific functions, fJ, fA and fS (all of which depend on R), associated with the age of the consumer group. The resulting resource intake is translated into physiological processes with an efficiency given by non-negative and non-decreasing functions gA (fertility rate) and gJ (rate of juvenile sexual maturation to become reproductive adults). The senescence process, given by σ > 0, represents a fixed rate at which adults age into senior individuals. Likewise, all per-capita mortality rates (μJ, μA, μS) are positive constants. Changes in cancer mortality were modeled through the value of the μS parameter. Dependence on resources (R) for vital processes (e.g., transition rates between life stages) conveys non-social intraspecific competition between life stages. Effects of non-social competition are focused on changes in stage density distribution, as the lack of resources limits population growth by preventing juveniles from reaching the adult stage and adults from having further offspring. Such life history processes govern transition rates between life stages. In this model, α represents the strength of social intraspecific cooperation. Positive values of α are interpreted as supportive or caring interactions of older individuals towards juveniles. Increasing α values result in a decrease of juvenile mortality, while α = 0 means that these interactions are not taken into account (α cannot adopt negative values). 
Similarly, ω > 0 stands for the strength of social intraspecific competition between seniors and other individuals. Increasing ω values result in higher juvenile mortality (Eq. 1b). In our model, when α adopts positive values, we set ω to zero, and vice versa. Intraspecific competition may also be associated with resource density, in which case ρ > 0. Abundance of resources (high values of ρ) diminishes the effect of direct competition. The social parameters α and ω control the non-transitional processes of cooperative and competitive interaction between life stages of the consumer population, respectively. Alternative cooperative or competitive interactions have also been modeled, in which the parameter α/ω is the ability of senior individuals (S) to influence juveniles' (J) access to resources, juveniles' development into adults (A), or adults' reproductive output.
By mathematical analysis of the ODE system we found necessary and sufficient conditions for the model to display a hydra effect when α = ω = 0, that is, conditions ensuring an increase in the carrying capacity of the population (N = J + A + S, the population density at dynamical equilibrium) as a result of increasing the value of the parameter μS (interpreted as higher CMR). If the senior stage has the largest consumption rate at equilibrium, then, for this model, the specific increase in mortality among seniors leads to the hydra effect. Conversely, if the senior consumption rate is the smallest one, the intuitive outcome ensues: the population equilibrium decreases as the mortality of part of its individuals increases. If the per-capita consumption rate of senior adults is lower than that of reproductive adults but higher than that of juveniles (fA > fS > fJ) due to age-related size differences, then the hydra effect is conditioned upon the life history trait parameter values (at equilibrium).
Expressed in this way, we can see that a larger fertility rate value at the system equilibrium gA(fA (R)) is positively associated with the existence of a hydra effect (all else being equal).
For the purposes of simulations, we employed a linearized version of the model to numerically obtain time courses by integration with Python (using SciPy, NumPy, pandas and Matplotlib). To this end, we chose a semi-chemostat model for the resource dynamics, where the resource production function remains constant, p(R) = π. We also considered linear consumption relations, weighted by different constants associated with a characteristic size and age of the consuming stage, that is fJ(R) = κJ R, fA(R) = κA R, fS(R) = κS R. The resulting resource intake is converted linearly into the physiological processes of reproduction and maturation, with efficiencies given by the parameters βA (i.e., gA(x) = βA x) and γA (i.e., gJ(x) = γA x), respectively (γA refers to the maturation of juveniles originating from adults, to differentiate them from those originating from seniors, γS).
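The kind of simulation described above can be illustrated with a small, dependency-free sketch. The equations in the code are an assumption: this description does not state the ODE system, so the right-hand side below is one plausible reading of the linear functions p(R) = π, fX(R) = κX R, gA(x) = βA x and gJ(x) = γA x for the case α = ω = 0. The parameter values are invented, a hand-written Runge-Kutta step stands in for the SciPy integrator the authors used, and κS is deliberately set as the largest consumption constant to illustrate the sufficient condition for the hydra effect stated earlier.

```python
# Invented parameter values; keys mirror the symbols in the text.
BASE = dict(pi=1.0, kJ=0.5, kA=1.0, kS=2.0, bA=1.0, gA=1.0,
            sigma=0.1, muJ=0.1, muA=0.1)


def derivatives(state, p):
    """Assumed right-hand side for (R, J, A, S) with alpha = omega = 0."""
    R, J, A, S = state
    fJ, fA, fS = p["kJ"] * R, p["kA"] * R, p["kS"] * R  # per-capita intake
    births = p["bA"] * fA * A        # gA(fA) * A
    maturation = p["gA"] * fJ * J    # gJ(fJ) * J
    dR = p["pi"] - (fJ * J + fA * A + fS * S)
    dJ = births - maturation - p["muJ"] * J
    dA = maturation - (p["sigma"] + p["muA"]) * A
    dS = p["sigma"] * A - p["muS"] * S
    return (dR, dJ, dA, dS)


def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = derivatives(state, p)
    k2 = derivatives(shifted(k1, dt / 2), p)
    k3 = derivatives(shifted(k2, dt / 2), p)
    k4 = derivatives(shifted(k3, dt), p)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def equilibrium_density(mu_S, t_end=500.0, dt=0.05):
    """Integrate to (near) steady state and return N = J + A + S."""
    p = dict(BASE, muS=mu_S)
    state = (1.0, 1.0, 1.0, 1.0)  # arbitrary positive initial condition
    for _ in range(int(t_end / dt)):
        state = rk4_step(state, p, dt)
    return sum(state[1:])


n_low = equilibrium_density(mu_S=0.2)
n_high = equilibrium_density(mu_S=0.4)
print(f"N at muS=0.2: {n_low:.3f}; N at muS=0.4: {n_high:.3f}")
```

Under these assumed equations and parameters, raising senior mortality increases the total equilibrium density, which is the qualitative signature of the hydra effect. Setting κS below κJ and κA should reverse the trend.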
Extended versions of these models were also implemented, allowing for additional processes such as reproduction of senior adults, distinct survival rates for juveniles born to seniors, and the spread of a higher-CMR genetic variant in the population.
For the model with mixed populations, we started with a population in equilibrium before the introduction of the oncogenic variant (both subpopulations had the same senior mortality rates μS1 = μS2). We performed two types of tests in this initial state. A) We introduced the oncogenic variant by increasing μS2 in half of the population in equilibrium (a process of migration or subpopulation mixing), and we evolved the system to its equilibrium frequencies. B) We introduced the variant as a mutation (low initial frequency), transferring 5% of the juveniles from subpopulation 1 to subpopulation 2 (higher μS2) and monitored relative frequency time evolution as well.
The direct fitness of a genotype Gx was calculated as NGx(t)/NGx(0), where NGx(t) is the density over time of the subpopulation with genotype Gx, and NGx(0) is its density at time t = 0. The indirect fitness of the metapopulation was calculated as (NG1(t)+NG2(t))/(NG1(0)+NG2(0)). The relative frequency of each gene variant over time was calculated as NGx(t)/(NG1(t)+NG2(t)).
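The three ratios defined above translate directly into code. The function name and the example time series are hypothetical; the series mimic a variant introduced at 5% frequency and are not simulation output.

```python
def fitness_summary(n_g1, n_g2):
    """Compute fitness measures from two subpopulation density time series.

    n_g1, n_g2: equal-length sequences of densities NG1(t) and NG2(t).
    """
    direct_g1 = [n / n_g1[0] for n in n_g1]          # NG1(t) / NG1(0)
    direct_g2 = [n / n_g2[0] for n in n_g2]          # NG2(t) / NG2(0)
    total0 = n_g1[0] + n_g2[0]
    metapopulation = [(a + b) / total0 for a, b in zip(n_g1, n_g2)]
    freq_g2 = [b / (a + b) for a, b in zip(n_g1, n_g2)]  # NG2 / (NG1 + NG2)
    return direct_g1, direct_g2, metapopulation, freq_g2


# Invented densities for illustration only.
n_g1 = [9.5, 8.0, 6.0]
n_g2 = [0.5, 2.0, 6.0]
direct_g1, direct_g2, metapopulation, freq_g2 = fitness_summary(n_g1, n_g2)
print(direct_g2[-1], metapopulation[-1], freq_g2[0], freq_g2[-1])
```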
References
O. Vincze, F. Colchero, J.-F. Lemaître, D. A. Conde, S. Pavard, M. Bieuville, A. O. Urrutia, B. Ujvari, A. M. Boddy, C. C. Maley, Cancer risk across mammals. Nature 601, 263–267 (2022).
A. M. Boddy, L. M. Abegglen, A. P. Pessier, A. Aktipis, J. D. Schiffman, C. C. Maley, C. Witte, Lifetime cancer prevalence and life history traits in mammals. Evolution, medicine, and public health 2020, 187–195 (2020).
Z. T. Compton, W. Mellon, V. K. Harris, S. Rupp, D. Mallo, S. E. Kapsetaki, M. Wilmot, R. Kennington, K. Noble, C. Baciu, Cancer prevalence across vertebrates. Cancer discovery 15, 227–244 (2025).
D. Lukas, T. Clutton-Brock, Monotocy and the evolution of plural breeding in mammals. Behav Ecol 31, 943–949 (2020).
F. J. Pérez‐Barbería, S. Shultz, R. I. Dunbar, Evidence for coevolution of sociality and relative brain size in three orders of mammals. Evolution 61, 2811–2821 (2007).
D. Lukas, T. H. Clutton-Brock, The evolution of social monogamy in mammals. Science 341, 526–530 (2013).
W. D. Kissling, L. Dalby, C. Fløjgaard, J. Lenoir, B. Sandel, C. Sandom, K. Trøjelsgaard, J. Svenning, Establishing macroecological trait datasets: digitalization, extrapolation, and validation of diet preferences in terrestrial mammals worldwide. Ecology and Evolution 4, 2913–2930 (2014).
D. Li, R. Dinnage, L. A. Nell, M. R. Helmus, A. R. Ives, phyr: an R package for phylogenetic species‐distribution modelling in ecological communities. Methods in Ecology and Evolution 11, 1455–1463 (2020).
R Core Team, R: A language and environment for statistical computing (R Foundation for Statistical Computing, 2013).
A. F. Zuur, E. N. Ieno, N. J. Walker, A. A. Saveliev, G. M. Smith, Mixed Effects Models and Extensions in Ecology with R (Springer, 2009), vol. 574.
X. A. Harrison, Using observation-level random effects to model overdispersion in count data in ecology and evolution. PeerJ 2, e616 (2014).
F. Hartig, L. Lohse, DHARMa: Residual Diagnostics for Hierarchical (Multi-Level/Mixed) Regression Models. R package version 0.4.6 (2022).
M. E. Brooks, K. Kristensen, K. J. Van Benthem, A. Magnusson, C. W. Berg, A. Nielsen, H. J. Skaug, M. Machler, B. M. Bolker, glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. The R journal 9, 378–400 (2017).
Y. Benjamini, Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal statistical society: series B (Methodological) 57, 289–300 (1995).
O. R. Bininda-Emonds, M. Cardillo, K. E. Jones, R. D. MacPhee, R. M. Beck, R. Grenyer, S. A. Price, R. A. Vos, J. L. Gittleman, A. Purvis, The delayed rise of present-day mammals. Nature 446, 507–512 (2007).
A. M. de Roos, When individual life history matters: conditions for juvenile-adult stage structure effects on population dynamics. Theoretical Ecology 11, 397–416 (2018).
P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature methods 17, 261–272 (2020).
C. R. Harris, K. J. Millman, S. J. Van Der Walt, R. Gommers, P. Virtanen, D. Cournapeau, E. Wieser, J. Taylor, S. Berg, N. J. Smith, Array programming with NumPy. Nature 585, 357–362 (2020).
J. D. Hunter, Matplotlib: A 2D graphics environment. Computing in science & engineering 9, 90–95 (2007).
G. Van Rossum, F. L. Drake, Python/C Api Manual-Python 3 (CreateSpace, 2009).
W. McKinney, “Data structures for statistical computing in Python” (2010), vol. 445, pp. 51–56.
Created:
2025-10-30



