Convergent rates of protein evolution identify novel targets of sexual selection in primates

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.fn2z34v23
Resource description:
Sexual selection is the differential reproductive success of individuals, resulting from competition for mates, mate choice, or success in fertilization. In primates, this selective pressure often leads to the development of exaggerated traits which play a role in sexual competition and successful reproduction. In order to gain insight into the mechanisms driving the development of sexually selected traits, we used an unbiased genome-wide approach across 21 primate species to correlate individual rates of protein evolution to relative testes size and sexual dimorphism in body size, two anatomical hallmarks of sexual selection in mammals. Among species with presumed high levels of sperm competition, we detected strong conservation of testes-specific proteins responsible for spermatogenesis and ciliary form and function. In contrast, we identified accelerated evolution of female reproductive proteins expressed in the vagina, cervix, and fallopian tubes in these same species. Additionally, we found accelerated protein evolution in lymphoid tissue, indicating that adaptive immune functions may also be influenced by sexual selection. This study demonstrates the distinct complexity of sexual selection in primates, revealing contrasting patterns of protein evolution between male and female reproductive tissues.

Methods

Analyses were conducted using the Mammals Multiz Alignment & Conservation (27 primates) amino acid alignments from the UCSC genome browser. This dataset includes 27 primate species and 3 outgroups, containing 19,975 amino acid sequence alignments of known human genes aligned against the translated sequence from the genome assemblies of other species.
For each alignment, branch lengths were estimated using the aaml program in the Phylogenetic Analysis by Maximum Likelihood (PAML) package version 4.9 (Yang 2007, Mol Biol Evol 24:1586-91), using an empirical model of amino acid substitution rates with rate variability between sites modeled as a gamma distribution approximated with four discrete classes and an additional class for invariable sites (aaml model ‘Empirical + F’). For each primate species, relative testes size was calculated taking into account the allometric relationship between testes mass and body mass. It was calculated as the logarithm of the ratio of observed testes size to the expected testes size given by the function $Y = 0.034\,X^{0.68}$, where Y is the expected mass of both testes in grams and X is the observed body mass in grams (Kenagy and Trombulak 1986, J Mammal 67:1-22). Values of body mass for males and females of each species were taken from Smith and Jungers (1997, J Hum Evol 32:523-9), except for those of the drill because of their small sample size; instead, drill body mass was taken from Setchell (2017, Intl Enc Primatol). Body size dimorphism was calculated as the log-transformed ratio of male body weight over female body weight. Ancestral states were estimated through phylogenetic modeling based on extant species values using the phytools R package (Revell 2012, Methods Ecol Evol 3:217-23). To perform our primary analysis, we used the RERconverge pipeline (Kowalczyk et al. 2019, Bioinformatics 35:4815-17), which quantifies the correlation between the rate of change of a trait and the rate of protein evolution, allowing for the identification of proteins that are evolving in response to selection related to a particular phenotype along independent lineages. Using linear regression, we identified the degree of divergence for each protein from its expected rate of evolution to calculate the relative evolutionary rate (RER).
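The two trait definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the natural logarithm is an assumption, since the text says only "logarithm" and "log-transformed" without naming a base.

```python
import math


def expected_testes_mass(body_mass_g: float) -> float:
    """Expected combined testes mass in grams, Y = 0.034 * X**0.68
    (allometric function cited from Kenagy and Trombulak 1986)."""
    return 0.034 * body_mass_g ** 0.68


def relative_testes_size(observed_testes_g: float, body_mass_g: float) -> float:
    """Log ratio of observed to expected testes mass.
    ASSUMPTION: natural log; the source does not state the base."""
    return math.log(observed_testes_g / expected_testes_mass(body_mass_g))


def body_size_dimorphism(male_mass_g: float, female_mass_g: float) -> float:
    """Log-transformed ratio of male to female body mass.
    ASSUMPTION: natural log, as above."""
    return math.log(male_mass_g / female_mass_g)


if __name__ == "__main__":
    # Hypothetical values for illustration only, not taken from the dataset.
    print(relative_testes_size(observed_testes_g=40.0, body_mass_g=10_000.0))
    print(body_size_dimorphism(male_mass_g=12_000.0, female_mass_g=8_000.0))
```

A value of zero for relative testes size means the observed testes mass equals the allometric expectation; positive values indicate larger testes than expected for the body mass.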
Phenotype vectors and calculated RERs for all branches were correlated using Pearson linear correlations and were adjusted using the Benjamini-Hochberg procedure as well as 1000 phylogenetically restricted permutations, or permulations, as described in Saputra et al. (2021, Mol Biol Evol 38:3004-21). Expression summaries were collected from the Human Protein Atlas (proteinatlas.org) for each nominally significant protein for both traits (uncorrected p < 0.05), including tissue expression (Uhlén et al. 2015, Science 347:1260419) and single-cell type expression data (Karlsson et al. 2021, Sci Adv 7:eabh2169). Tissue enrichment analyses were performed for 38 tissues, where a Wilcoxon Rank-Sum statistic was estimated using RERconverge commands to compare the rho value multiplied by the negative log of the p-value for correlations of genes within the tissue against the same value for all genes. Metascape (https://metascape.org) with default parameters was used for Gene Ontology enrichment analysis of conserved and accelerated proteins with uncorrected p-values and permulation p-values less than 0.05 (Zhou et al. 2019, Nat Commun 10:1523). To differentiate between positive selection and relaxed constraint for the top 200 proteins identified in the RER correlation for each trait, ranked by p-value, we used codeml from the PAML package (Yang 2007) to fit the data (alignment of coding regions from the Mammals Multiz Alignment & Conservation nucleotide alignments) to a model constrained to a single ratio of the nonsynonymous substitution rate to the synonymous substitution rate (dN/dS) across all branches of the phylogeny (uniform ratio model, M0). The likelihoods of two other models were also estimated: a branch model where two sets of branches (foreground and background) were each allowed to have their own dN/dS, and another branch model where the foreground was constrained to neutrality (dN/dS = 1).
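For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the following is a minimal pure-Python sketch of the standard procedure. It is not the RERconverge implementation, and the function name is hypothetical; the permulation step is not covered here.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values for illustration only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.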
For each trait, in these models the terminal branches leading to any extant species in the top one-third of the trait value were specified as foreground, and the remaining branches as background. To determine whether the foreground and background branches are evolving differently as fit to these models, a likelihood ratio test (LRT) compared the likelihood of the two-branch model to that of the uniform model, with significance assessed using a chi-square test with one degree of freedom. To test for positive selection across the entire gene, an LRT compared the two-branch model to the neutral foreground model; this was considered evidence for positive selection if the result was statistically significant and the foreground dN/dS was greater than one.
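The likelihood ratio tests described above can be sketched as follows, assuming the log-likelihoods have already been read from the codeml output. The function names and the 0.05 threshold are illustrative assumptions, not part of PAML. For one degree of freedom, the chi-square upper tail probability equals erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math
from typing import Tuple


def lrt_df1(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Likelihood ratio test with one degree of freedom.
    Returns (test statistic, p-value)."""
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)  # clamp small negative values
    # Chi-square survival function for df = 1.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def supports_positive_selection(p_value: float, foreground_omega: float,
                                alpha: float = 0.05) -> bool:
    """Decision rule from the text: a significant LRT against the neutral
    foreground model AND foreground dN/dS > 1.
    ASSUMPTION: alpha = 0.05; the source does not state the threshold."""
    return p_value < alpha and foreground_omega > 1.0


if __name__ == "__main__":
    # Hypothetical log-likelihoods for illustration only.
    stat, p = lrt_df1(lnl_null=-10234.7, lnl_alt=-10230.1)
    print(stat, p, supports_positive_selection(p, foreground_omega=1.8))
```

The same `lrt_df1` helper applies to both comparisons in the text: the two-branch model against the uniform model (M0), and the two-branch model against the model with the foreground fixed at dN/dS = 1.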
Created:
2023-10-19