Data and code from: Alternative pathways into the deep sea: Patterns in Bivalvia
收藏NIAID Data Ecosystem2026-05-10 收录
下载链接:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.8931zcs4m
下载链接
链接失效反馈官方服务:
资源简介:
Relatively few clades have colonized the deep sea. Here, we analyze evolutionary pathways into this harsh environment, as a continuum defined by two potential end-members—a "piecemeal model", with exclusively deep-sea species (deep-sea endemics herein) derived from multiple, independent entries, and an “in-situ diversification model” with one entry followed by species proliferation. We focus first on two ancient, distantly related subclades in Class Bivalvia, Mytilidae and Lucinidae, each with hundreds of species occurring globally from the intertidal to abyssal plains. Placing bathymetric ranges into newly inferred molecular phylogenies, we find that the deep-sea endemics within Lucinidae derive in piecemeal fashion, estimating up to 16 phylogenetically isolated entries and one modest in-situ diversification. Mytilidae entered the deep sea just four times, with most endemics stemming from the prolific in-situ diversification of Bathymodiolinae. We suggest that the contrasting phylogenetic patterns of entry and proliferation in these clades may be determined by differences in ancestral adult feeding modes. Across Bivalvia, we find that under half of extant families have deep-sea endemics, with the piecemeal model occurring slightly more often. Taken together with other clades, we suggest that evolutionary pathways to deep-sea endemicity are as often shaped by multiple, independent events as by in-situ diversification.
Methods
The following text is derived from the Methods found in the associated paper's main text and supplementary materials. References (some of which pertain to justifying methods) have been removed, but may be found in their respective texts.
Database of marine bivalve taxa and bathymetries
We used a comprehensive dataset of 8402 extant marine bivalve species compiled from Huber 2015 and the subsequent literature (Supplementary Dataset S1). For this dataset, subgenera were elevated to "operational genera" to increase sampling of hypothesized monophyletic groups that can be analyzed for their differences in bathymetric occurrences. Owing to local topological uncertainties, we classified bivalve taxa into 86 operational families for the 100 taxonomic families represented in the initial Crouch et al. (2019) dataset, as follows: Neilonellidae, Tindariidae, Sareptidae and Yoldiidae were treated as Malletiidae sensu lato; Bathyspinulidae, Lametilidae, Phaseolidae, and Siliculidae were treated as Nuculanidae sensu lato; Nucinellidae was treated as Solemyidae; Basterotiidae, Galeommatidae, and Lasaeidae were treated as Galeommatoidea; Cyamiidae, Gaimardiidae, Galatheavalvidae, Sportellidae was treated as Cyamioidea; Xylophagaidae was treated as Teredinidae, and Condylocardiidae was treated as Carditidae. Two entirely shallow-water families, Cardiliidae (4 species) and the monospecific Clistoconchidae, were dropped from the dataset due to uncertainty in their phylogenetic placements, yielding a final dataset of 8327 species classified in 84 operational families.
We used the time-calibrated, family-level phylogeny of Crouch et al. in analyses that spanned Bivalvia. Species with minimum depths < 200 m were considered “shallow” and species with minimum depths > 200 m were “deep” (i.e., “deep-sea endemics”; Supplementary Dataset S1); the attenuation of primary productivity below this depth makes it a biologically meaningful boundary between bivalve faunas. Maximum depths are more likely to be biased by variations in sampling intensity or stray specimens.
Phylogenetic inference for Mytilidae and Lucinidae
We used the taxise function in the PhyLoTaR pipeline to produce a list of all daughter NCBI taxon IDs from a local copy of the NCBI GenBank nucleotide database (accessed January 2024) for Lucinidae and Mytilidae, using a custom perl script to extract species IDs. We cross-referenced this taxon list with our inventory of described mytilid and lucinid genera (see main text for citations), adding sequences from GenBank not discovered via the PhyLoTaR pipeline, and pruning invalid taxa. For each taxon, we combined sequences into a single fasta file, removed duplicate accessions, and parsed sequences by loci using SuperCrunch and a custom locus search term file.
Using SuperCrunch, we assigned reference sequences for the two taxonomic families to perform similarity filtering of the parsed, locus-specific files, aiming to maximize phylogenetic coverage. For each mytilid and lucinid subfamily, one or two reference sequences were selected for their length, prioritizing sequences with museum vouchers. Reference sequences were included for outgroup taxa Pinna and Atrina (Pinnidae) for Mytilidae, and Thyasira (Thyasiridae) for Lucinidae. Reference sequences for 12S, 16S, cytochrome B, cytochrome oxidase 1, ATP synthase subunit 6, NADH dehydrogenase subunits 2 and 4, histone 3, 18S and 28S were selected and used for similarity filtering. Sequences not identified to the specified locus were omitted (e.g. “complete mitochondrial genomes” or “small subunit rRNA”). Due to this limitation, reference sequences for the outgroup taxa were sometimes unavailable, so we instead included sequences from other closely related families: Pinctada fucata (Pteriidae), Crassostrea madrasensis (Ostreidae), Margaritifera margaritifera (Margaritiferidae), and Chlamys islandica (Pectinidae) for ATP synthase subunit 6, NADH dehydrogenase subunit 2, and NADH dehydrogenase subunit 4, respectively. For Lucinidae, a sequence of Parathyasira equalis (Thyasiridae) was included in the reference file for cytochrome B. To increase confidence in the reference sequences for Lucinidae, we prioritized the longest sequences provided by research groups specializing in lucinid systematics, rather than the longest sequences.
Using our reference files, we first performed similarity searches for the sequence data within SuperCrunch using a discontiguous megablast algorithm. Sequences were allowed to match up to fifteen loci. We selected one sequence per species from these locus-specific files (minimum length of 200 base pairs). Genes with fewer than ten sequenced taxa were removed. We also removed the following mytilid species that resolved outside their expected genera and subfamilies in preliminary phylogenetic analyses, had abnormally long branch lengths, or had uncertain identifications: Bathymodiolus elongatus (NCBI TXID 385530), Brachidontes setiger (NCBI TXID 2694962), Brachidontes variabilis (NCBI TXID 102295), Leiosolenus nasutus (NCBI TXID 857499), Modiolatus hanleyi (NCBI TXID 499949) and Modiolus rectus (NCBI TXID 410357). After tree inference, we removed five freshwater mytilid species and three tips resolving outside their respective genera and subfamilies: Lithophaga curtus (NCBI TXID 2590090), Modiolarca impactus (NCBI TXID 2570230), and Mytella brasiliensis (NCBI TXID 356385).
We continued with the SuperCrunch pipeline to construct molecular alignments from the sorted, filtered and trimmed GenBank accessions, resulting in a five-gene matrix for Lucinidae (16S, cytochrome B, cytochrome oxidase 1, 18S and 28S), and a ten-gene matrix for Mytilidae (12S, ATP synthase subunit 6, NADH dehydrogenase subunits 2 and 4, histone 3, and the genes used for Lucinidae). We partitioned these concatenated alignments and determined the best nucleotide substitution models using PartitionFinder2, and used the results to infer 100 non-parametric bootstrap trees for each family. The summarized consensus trees—after minor pruning—were the basis for all figures and downstream analyses.
For both Mytilidae and Lucinidae, we used the resulting relative-rate trees with branch lengths scaled to nucleotide substitution rates in comparative analyses, as the methods used here for the two focal clades do not require time-scaled branch lengths to estimate discrete character transitions. We consider the number of substitutions per site to be the most balanced basis for estimating evolutionary distance across both families, because the subclade affinities of their early-branching members are areas of active research; successfully calibrating these nodes in the future will help to reduce any effects of unevenly constrained branch lengths that can result from less dense calibrations (e.g. skewed likelihood calculations of trait changes and transition rate estimation).
Phylogenetic simulation and permutation tests of piecemeal vs. in-situ diversification
We used the phylogenies of Mytilidae and Lucinidae to estimate both the number of transitions from shallow to deep habitats and the richness of the resulting deep-sea lineages against patterns resulting from stochastic character evolution. We take a two-pronged approach to comparing the evolutionary patterns of deep-sea taxa to a null model for each clade. In one, the "simulation" test, stochastic character histories are modeled on the transition matrix between deep and shallow states, estimated using the phylogenies we inferred with empirical data. In the second, the "permutation" test, we shuffle shallow and deep tip states while holding constant the total number of deep-sea endemics in each clade, thereby allowing the transition rates between the "shallow" and "deep" states to vary by permutation. We compare the outcomes of both procedures to the number of transitions estimated using the observed phylogeny, and the number of deep-sea endemic species directly descending from each transition. Together, these two measures form a solution space for assessing the continuum from piecemeal to in-situ evolution by evaluating whether: 1) “observed” transitions to the deep sea occurred more or less frequently than the null expectations, and 2) these transitions are associated with elevated numbers of deep-sea endemic species.
For the observed values, we estimated the number of transitions and the number of deep-sea tips descending from each one by simulating discrete trait evolution (“shallow” or “deep”) 1000 times along each branch of the mytilid and lucinid phylogenies (Figs. S1-S2; phytools “make.simmap”). We fixed the root state as "shallow" for both families given paleontological evidence (see Results), and fitted transition matrices using the model of character evolution best supported by both phylogenies, the "All Rates Different" model. For each stochastic character map, we identified transitions as branches where the initial state was “shallow” and the terminal state was “deep.” We then tallied the number of tips representing deep-sea endemic species directly subtended by these transitions, excluding tips in nested lineages showing more recent reversals (i.e. “deep” to “shallow” to “deep”). We estimated the median and 95 % confidence interval number of deep-sea endemics per transition across the 1000 simulated character histories described above, ranked by their richness.
For each family, we compared this rank-order profile of transitions to a stochastic pattern of tip states generated from 1000 simulations of discrete character histories under the observed state transition rate matrix. Transition matrices were fitted to each clade using an extended Markov k-state (Mk) model for discrete character evolution assuming the "All Rates Different" different model (see phytools “fitMk”). The ancestral state for each simulation was assumed to be shallow as in the estimates of the observed patterns described above. We then calculated, for each simulation, the locations of transitions and the number of deep-sea tips directly subtending them as described above. Finally, we ranked transitions by deep-sea richness, and calculated the median and 95 % confidence interval for each rank.
We also conducted a permutation-based (tip-shuffling) analysis wherein we shuffled 1000 times the "deep" and "shallow" tip states on each fixed tree. The number of deep and shallow tips was fixed to the observed value but the transition rate matrix was allowed to vary in these permutation tests. Observed diversification histories with ranked transitions having higher deep-sea endemic richness than expected by chance, along with more steeply declining rank-order richness profiles (indicating the concentration of deep-sea endemics among fewer transitions), align more closely with the in-situ model of diversification; piecemeal patterns align with stochastic expectations. Both approaches are necessarily limited by taxonomic sampling, so we also performed a related, hierarchical analysis, described below, to test whether all known deep-sea endemics are concentrated within particular genera in their respective families.
Testing potential drivers of deep-sea entries in Lucinidae
We tested whether the phylogenetic positions of lucinid deep-sea endemics are predicted by the bathymetric and climatic occurrences of their “close relatives.” We defined close relatives in two ways to account for phylogenetic resolution and taxon coverage. In the "strict scheme", taxa were considered close relatives if they form moderate- or high-confidence clades (bootstrap support > 50 %) with deep-sea endemic taxa in the phylogeny represented in Fig. S2 (N = 21 species; marked by asterisks). The "broad scheme" used the close relatives in the first scheme plus the species absent from the molecular phylogeny but belonging to genera that include or are sister to deep-sea endemics (N = 69 species for climate analyses, N = 67 species for depth; Fig. S2). For example, close relatives in Leucosphaerinae include all shallow-water species in Alucinoma, Anodontia, Gonimyrtea, and Myrtina in the broad scheme. Further, as there is at least one deep-sea endemic species in the lucinine genera Cardiolucina, Lamellolucina, and Megaxinus, any shallow-water species in those genera were considered close relatives in this broad scheme, despite the fact that the latter two genera are absent from our molecular phylogeny. Lucinoma includes numerous deep-sea endemic species and is potentially non-monophyletic depending on the placement of Divalucina, so this scheme also considers all shallow-water Lucinoma and Divalucina to be close relatives. Most taxa in Myrteinae are deep-sea endemics and the clade has ~40 operational species, so we consider any shallow-water myrteine (including those in genera lacking molecular data) to be a close relative under the broad scheme. Due to their polyphyly and low resolution, five operational genera (Ferrocina, Gonimyrtea, Parvilucina, Pegophysema, and Myrtea sensu lato (i.e., species excluded from Myrtea sensu stricto) were not considered close relatives despite having endemic deep-sea congenerics (Fig. S2). An exception was made for Gonimyrtea concinna as it falls sister to the endemic deep-sea Ustalucina ferruginea (BSS = 91 %; Fig. S2). Dataset S1 contains designations of close relatives under both schemes.
For both sets of close relatives, we used one-sided Kolmogorov-Smirnov tests to assess whether lucinid deep-sea endemic species are more likely to have close relatives reaching greater depths than other shallow-water species. We used permutation and Fisher's exact tests to assess whether extant close relatives of deep-sea endemics are more likely to be found in extratropical (i.e. colder) waters. We associated each shallow-water species with biogeographic provinces as defined by Huber 2015, coding them as tropical, extratropical, or “both” (one-sided Fisher’s exact tests omitted the "both" category).
We also assessed the climatic context (tropical, extratropical, or both/uncertain) of the first occurrences of lucinid genera with a known fossil record, using a compendium of first and and last occurrences of marine bivalves and the primary literature. Fossil data allowed us to assign climates for 11 of the 22 lucinid genera with at least one extant deep-sea endemic species, and for 56 of the 75 lucinid genera today lacking deep-sea endemics (Supplementary Dataset S1). We excluded genera in "both/uncertain" categories and used a one-sided Fisher’s exact test to evaluate whether lucinid genera with at least one extant deep-sea endemic were more likely to have originated in extratropical settings.
Testing the taxonomic concentration of deep-sea endemics across Bivalvia
Using the full database of bivalve bathymetries, we tested whether families containing at least one deep-sea endemic species are phylogenetically clustered on the family-level tree (via the “Fritz & Purvis's D” statistic in caper). We then tested whether the proportion of deep-sea endemics in a family is phylogenetically clustered using Pagel’s λ. To test whether the relative richness of deep-sea endemic species within families differs from the stochastic expectation, we randomly shuffled depth assignments 1000 times, keeping constant the size of every family. We next tested whether deep-sea endemic species are concentrated within certain genera across families by randomly shuffling depth categories for species within families 1000 times, and counting the number of genera with at least one deep-sea endemic.
创建时间:
2025-12-05



