Caecilians maintain a functional long-wavelength-sensitive cone opsin gene despite signatures of relaxed selection and more than 200 million years of fossoriality
收藏NIAID Data Ecosystem2026-05-10 收录
下载链接:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.h18931zxf
下载链接
链接失效反馈官方服务:
资源简介:
Vision evolves in response to species’ light environments and ecological needs. The transition to fossoriality may relax constraints on vision, leading to reduced visual capabilities. Caecilians (Gymnophiona)—specialized fossorial amphibians—possess reduced eyes covered by skin or bone. These traits, together with the assumption of a single photoreceptor type expressing one opsin gene, have long been interpreted as evidence of limited vision, including an inability to focus or perceive color. Here we provide genomic, transcriptomic, and anatomical data that challenge some of these assumptions. We identified the long-wavelength-sensitive (LWS) opsin gene in 13 caecilian species spanning 8 of 10 recognized families, with alignments showing no frame-shift mutations or premature stop codons, consistent with maintained functionality. Molecular evidence further demonstrates that LWS is transcribed in the eye of Caecilia orientalis. Selection analyses indicate that LWS is broadly under purifying selection, which may explain its persistence despite hundreds of millions of years of fossoriality, but also show evidence of relaxed constraint, suggesting reduced reliance on canonical cone-mediated vision. The specific photoreceptor type expressing LWS remains uncertain: rod phototransduction genes are largely intact, whereas cone pathway genes display a mosaic pattern of losses. Further, anatomical surveys across five families did not conclusively identify cone-like cells, though organized retinal layers were observed even in species with highly reduced eyes. The datasets in this submission include: (i) alignments of phototransduction and opsin genes; (ii) histological photographs from several caecilian species; and (iii) input, output, and summary files from codeml selection analyses. Together, these resources support further exploration of gene integrity, phototransduction pathways, and the evolution of vision in fossorial vertebrates.
Methods
Morphology and search for cone photoreceptors
Histology: The examined sections are archival material accessioned in the MVZ that were originally prepared by MHW in the 1980s. Details on sample preparation, staining, and storage are provided in Supplementary Methods section 2.2.1. The file is also accompaning this submission. Sampling was opportunistic, representing all species with available histological sections suitable for visualizing eye morphology. More information about these species and accession information can be found in Supplementary Table 2.
Photomicrography: Eyes were observed with a Laxco microscope, where we examined sections with intact photoreceptors and retina. Photographs were taken at standard low, medium, and high magnifications using a Laxco SeBaCam5C digital microscope camera, and rendered using SeBaView software v3.7 (Laxco, Inc.). A stage micrometer was used for measurements and to calibrate scale bars. A total of 282 photomicrographs of retinae representing 8 of the 10 families of Gymnophiona were examined (see Supplementary Table 2 for specimen information and Supplementary Data 5 for all photomicrographs).
Search for cones and description of the retina: We visually surveyed all images for photoreceptor cells with morphology characteristic of cones (i.e., bulbous ellipsoids, tapered outer segments that are much thinner than their inner segments) or otherwise distinct from the rods that predominate the retina. Slides of Epicrionops sp., Typhlonectes compressicauda, Ichthyophis kohtaoensis, Dermophis mexicanus, and Scolecomorphus kirkii specimens were selected to highlight taxonomic and morphological diversity of retinae across Gymnophiona due to their taxonomic disparity and well-preserved morphology. Standard descriptions of the retinae were made following (Mohun & Wilkinson, 2015; Wake, 1985) and (Himstedt, 1995; Mohun et al., 2010). Details on measurement methodology are provided in Supplementary Methods section 2.2.3.
Further validation of a caecilian LWS and assessment of its potential functionality
Generating an eye transcriptome: We collected one individual of Caecilia orientalis in 2020 at Yanayacu Biological Station in Napo, Ecuador (S 0.5984097, W 77.8907409, 2143 m.a.s.l.) under permit MAE-DNB-CM-2015-0025-M-0001 issued by the Ministerio del Ambiente of Ecuador to Universidad Católica del Ecuador. We euthanized the specimen using commercial Roxicaine anesthetic (lidocaine) and immediately dissected both eyes which were preserved in RNAlater. The specimen is deposited in the collection of Museo de Zoología of the Pontificia Universidad Católica del Ecuador (QCAZ), with specimen number QCAZ:A:77338. Tissue samples were sent to Macrogen Inc. (Seoul, South Korea) for RNA extraction, library preparation using the Illumina TruSeq Stranded mRNA LT Sample Preparation Kit, and sequencing on the Illumina NovaSeq6000 to 100 million paired-end 100-bp reads.
Read quality was assessed using FastQC (Andrews, 2010). We used TrimGalore! (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) to remove adapters, low-quality bases, and short reads. Transcriptome assembly was then conducted de novo using Trinity v2.13.2 (Grabherr et al., 2011), incorporating all paired-end reads with default settings. The resulting assembly was filtered using SeqClean (Chen et al., 2007). To annotate protein-coding genes, TransDecoder v2.026 (Haas, n.d.) was used with default parameters and homology to UniProt was then used to infer functions of predicted genes with NCBI-BLASTP v2.2.30+ (E-value = 1e−6) and Interproscan 5.35 (Jones et al., 2014) to detect proteins with known functional domains. Finally, we used Salmon v1.10.3 (Patro et al., 2017) to quantify transcript abundances of opsin genes. First, we built a Salmon index for our transcriptome using the index command with default parameters for k-mer length. Next, we quantified the paired-end reads against the previously generated index using the quant command. The output gene counts were summarized as normalized transcript abundances using DESeq2 v3.11 (Love et al., 2014).
Review of prior evidence suggesting loss of LWS: A previous attempt to amplify LWS (also known as OPN1LW) from two species of caecilians (Ichthyophis cf. kohtaoensis and Typhlonectes natans) using eye cDNA from adult specimens was unsuccessful (Mohun et al., 2010). Given that previous work reported a putatively functional copy of LWS in three other species of caecilians (Lin et al., 2024), we wanted to verify whether a mismatch between primer and template could have resulted in a failed PCR. Thus, we used the 'water' local alignment tool from EMBOSS (Rice et al., 2000), which implements the Smith-Waterman algorithm, to review potential alignments between primers from Mohun et al. (2010) and cDNA sequences of three published caecilian LWS genes (Rhinatrema bivittatum, XM_029609805.1; Microcaecilia unicolor, XM_030192059.1; Geotrypetes seraphini, XM_033922369.1). We report values for forward primers aligned to the forward sequence and for reverse primers aligned to the reverse complement of the sequence.
PCR of LWS in additional caecilian species: To validate whether LWS existed in a broader phylogenetic and ecological sampling of caecilians, we obtained tissue samples from terrestrial and aquatic species exhibiting varying degrees of fossoriality and reports of above-ground behavior. We included samples of species from early branching lineages that exhibit lesser degrees of adaptation to fossoriality including Epicrionops bicolor, E. petersi (Rhinatrematidae), as well as more phenotypically derived species such as Caecilia orientalis (Caecilidae) and Microcaecilia albiceps (Siphonopidae) from specimens at QCAZ. Additional tissues were obtained from the Museum of Vertebrate Zoology (MVZ) at the University of California, Berkeley, including terrestrial species that inhabit moist, loose soils with varying levels of eye coverage such as Dermophis mexicanus, Geotrypetes seraphini, Gymnopis multiplicata (Dermophiidae), as well as Ichthyophis kohtaoensis (Ichthyophiidae) which has aquatic larvae and terrestrial adults, and the fully aquatic Typhlonectes natans (Typhlonectidae). Tissues from species with markedly reduced eyes, such as Boulengerula boulengeri (Herpelidae) and Scolecomorphus vittatus (Scolecomorphidae), were kindly loaned from the California Academy of Sciences. We attempted to obtain tissues that match our morphological material to the extent possible (at least to genus). Together, these samples represent the full range of diversity in caecilian eye morphology.
DNA extraction, amplification, and sequencing was carried out at PUCE for QCAZ specimens, and at the Evolutionary Genomics Lab (EGL), UC Berkeley, for MVZ and CAS specimens. Fragments of LWS were amplified using new primer sets designed from an alignment of amphibian opsins. The new LWS_542R (5’-GCA GAC CAA ACC CAG GAG AA-3’) and LWS_117F (5’ TTT TGA GGG MCC CAA CTA CC -3’) primers amplify a region that includes part of exon I, the intron between exon I and II, and part of exon II. Specifics about the extraction and amplification protocols are provided in the Supplementary Material Methods section 2.3.2. Additionally, an evaluation of primers used in previous attempts that did not yield successful amplification of LWS in caecilians (Mohun et al., 2010) is provided in Supplementary Material Methods section 2.4.
PCR products from QCAZ samples were sequenced at Macrogen Inc. (Seoul, South Korea); MVZ and CAS samples were sequenced at the UC Berkeley DNA Sequencing Facility. We assembled forward and reverse sequences into consensus sequences using default parameters in Geneious Prime 2022.1.1 and then aligned them against each other using MUSCLE (Edgar, 2004). To confirm the identity and quality of the new LWS sequences, we inferred gene trees for LWS and RH1 in IQTREE2. Sequences were partitioned by coding and non-coding regions and also by codon position. The best-partition model was estimated using the -TESTMERGE option (Chernomor et al., 2016; Kalyaanamoorthy et al., 2017). Bootstrap values were calculated using 500 replicates (-b 500) and were plotted on the best likelihood tree. See Supplementary Data 1 for input and output files.
Estimation of the wavelength of maximum absorbance (λmax) of LWS and RH1: Given that the LWS copies identified in caecilian genomes contain open reading frames without stop codons (i.e., considereded intact), we hypothesize that the gene is functional and that the protein encoded by LWS will exhibit absorbance spectra within the expected ranges (560–570 nm; (W. I. L. Davies et al., 2012). To test this hypothesis, we used a machine-learning model (Frazer et al., 2024) to estimate the wavelength of maximum absorbance (λmax) for five full-length LWS sequences from the caecilian species included in our study incorporating data available on GenBank (Rhinatrema bivittatum, XM_029609805.1; Microcaecilia unicolor, XM_030192059.1; Geotrypetes seraphini, XM_033922369.1; Ichthyophis bannanicus, GCA_033557465.1 hereafter I. kohtaoensis following Nishikawa et al., 2021) and our C. orientalis eye transcriptome. Additionally, we estimated absorbance values for the RH1 sequences of the same species listed above to validate the accuracy of the machine-learning model, since experimental absorbance estimates for this opsin have been reported for G. seraphini and R. bivittatum (Mohun et al., 2010). We expected the estimated λmax values for RH1 to fall within the typical range for this opsin (~500 nm; W. L. Davies, 2011) and to closely approximate experimental MSP data. Furthermore, we increased our phylogenetic sampling by including LWS and RH1 sequences from other relevant amphibian and vertebrate species. For this purpose, we explored three types of sequence archives: NCBI sequence references, unannotated transcriptomes, and published genomes. The methods used to retrieve LWS from each of these datasets are described in more detail in the Supplementary Material section 2.3.3. Our final dataset included 19 species of Anura and 8 species of Caudata (see Supplementary Table 3 for details). The nucleotide matrices were manually inspected to remove any ambiguities and the sequences were translated to amino acids for downstream analysis (see Supplementary Data 2).
Point estimations of λmax for LWS and RH1 were predicted using the vertebrate model trained with the VPOD_vert_het_1.0 dataset available in the Visual Physiology Opsin Database (https://github.com/VisualPhysiologyDB/visual-physiology-opsin-db). This model demonstrated the best predictive ability, with the highest 10-fold cross-validation R² (0.968) and the lowest mean absolute error (6.56 nm) among all the models compared in the machine learning analysis (Frazer et al., 2024). The training dataset included 721 vertebrate opsins (UVS, SWS, MWS, LWS, RH1, RH2) spanning wild-type, mutant, and chimeric sequences with λmax values ranging from 350 to 611 nm (Frazer et al., 2024).
Additionally, a literature search was conducted to identify available microspectrophotometry (MSP) data for the species included in our machine-learning analysis. In some instances, when values were not available for a particular species, we used data from a closely related species within the same genus (Supplementary Table 4). We then performed a direct comparison between the MSP data and the machine-learning estimations using a linear correlation analysis (Pearson’s correlation) using the lm and cor test functions in Rstudio v2024.9.1.394 and calculated the mean absolute error to quantify the discrepancy between the absorbance values obtained from the two methodologies.
Search for presence and expression of rod and cone phototransduction cascade genes: The presence of LWS in caecilians contrasts with the lack of morphological evidence for cone photoreceptors, which are the typical cellular structure for LWS expression. We hypothesized that if LWS is expressed and functional in cones, intact cone phototransduction genes should also be present and expressed, and thus detectable in both the genomes and transcriptomes of caecilians. Alternatively, if caecilian retinas contain only rod photoreceptors, as previously reported, and LWS is potentially co-expressed with RH1 in rods, we would expect to find only rod-associated phototransduction genes, with cone-specific genes either pseudogenized, absent, or not expressed in the eye transcriptome. To test this, we searched for rod and cone phototransduction genes in NCBI reference databases and publicly available genomes of three anuran species: Xenopus tropicalis (GCF_901001135.1), Nanorana parkeri (GCF_000935625.1), and Pyxicephalus adspersus (GCA_032062135.1); two salamanders: Ambystoma mexicanus (GCA_040938575.1) and Pleurodeles waltl (GCA_031143425.1); and three caecilians: Microcaecilia unicolor (GCF_901765095.1), Rhinatrema bivittatum (GCF_901001135.1), and Geotrypetes seraphini (GCF_902459505.1). Additionally, we searched for these genes in our newly generated eye transcriptome from Caecilia orientalis (PRJNA1181370) (Supplementary Table 3). The canonical visual phototransduction cascade in vertebrates involves three main components: (1) the heterotrimeric G-protein transducins (in rods: GNAT1, GNB1, GNGT1; in cones: GNAT2, GNB3, GNGT2); (2) the phosphodiesterase enzymes (in rods: PDE6A, PDE6B, PDE6G; in cones: PDE6C, PDE6H), and (3) the cyclic nucleotide-gated channels (in rods: CNGA1, CNGB1; in cones: CNGA3, CNGB3) (Lamb, 2013).
The identified sequences of our genes of interest were downloaded and used as reference queries to search our de novo–assembled transcriptome data, which had been formatted as a custom BLAST database using the default parameters for makeblastdb (v2.16.0+). All the obtained sequences were translated into amino acids and aligned using MAFFT v7.490 (Katoh & Standley, 2013), and the alignments were manually inspected in Geneious Prime 2022.1.1 (https://www.geneious.com) for the presence of premature stop codons. Finally, following the methodology described in section 2.2.1, we used the previously generated Salmon index of the eye transcriptome to quantify transcript abundances of visual phototransduction genes and assess whether their expression levels mirror those of the opsin genes. When multiple isoforms were reported for a given gene, we summed the expression levels of all isoforms. Although some isoforms may result from assembly artifacts, we opted to include all of them to avoid making an arbitrary choice about which isoform to retain.
Selection analyses
The presence of intact LWS genes without nonsense and frameshift mutations in five phylogenetically diverse caecilian species supports the idea that LWS is an ancestrally retained feature despite a long fossorial history. Thus, we hypothesized that LWS is under evolutionary constraint, similar to RH1, indicating a physiological role similar to its ancestral function. To test this, we conducted a series of selection analysis on LWS and RH1 in Gymnophiona and other amphibian species using the sequence data described in section 2.2.3, including transcriptomic data from C. orientalis and LWS fragments obtained via Sanger sequencing. In total, our sampling included 13 caecilian species spanning eight families (Supplementary Table 3). An updated and unrooted version of the phylogenetic tree presented by (Jetz & Pyron, 2018) served as a backbone for conducting selection analysis in the codeml program of PAML v4 (Yang, 2007).
Sequences for LWS and RH1 were aligned using MAFFT (Katoh & Standley, 2013) followed by manual editing to remove terminal stop codons in Mesquite v3.6 (Maddison & Maddison, 2015). For LWS, we ran analyses on two types of datasets, a complete dataset including the full coding sequence of the gene with smaller sampling for Gymnophiona (368 aa; Anura = 19 spp, Caudata = 8 spp, Gymnophiona = 5 spp) and an incomplete dataset with more species for Gymnophiona but fewer sites (132 aa; Anura = 19 spp; Caudata = 8 spp, Gymnophiona = 13 spp). For RH1, we only ran analyses using the complete sequences (354 aa; Anura = 18 spp, Caudata = 6 spp, Gymnophiona = 5 spp).
We initially conducted a broad phylogeny-wide analysis to assess the strength of purifying selection on LWS and RH1. We used the M0 model, which assumes a single ω (dN/dS) ratio across all sites and lineages. To represent the null hypothesis of neutral evolution, we fixed ω at 1 (fix_omega = 1; omega = 1), thereby constraining the model to assume no selection. To test our hypothesis that LWS has been maintained under purifying selection, we ran an alternative model in which ω was estimated from the data (fix_omega = 0; omega = 0.5). We then compared the likelihoods of the two models using a likelihood ratio test (LRT).
We also inferred selection patterns and tested for site-specific positive selection for each gene comparing the support for different models (M0, M1a, M2a, M2a_rel, M3, M7, M8a, and M8). We used the Bayes Empirical Bayes (BEB) approach to identify individual sites with a posterior probability >90% of being in the positively selected class of sites. Additionally, we compared the fit of clade models C and D (CmC vs. CmD) (Bielawski & Yang, 2004) to M3 and M2a_rel to investigate long-term shifts in selection pressures associated with the evolution of fossoriality in caecilians (Weadick & Chang, 2012). Both clade models assume three site classes, with the third class modeling divergent selection among clades; however, unlike CmC, CmD does not constrain the ω estimates for any site class. The best model was identified using likelihood ratio tests (LRTs) with a χ2 distribution.
Additionally, we conducted a series of analyses in HYPHY, implemented on the Datamonkey web server (Delport et al., 2010). FEL (Fixed Effect Likelihood; Kosakovsky Pond & Frost, 2005) and FUBAR (Fast, Unconstrained Bayesian AppRoximation; Murrell et al., 2013) were employed to estimate positive or purifying selection across sites by comparing single estimates of dN and dS values for each site. FUBAR uses a hierarchical Bayesian method rather than maximum likelihood, making it more sensitive than FEL to weaker signatures of positive selection (i.e., when ω is close to 1) (Murrell et al., 2013). We used BUSTED (Branch-site Unrestricted Statistical Test for Episodic Diversification; Murrell et al., 2013) to identify episodic positive selection at a fraction of sites. BUSTED allows for branch-to-branch variation across the entire phylogeny. Finally, we ran RELAX (Wertheim et al., 2015) to detect relaxation of selective constraint. This methodology requires an a priori specification (e.g., partition) of a subset of branches into foreground and background. In our case, the foreground comprised the crown Gymnophiona and its stem branch, while anurans and salamanders were designated as the background. RELAX estimates a selection intensity parameter (k), where k < 1 indicates relaxed selection and k > 1 indicates intensified selection, based on shifts in ω values between test and reference lineages.
We report results from the selection analyses using amino acid site numbers based on bovine rhodopsin (NP_001014890.1; alignments with bovine rhodopsin are available in the Supplementary Data 3). Selection results are reported by referring to the location of each amino acid according to the 3D structure of opsins, which encompasses seven transmembrane domains (TMD I–VII), three extracellular domains (ECD I–III), and the amino- and carboxyl- termini (N and C) (Palczewski et al., 2000). All input and output files for PAML and HYPHY selection analyses are available in Supplementary Data 4.
Finally, we examined LWS amino acid alignments at key spectral tuning sites to characterize the residue identity and variation at positions previously shown to influence the absorption properties of LWS and RH1 opsins in vertebrates (Asenjo et al., 1994; Yokoyama, 2002a; Yokoyama & Radlwimmer, 2001a). We then compared the amino acid variants at these sites with those reported for amphibians in the recent comprehensive analysis by (Schott et al., 2024).
创建时间:
2025-09-19



