Cyanobacteriochromes from Gloeobacterales provide new insight into the diversification of cyanobacterial photoreceptors

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

http://datadryad.org/dataset/doi%253A10.25338%252FB8JM11

Resource description:

The phytochrome superfamily comprises three groups of photoreceptors sharing a conserved GAF (cGMP-specific phosphodiesterases, cyanobacterial adenylate cyclases, and formate hydrogen lyase transcription activator FhlA) domain that uses a covalently attached linear tetrapyrrole (bilin) chromophore to sense light. Knotted red/far-red phytochromes are widespread in both bacteria and eukaryotes, but cyanobacteria also contain knotless red/far-red phytochromes and cyanobacteriochromes (CBCRs). Unlike phytochromes, CBCRs require only the GAF domain for bilin binding, chromophore ligation, and full, reversible photoconversion. CBCRs can sense a wide range of wavelengths (ca. 330-750 nm) and can regulate phototaxis, second messenger metabolism, and optimization of the cyanobacterial light-harvesting apparatus. However, the origin and diversification of CBCRs are not well understood. In the current work, we use the increasing availability of genomes and metagenome-assembled genomes from early-branching cyanobacteria to identify the earliest branches in CBCR evolution. Our analyses also show that early-branching cyanobacteria contain more recently evolved CBCRs, implying significant diversification of CBCRs very early in cyanobacterial evolution. Moreover, we show that early-branching CBCRs behave as integrators of light and pH, providing a potential unique function for early CBCRs that could have led to their retention and subsequent diversification. Our results thus provide new insight into the origins of these diverse cyanobacterial photoreceptors.

Methods

Data are deposited as a gzipped tarball containing three types of data, each in its own directory: Phylogenies, Genome Analysis, and Spectroscopic data. Deposited data are all flat text files with UNIX newlines.

1. Phylogenies. Phylogenies are presented for the CBCR (cyanobacteriochrome) domain, for the histidine kinase bi-domain, for a concatenation of taxis proteins, and for a concatenation of translation proteins (ribosomal proteins + elongation factor 4). In each case, the initial sequence alignment was generated using MAFFT v7.450 with the following command-line settings:

--genafpair --maxiterate 16 --clustalout --reorder

The resulting alignments are deposited in CLUSTAL format (indicated by the .aln extension). For phylogenetic analysis, each alignment was processed with an in-house script to remove positions having ≥5% gaps. The resulting alignments are deposited in PHYLIP format (indicated by the .phy extension) and were used to infer maximum-likelihood phylogenies in PhyML-3.1 with 100 bootstraps, using the following command-line settings:

-m WAG -d aa -s SPR -a e -c 4 -v e -o tlr -b 100

Support was evaluated using the transfer bootstrap expectation (TBE) as implemented in booster, and the resulting trees are deposited in Newick format with TBE as support values.

2. Genome Analysis. Cyanobacterial genomes and metagenome-assembled genomes were evaluated for assembly size, for the number of candidate phytochrome or CBCR open reading frames (ORFs) found in the assembly, and for the total number of candidate bilin-binding GAF domains in the assembly. Three files are deposited from this analysis, each as a tab-delimited text file.

ORF_assembly_plot.txt is a table for 2-dimensional (x-y) scatter plotting of the number of ORFs vs. the assembly size for Prochlorococcaceae, Gloeobacter spp., and other cyanobacteria.

ORF_density_plot.txt is a table for 1-dimensional plotting (e.g., box/whisker) of the ORF density for a series of assemblies belonging to cyanobacterial lineages: Gloeobacter spp., all other members of the Gloeobacterales, Thermostichales, Pseudanabaenales, Gloeomargaritales, and higher crown cyanobacteria. ORF density was calculated for each assembly as (number of candidate ORFs)/(assembly size).

tandem_index_plot.txt is a table for 1-dimensional plotting of the tandem index for the same assemblies as the ORF density and is organized into the same lineages. The tandem index was calculated for each assembly as (number of candidate bilin-binding GAF domains)/(number of candidate ORFs).

3. Spectroscopic data. Six types of spectroscopic data are presented, each in its own subdirectory. All absorption spectra were acquired on Cary 50 or Cary 60 spectrophotometers; circular dichroism (CD) spectra were acquired on an Applied Photophysics Chirascan. Raw files were processed with an in-house script to convert to tab-delimited format and remove user metadata. The resulting files were used for analysis and figure preparation. The contents of each subdirectory are as follows:

Absorption spectra pH 7.5. This set comprises a series of 19 text files, one for each protein characterized, and each filename matches the name of the protein in the manuscript. Spectra were acquired in TKKG buffer (25 mM TES-KOH pH 7.5, 100 mM KCl, 10% (v/v) glycerol). Spectra are presented for the 15Z dark-adapted state and the 15E photoproduct, along with the photochemical difference spectrum (calculated as 15Z – 15E).

Normalized spectra. This set has a similar series of files, each containing normalized data for samples to facilitate comparison. Files are named by the proteins that are compared in each case, with three exceptions. These three files present normalized photochemical difference spectra for various proteins at pH 6, at pH 9, and at pH 10, and are named by the pH value instead (for example, "pH6_difference" has the difference spectra at pH 6).

static pH spectra. The pH response was examined for a series of proteins, each in either the 15Z or 15E state. In this experiment, 100 µl of protein in TKKG buffer was diluted into 1 ml of 0.4 M buffer at different pH values (e.g., 0.4 M MES, pH 6). This subdirectory contains one file for each protein, with the filename containing the name of the protein and the configuration (for example, "AnPixJg2_15Z.txt" indicates that the protein is AnPixJg2 and it is in the 15Z state).

photoconversion pH spectra. Photoconversion was evaluated at different pH values for twelve of the proteins examined under "static pH spectra" (above). The resulting photochemical difference spectra at pH values of interest are deposited here, with each file named by the protein (for example, "AnPixJg2.txt" contains photochemical difference spectra for AnPixJg2 at a range of pH values).

pKa plotting. This set has a series of files, each of which contains titration data (pH and absorbance data as x-y pairs) for one protein. Some files contain data for multiple wavelengths and/or datasets.

CD spectra. This set has a series of files, each containing circular dichroism spectra for a single protein in two photostates. Each file is named by the protein. All spectra are deposited after baseline subtraction.
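The gap-filtering step described under Phylogenies (removing alignment positions having ≥5% gaps) was done with an in-house script that is not part of the deposit. The following is only a minimal sketch of one way such a filter could work, not the authors' implementation. The function name `drop_gappy_columns`, the dictionary representation of the alignment, and the treatment of both "-" and "." as gap characters are all assumptions made for illustration.

```python
from typing import Dict

# Assumption: both "-" and "." count as gap characters.
GAP_CHARS = frozenset("-.")


def drop_gappy_columns(alignment: Dict[str, str],
                       max_gap_fraction: float = 0.05) -> Dict[str, str]:
    """Return a copy of the alignment without gap-rich columns.

    A column is dropped when the fraction of sequences carrying a gap
    at that position is greater than or equal to max_gap_fraction.
    `alignment` maps a sequence name to its aligned sequence
    (hypothetical layout; all sequences must have the same length).
    """
    if not alignment:
        return {}

    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(seqs)
    # Indices of columns whose gap fraction is strictly below the threshold.
    keep = [
        i for i in range(length)
        if sum(s[i] in GAP_CHARS for s in seqs) / n_seqs < max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


if __name__ == "__main__":
    # Toy alignment: with three sequences, any gap is 33% of a column,
    # so columns 3 and 4 are removed.
    toy = {"seqA": "MK-LV", "seqB": "MKTLV", "seqC": "MKT-V"}
    print(drop_gappy_columns(toy))
```

With a threshold this strict, a single gap only survives in alignments of more than 20 sequences, so the filtered .phy files can be expected to be considerably shorter than the corresponding .aln files.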

Created:

2023-10-11