Data from: Transcriptome resources for the frogs Lithobates clamitans and Pseudacris regilla, emphasizing antimicrobial peptides and conserved loci for phylogenetics

Mendeley Data. Updated 2024-06-25; indexed 2024-06-27.

Download link:

https://datadryad.org/stash/dataset/doi:10.5061/dryad.j6676

Resource description:

Annotation spreadsheet for transcriptome assembly of Lithobates clamitans
File: Lithobates.clamitans.annotation.xls

An Excel spreadsheet of contig sequence similarity for the Lithobates clamitans assembly. BLASTX hits to Xenopus tropicalis, HMMER matches to the Pfam-A database (for predicted ORFs), and other annotation information are given. The column headings are as follows:

Contig: contig name
Length (bp): length in base pairs
%GC content: percent of contig sequence that is C or G
Longest ORF (min 50 amino acids): the longest ORF identified on the contig that was at least 50 amino acids long
Pfam (E-value threshold 0.1): a colon-delimited list of matched Pfam accessions, E-values, and domain descriptions
Xenopus.tropicalis BLASTX E-value: E-value of the best BLASTX match to X. tropicalis proteins
bit: bit score of this match
id%: percentage identity (protein sequence) of this match to the query contig
description: descriptive text in the FASTA header of the best match
RefSeq ID: associated RefSeq ID for best match
Entrez gene ID: associated Entrez gene ID for best match
UniProt ID: associated UniProt ID for best match

Annotation spreadsheet for transcriptome assembly of Pseudacris regilla
File: Pseudacris.regilla.annotation.xls

A spreadsheet of contig sequence similarity for the Pseudacris regilla assembly. BLASTX hits to Xenopus tropicalis, HMMER matches to the Pfam-A database (for predicted ORFs), and other annotation information are given. The column headings are as follows:

Contig: contig name
Length (bp): length in base pairs
%GC content: percent of contig sequence that is C or G
Longest ORF (min 50 amino acids): the longest ORF identified on the contig that was at least 50 amino acids long
Pfam (E-value threshold 0.1): a colon-delimited list of matched Pfam accessions, E-values, and domain descriptions
Xenopus.tropicalis BLASTX E-value: E-value of the best BLASTX match to X. tropicalis proteins
bit: bit score of this match
id%: percentage identity (protein sequence) of this match to the query contig
description: descriptive text in the FASTA header of the best match
RefSeq ID: associated RefSeq ID for best match
Entrez gene ID: associated Entrez gene ID for best match
UniProt ID: associated UniProt ID for best match

antimicrobial-peptide-clusters

A FASTA-formatted file containing aligned representative sequences for each cluster of antimicrobial peptide reads and/or contigs. The sequence headers correspond to the phylogeny in Figure 1.

antimicrobial-peptide-cluster-reads

The raw reads that mapped to each sequence cluster given in antimicrobial-peptide-clusters.fasta. This file provides the raw data underlying our clustering of antimicrobial peptide transcripts.

Rana-assembly

3-way-orthologs

A simple list of contigs from each of three transcriptome assemblies that were reciprocal-best TBLASTX matches. Each row represents a triplet of putatively orthologous sequences that may be useful for comparative genomics, primer design, etc.

3-way-aligned-orthologous-segments

This FASTA-formatted file is a series of 56 sequence alignments. Each alignment contains one sequence from each frog transcriptome studied. The sequences were aligned at the protein level using MUSCLE (Edgar 2004) and then converted to nucleotide alignments. Annotation information for each set of contigs can be found in the two annotation.xls spreadsheets. This file provides the expected amplicons from each reference sequence for the primers listed in conserved-primer-candidates-for-orthologous-segments.xls, to guide the selection of sequences that may be useful for a given population-genetic or phylogenetic study.

conserved-primer-candidates-for-orthologous-segments

A spreadsheet containing sets of forward and reverse PCR primers in successive rows, for each of the 56 segments in 3-way-aligned-orthologous-segments.fas. The primers were predicted with BatchPrimer3 (You et al. 2008), and the principal output, such as expected product size and Tm, is included. We also include the estimated dN/dS ratio for each pairwise comparison of the three frog transcriptomes, as an index of the overall conservation of each set of predicted cDNAs. The primer sets have not been systematically evaluated on either cDNA or genomic DNA, and may or may not bridge intronic sequence.

Summary-figure-of-nucleotide-distance-conserved-regions

This figure summarizes the nucleotide distances from the spreadsheet "conserved-primer-candidates-for-orthologous-segments.xls" in order to better guide the selection of loci for molecular phylogenetics. Loci with greater nucleotide distances may be more informative for analysis of closely related species.

Conserved signal peptides of AMPs

Sequence logos of the highly conserved signal peptide and acidic propiece region for each species, based on aligned cluster sequences. Standard amino-acid symbols are used, with dark font representing more acidic residues. The predicted cleavage site is C-terminal to the conserved cysteine residue. Position 24 of the P. regilla alignment is blank because the majority of sequences had a gap at this alignment position.
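
The reciprocal-best criterion behind the 3-way ortholog list can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the best TBLASTX hit for every contig has already been extracted into one dictionary per ordered pair of assemblies, and it assumes a triplet is kept only when all three pairwise comparisons are reciprocal best matches; the description above does not state this explicitly. The function names and the contig names (A1, B1, C1, ...) are hypothetical.

```python
from typing import Dict, List, Tuple

# Maps each query contig to its single best-matching contig in the other assembly.
BestHits = Dict[str, str]


def reciprocal_pairs(x_to_y: BestHits, y_to_x: BestHits) -> Dict[str, str]:
    """Keep pairs (x, y) where x's best hit is y and y's best hit is x."""
    return {x: y for x, y in x_to_y.items() if y_to_x.get(y) == x}


def reciprocal_triplets(
    ab: BestHits, ba: BestHits,
    ac: BestHits, ca: BestHits,
    bc: BestHits, cb: BestHits,
) -> List[Tuple[str, str, str]]:
    """Return (a, b, c) triplets that are reciprocal best matches in all three pairs.

    Assumption: all three pairwise comparisons must agree.
    """
    rb_ab = reciprocal_pairs(ab, ba)
    rb_ac = reciprocal_pairs(ac, ca)
    rb_bc = reciprocal_pairs(bc, cb)
    triplets = []
    for a, b in sorted(rb_ab.items()):
        c = rb_ac.get(a)
        # The contig c reached from a must also be the reciprocal partner of b.
        if c is not None and rb_bc.get(b) == c:
            triplets.append((a, b, c))
    return triplets


if __name__ == "__main__":
    # Hypothetical best-hit tables for three assemblies A, B and C.
    ab = {"A1": "B1", "A2": "B2"}
    ba = {"B1": "A1", "B2": "A3"}  # B2 prefers A3, so A2/B2 is not reciprocal
    ac = {"A1": "C1", "A2": "C2"}
    ca = {"C1": "A1", "C2": "A2"}
    bc = {"B1": "C1", "B2": "C2"}
    cb = {"C1": "B1", "C2": "B2"}
    for row in reciprocal_triplets(ab, ba, ac, ca, bc, cb):
        print("\t".join(row))
```

Each printed row corresponds to one putative ortholog triplet, in the same spirit as a row of the 3-way ortholog list. Requiring agreement in all three comparisons is the stricter choice; a looser variant would anchor on one assembly and require only two reciprocal pairs.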

Created:

2023-06-28