
Supplementary_data.tar.gz

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Supplementary_data_tar_gz/29646311
Description:
The supplementary material contains the following data:

(i) Data corresponding to the Results section ‘Novel ECVs further extend the ancestral host range of the Caulimoviridae’:
RT_aa.fa: FASTA file comprising 369 amino acid sequences corresponding to the RT domain, including 261 newly detected ECRTs (headers: number_TagPlant), 2 RTs from TSA data (headers: TSA_TagPlant), 98 RTs from public data (headers: REF_RT_virusName), and 8 RTs from Ortervirales (headers: REF_RT_OUTGP).
RT_aa_network_guidance_098.aln: Alignment of RT_aa.fa performed with Guidance. This alignment was used to build the phylogenetic network that guided the cutoff selection for the OTU clustering.

(ii) Data corresponding to the Results section ‘Phylogenetic analysis’:
RT_RH_nt_Caulimoviridae.fa: FASTA file comprising 143 nucleotide sequences corresponding to the RT-RH domain, including 73 reference sequences, 69 novel sequences (headers: OTU_number|sequence_id), and the outgroup Ty3.
RT_RH_nt_Caulimoviridae.aln: MAFFT alignment of the sequences from RT_RH_nt_Caulimoviridae.fa.
Caulimoviridae_Bayesian_phylogeny.nexus and Caulimoviridae_MaximumLikelihood_phylogeny.nexus: The two phylogenetic trees built with Bayesian and maximum likelihood methods, respectively, from RT_RH_nt_Caulimoviridae.aln.

(iii) Data corresponding to the Results section ‘Characterization of Caulimovirid Clade C’:
RT_aa_Ortervirales.fa: FASTA file comprising 52 amino acid sequences corresponding to the RT domains, including 28 Ortervirales sequences from the Gypsy database (Llorens et al. 2011) belonging to the families Belpaoviridae, Pseudoviridae, Retroviridae, and Metaviridae, and 24 Caulimoviridae sequences.
RT_aa_Ortervirales.aln: MAFFT alignment built from RT_aa_Ortervirales.fa.
RT_aa_Ortervirales.nwk: Phylogenetic tree built with the maximum likelihood method from RT_aa_Ortervirales.aln.
30K_MP.fa: FASTA file comprising 332 amino acid sequences corresponding to the movement protein domains, including 286 sequences from Butkovic et al. (2024), representing the following plant viral families: Alphaflexiviridae, Aspiviridae, Betaflexiviridae, Bromoviridae, Botourmiaviridae, Caulimoviridae, Fimoviridae, Geminiviridae, Kitaviridae, Mayoviridae, Phenuiviridae, Rhabdoviridae, Secoviridae, Tospoviridae, and Virgaviridae, as well as 46 Caulimoviridae sequences identified using Caulifinder.
30K_MP_trimed05.aln: MAFFT alignment built from 30K_MP.fa.
30K_MP_maximum_likelihood: Phylogenetic tree built with the maximum likelihood method from 30K_MP_trimed05.aln.
WolV1.docx: Sequences of the genome and the 2 ORFs of Wollendovirus1.

(iv) Data corresponding to the Results section ‘Evidence of patterns of cospeciation’:
Agathis_dammara_OTU19_RT_contig.fa: The contig built from the DNA short-read sequences of Agathis dammara. This contig encodes a caulimovirid RT domain.

Licence: CC BY-NC 4.0 (non-commercial use only)
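The header conventions given for RT_aa.fa (number_TagPlant, TSA_TagPlant, REF_RT_virusName, REF_RT_OUTGP) suggest a simple way to check a downloaded copy against the stated counts. The following is a minimal sketch, not part of the dataset: the function names are hypothetical, the exact header layout is assumed from the description only, and the sample headers are made up for illustration.

```python
from collections import Counter


def classify_header(header: str) -> str:
    """Map a FASTA header to one of the four categories named in the description.

    Assumption: headers follow the stated prefixes exactly; anything else
    is reported as "other" rather than guessed.
    """
    name = header.lstrip(">").strip()
    if name.startswith("REF_RT_OUTGP"):
        return "outgroup"          # Ortervirales outgroup RTs
    if name.startswith("REF_RT_"):
        return "reference"         # RTs from public data
    if name.startswith("TSA_"):
        return "tsa"               # RTs from TSA data
    if name.split("_", 1)[0].isdigit():
        return "novel_ecrt"        # newly detected ECRTs (number_TagPlant)
    return "other"


def count_categories(lines) -> Counter:
    """Tally header categories over an iterable of FASTA lines."""
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            counts[classify_header(line)] += 1
    return counts


# Illustrative, invented headers (not taken from the real file).
sample = [
    ">12_PlantA", "MKV",
    ">TSA_PlantB", "MKL",
    ">REF_RT_SomeVirus", "MKI",
    ">REF_RT_OUTGP", "MKA",
]
sample_counts = count_categories(sample)
```

To apply it to the real file, pass an open file handle, for example `count_categories(open("RT_aa.fa"))`. If the headers follow the described convention, the tallies should agree with the counts stated in the description (261 novel ECRTs, 2 TSA, 98 reference, 8 outgroup); a non-zero "other" count would indicate that the assumed layout does not hold.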
Created:
2025-07-25