
Initial consensus sequences

DataCite Commons | Updated 2022-04-14 | Indexed 2024-07-29
Download link:
https://figshare.com/articles/dataset/Initial_consensus_sequences/19596589
Description:
We first prepared a query library by clustering the 165 RT amino acid sequences of Gypsy and Copia reference elements taken from the Gypsy Database version 2 (Llorens et al., 2011) at 40% identity, and selected one representative sequence per cluster (n=35). We then used the resulting RT library to search for homologous regions in the target genomes with tBLASTn from the NCBI BLAST+ package. All overlapping hits on the genomes were merged, and the corresponding FASTA sequences within the expected size range (520-840 bp) were extracted (n=). To avoid unnecessary computation in the following steps, RT sequences from each species were clustered at a threshold of 95% identity using MMseqs2 (Steinegger and Söding, 2017), and a maximum of 50 sequences per group were selected for downstream analysis. The genomic positions of the RT coding regions were extended by 5 kb upstream and downstream, and the corresponding sequences were extracted (n=). The extended hits were then clustered using MMseqs2 (with parameters -c 0.5 --max-seq-len 15000), and groups containing at least 5 sequences were aligned with MAFFT (Katoh et al., 2002). A consensus sequence was then generated for each alignment using the MMseqs2 modules "msa2profile" (with parameters --match-mode 1 --match-ratio 0.5) and "profile2consensus", resulting in 25,565 consensus sequences. To determine the fraction of consensus sequences representing LTR retrotransposons, we compared each one to a library of reference RT amino acid sequences from Copia, Gypsy, DIRS, endogenous retroviruses, Caulimoviridae, and LINEs using BLASTx. Consensus sequences corresponding to LTR retrotransposons were identified as those whose best hit (highest bit score) was against an RT from Copia or Gypsy.
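
The interval-handling steps in the description (merging overlapping hits, keeping the 520-840 bp size range, extending by 5 kb, and picking the best hit by bit score) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the `contig_lengths` dictionary, and the tuple layouts are hypothetical, coordinates are assumed to be 1-based and inclusive as in BLAST tabular output, and strand is ignored when merging, which the description does not specify.

```python
from collections import defaultdict


def merge_overlapping(hits):
    """Merge overlapping (contig, start, end) hits on each contig.

    Assumption: strand is ignored and coordinates are 1-based inclusive.
    """
    by_contig = defaultdict(list)
    for contig, start, end in hits:
        # Reverse-strand hits may be reported with start > end; normalise them.
        by_contig[contig].append((min(start, end), max(start, end)))

    merged = []
    for contig, intervals in by_contig.items():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:  # overlaps the current merged interval
                cur_end = max(cur_end, end)
            else:
                merged.append((contig, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((contig, cur_start, cur_end))
    return merged


def filter_by_length(intervals, min_len=520, max_len=840):
    """Keep intervals whose length falls within the expected RT size range."""
    return [
        (contig, start, end)
        for contig, start, end in intervals
        if min_len <= end - start + 1 <= max_len
    ]


def extend(intervals, contig_lengths, flank=5000):
    """Extend each interval by `flank` bp on both sides, clipped to the contig."""
    return [
        (contig, max(1, start - flank), min(contig_lengths[contig], end + flank))
        for contig, start, end in intervals
    ]


def best_hit_is_ltr(hits, ltr_families=("Copia", "Gypsy")):
    """Return True if the highest-bit-score hit belongs to Copia or Gypsy.

    `hits` is a hypothetical list of (family, bit_score) pairs for one consensus.
    """
    if not hits:
        return False
    family, _ = max(hits, key=lambda hit: hit[1])
    return family in ltr_families


if __name__ == "__main__":
    raw_hits = [("chr1", 100, 500), ("chr1", 450, 800), ("chr2", 900, 300)]
    lengths = {"chr1": 20000, "chr2": 3000}  # hypothetical contig lengths
    kept = filter_by_length(merge_overlapping(raw_hits))
    print(extend(kept, lengths))
    print(best_hit_is_ltr([("Gypsy", 210.0), ("LINE", 95.5)]))
```

Clipping the extension to the contig boundaries is a design choice made here so that extracted regions never run past the end of a sequence; the description does not say how the original pipeline treats hits near contig ends.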
Provider:
figshare
Created:
2022-04-14