
Rickettsia PhyloFlash and Kraken analysis of arthropod whole genome projects from Sequence Read Archive

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://figshare.com/articles/dataset/Rickettsia_PhyloFlash_and_Kraken_analysis_of_arthropod_whole_genome_projects_from_Sequence_Read_Archive/12801140
Description:
To gain insight into the balance of Rickettsia groups within arthropod symbioses, Illumina datasets associated with arthropod whole genome sequence (WGS) projects in the Sequence Read Archive (SRA; 60,409 records as of 20 May 2019) were searched for the presence of Rickettsia. To reduce bias from over-represented laboratory model species (e.g. Drosophila spp., Anopheles spp.), a single dataset per species was examined; where multiple datasets existed for a species, the one with the largest read count was retained. The retained datasets were screened with phyloFlash, which finds, extracts and identifies 16S rRNA sequences. Reconstructed full-length 16S rRNA sequences affiliated with Rickettsia were extracted and compared phylogenetically with sequences derived from the targeted screen (see sections above) to assess group representation within the genus. All SRA datasets that did not yield a reconstructed Rickettsia 16S rRNA sequence with phyloFlash were re-evaluated for microbial composition using Kraken2, a k-mer-based taxonomic classifier for short DNA sequences. A cut-off of at least 40,000 reads assigned to Rickettsia taxa was applied for reporting potential infections (theoretical genome coverage of ~1-4X, assuming an average genome size of ~1.5 Mb). Because Rickettsia-infected protists have previously been reported, phyloFlash was also used to identify reads aligned to protists, so that potential positives attributable to protists rather than insects could be accounted for.
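The two numeric rules in the description (keep the largest dataset per species; report only when at least 40,000 reads are assigned to Rickettsia) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `SraRun` record, the function names, the placeholder accessions and the read lengths are all assumptions made for illustration. Only the 40,000-read cut-off and the ~1.5 Mb genome size come from the description.

```python
from typing import Dict, Iterable, NamedTuple


class SraRun(NamedTuple):
    """Hypothetical, minimal record for one SRA dataset."""
    accession: str   # placeholder identifier, not a real SRA accession
    species: str
    read_count: int


def largest_run_per_species(runs: Iterable[SraRun]) -> Dict[str, SraRun]:
    """Keep one dataset per species: the one with the largest read count."""
    best: Dict[str, SraRun] = {}
    for run in runs:
        current = best.get(run.species)
        if current is None or run.read_count > current.read_count:
            best[run.species] = run
    return best


def theoretical_coverage(n_reads: int, read_length_bp: int,
                         genome_size_bp: float) -> float:
    """Coverage = (number of reads x read length) / genome size."""
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    runs = [
        SraRun("RUN_A", "Drosophila melanogaster", 12_000_000),
        SraRun("RUN_B", "Drosophila melanogaster", 48_000_000),
        SraRun("RUN_C", "Anopheles gambiae", 30_000_000),
    ]
    for species, run in largest_run_per_species(runs).items():
        print(species, run.accession, run.read_count)

    # Values from the description: 40,000 reads, ~1.5 Mb genome.
    # Read lengths are assumed here; the description does not state them.
    for read_length in (50, 100, 150):
        cov = theoretical_coverage(40_000, read_length, 1.5e6)
        print(f"{read_length} bp reads: ~{cov:.1f}X")
```

Under these assumed read lengths, 40,000 reads over a 1.5 Mb genome give roughly 1.3X to 4X, which is consistent with the ~1-4X range quoted in the description.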
Created:
2020-08-13