Rickettsia PhyloFlash and Kraken analysis of arthropod whole genome projects from Sequence Read Archive
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Rickettsia_PhyloFlash_and_Kraken_analysis_of_arthropod_whole_genome_projects_from_Sequence_Read_Archive/12801140
Description:
Insights into the balance of Rickettsia groups
within arthropod symbioses were obtained through searching for Rickettsia
presence in Illumina datasets associated with arthropod whole genome sequence
(WGS) projects in the SRA (60,409 records as of 20 May 2019). To reduce
the bias from over-represented laboratory model species (e.g. Drosophila
spp., Anopheles spp.), a single dataset per species was examined: where
multiple datasets existed for a species, the one with the largest read
count was retained.
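
The one-dataset-per-species rule described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the `Run` record, its field names, and the example accessions and read counts are all hypothetical, and real SRA metadata would need to be mapped onto this shape first.

```python
from typing import Dict, Iterable, NamedTuple


class Run(NamedTuple):
    """Hypothetical record for one sequencing dataset (not an SRA schema)."""
    accession: str
    species: str
    read_count: int


def largest_run_per_species(runs: Iterable[Run]) -> Dict[str, Run]:
    """Keep, for each species, only the dataset with the largest read count."""
    best: Dict[str, Run] = {}
    for run in runs:
        current = best.get(run.species)
        if current is None or run.read_count > current.read_count:
            best[run.species] = run
    return best


# Invented example records, for illustration only.
runs = [
    Run("RUN_A", "Drosophila melanogaster", 12_000_000),
    Run("RUN_B", "Drosophila melanogaster", 48_000_000),
    Run("RUN_C", "Anopheles gambiae", 30_000_000),
]
selected = largest_run_per_species(runs)
for species, run in sorted(selected.items()):
    print(species, run.accession, run.read_count)
```

Ties are resolved in favour of the first dataset seen; the source does not say how ties were handled, so that choice is an assumption.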
These datasets were screened with phyloFlash, which finds, extracts
and identifies 16S rRNA sequences. Reconstructed full-length 16S
rRNA sequences affiliated with Rickettsia were extracted and
compared phylogenetically to sequences derived from the targeted screen (see
sections above) to assess group representation within the genus. The microbial
composition of all SRA datasets that did not yield a reconstructed Rickettsia
16S rRNA sequence with phyloFlash was re-evaluated using Kraken2, a k-mer based
taxonomic classifier for short DNA sequences. A cut-off of at least 40,000 reads
assigned to Rickettsia taxa was applied for reporting potential
infections (theoretical genome coverage of ~1-4X, assuming an average genome
size of ~1.5 Mb). As Rickettsia-infected protists have previously been reported,
phyloFlash was also used to identify reads aligned to protists, to account for
potential positives attributable to protists rather than to insects.
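
The read cut-off and the coverage estimate quoted above can be illustrated with a short sketch. It assumes the usual approximation coverage = reads x read length / genome size. The read lengths used here are assumptions, since the description does not state them; the function names and the sample report lines are hypothetical. The parser assumes the standard six-column, tab-separated Kraken2 report layout (percentage, clade reads, direct reads, rank code, taxonomy ID, name) and takes the genus-level clade count as "reads assigned to Rickettsia taxa".

```python
from typing import Optional

GENOME_SIZE_BP = 1_500_000     # ~1.5 Mb average genome size, from the description
MIN_RICKETTSIA_READS = 40_000  # reporting cut-off, from the description


def theoretical_coverage(n_reads: int, read_length_bp: float,
                         genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Approximate fold coverage as reads * read length / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def rickettsia_clade_reads(report_text: str) -> Optional[int]:
    """Return the clade read count for the genus Rickettsia, if present.

    Assumes the standard six-column Kraken2 report layout.
    """
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue
        rank, name = fields[3].strip(), fields[5].strip()
        if rank == "G" and name == "Rickettsia":
            return int(fields[1])
    return None


def passes_cutoff(n_reads: Optional[int],
                  min_reads: int = MIN_RICKETTSIA_READS) -> bool:
    """True when at least `min_reads` reads fall within the Rickettsia clade."""
    return n_reads is not None and n_reads >= min_reads


# Invented report fragment, for illustration only.
sample_report = (
    " 1.20\t52000\t300\tG\t780\t        Rickettsia\n"
    " 0.90\t39000\t39000\tS\t0\t          Rickettsia sp. (hypothetical)\n"
)
reads = rickettsia_clade_reads(sample_report)
print(reads, passes_cutoff(reads))

# Coverage at the cut-off for a few assumed read lengths.
for length in (50, 100, 150):
    print(length, round(theoretical_coverage(MIN_RICKETTSIA_READS, length), 2))
```

Under these assumptions, 40,000 reads of 150 bp give 4X coverage of a 1.5 Mb genome, while reads of roughly 38 bp give about 1X, which is consistent with the ~1-4X range stated in the description.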
Created:
2020-08-13



