Taxonomy summaries of the metagenomics data sequenced as part of the investigation into the Panzi outbreak.

Source: DataCite Commons. Updated: 2025-11-05. Indexed: 2026-05-03.
Download link:
https://open.flinders.edu.au/articles/dataset/Taxonomy_summaries_of_the_metagenomics_data_sequenced_as_part_of_the_investigation_into_the_Panzi_outbreak_/29879330/2
Description:
These tables contain the taxonomic summaries of the sequences generated as part of the investigation into the disease outbreak in Panzi in 2024. Tony Wawina-Bokalanga and colleagues generated the sequences, and the taxonomic summaries were produced by comparing those sequences to the UniRef50 database with MMseqs2, run through the atavide-lite pipeline. For each taxonomic level (kingdom, phylum, class, order, family, genus, species) there are two files: `raw`, which holds the number of reads that mapped at that taxonomic level, and `norm`, which holds the normalised number of reads that mapped, calculated as the number of reads that mapped at that level divided by the total number of reads that mapped, multiplied by 1,000,000. The tables are tab-separated values that can be read easily by R, Python, Pandas, Excel, OpenOffice, or any other software.
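
The normalisation described above can be illustrated with a minimal Python sketch that uses only the standard library. The table layout (taxon name in the first column, one column per sample), the sample names, and the helper names `read_counts` and `normalise_per_million` are assumptions made for illustration; they are not taken from the dataset or from atavide-lite. The sketch also assumes that the total number of mapped reads for a sample equals the column sum at that level, which may not match how the published `norm` files were computed.

```python
import csv
import io


def read_counts(handle):
    """Read a tab-separated table into (sample_names, {taxon: [counts]}).

    Assumes the first column is the taxon name and every other column
    is one sample (hypothetical layout).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        counts[row[0]] = [float(x) for x in row[1:]]
    return samples, counts


def normalise_per_million(counts):
    """Scale each sample column to reads per million mapped reads."""
    if not counts:
        return {}
    n_samples = len(next(iter(counts.values())))
    # Assumption: total mapped reads per sample is the column sum.
    totals = [sum(values[i] for values in counts.values())
              for i in range(n_samples)]
    return {
        taxon: [v / t * 1_000_000 if t else 0.0
                for v, t in zip(values, totals)]
        for taxon, values in counts.items()
    }


# Hypothetical two-sample table standing in for a `raw` file.
example = "taxon\tsampleA\tsampleB\nBacteria\t750\t10\nViruses\t250\t30\n"
samples, raw = read_counts(io.StringIO(example))
norm = normalise_per_million(raw)
```

To apply the same steps to a downloaded file, pass an open file handle to `read_counts` in place of the `io.StringIO` object.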
Provider:
Flinders University
Created:
2025-08-11