Supporting data for "Meta-Prism 2.0: Enabling algorithm and web server for ultra-fast, memory-efficient, and accurate analysis among millions of microbial community samples"
Mendeley Data | Updated 2024-01-31 | Indexed 2024-06-29
Download link:
http://gigadb.org/dataset/102236
Resource description:
Microbial community samples are accumulating faster than ever, with hundreds of thousands of samples being sequenced each year. Mining such a huge amount of multi-source heterogeneous data is becoming increasingly difficult, so efficient and accurate comparison and search of samples are urgently needed. Faced with millions of samples in the data repository, traditional sample comparison and search approaches fall short in speed and accuracy. Here we propose Meta-Prism 2.0, a microbial community sample analysis method that pushes time and memory efficiency to a new limit without compromising accuracy. Based on a sparse data structure, a time-saving instruction pipeline, and SIMD optimization, Meta-Prism 2.0 enables ultra-fast, memory-efficient, flexible, and accurate search among millions of samples. Meta-Prism 2.0 was tested on several datasets, the largest containing one million samples. Results show that its comparison speed of 0.00001 s per sample pair and its 8 GB memory requirement for searching against one million samples make it one of the most efficient sample analysis methods. Additionally, Meta-Prism 2.0 can achieve accuracy comparable with or better than other contemporary methods. Furthermore, Meta-Prism 2.0 can precisely identify the original biome of samples, thus enabling sample source tracking. Finally, we have provided a web server for fast online search of microbial community samples. In summary, Meta-Prism 2.0 has turned the resource-intensive sample search scheme into an effective procedure that researchers can run every day, even on a laptop, for insightful sample search, similarity analysis, and knowledge discovery. Meta-Prism 2.0 is available on GitHub, and a web server is also available.
Created:
2024-01-31



