
Turtle Software.

Source: Figshare. Updated: 2016-01-18. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Turtle_Identifying_frequent_k_mers_with_cache_efficient_algorithms/791579/2
Description:
We present a novel method that balances time, space, and accuracy requirements to efficiently extract frequent k-mers, even for high-coverage libraries and large genomes such as human. The method is designed to minimize cache misses: it uses a pattern-blocked Bloom filter to remove infrequent k-mers from consideration, combined with a novel sort-and-compact scheme, instead of a hash table, for the actual counting. Although this increases theoretical complexity, the savings in cache misses reduce empirical running times. A variant can instead use a counting Bloom filter for even larger memory savings, at the expense of false negatives in addition to the false positives common to all Bloom-filter-based approaches. A comparison with the state of the art shows reduced memory requirements and running times.
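The two-stage idea in the description (filter out rare k-mers with a Bloom filter, then count the survivors by sorting and compacting) can be illustrated with a minimal Python sketch. This is not the Turtle implementation. It assumes a plain Bloom filter rather than the pattern-blocked one, treats "frequent" as "seen at least twice", ignores reverse complements, and makes no attempt at cache efficiency. The names `BloomFilter`, `add_and_check`, and `frequent_kmers` are hypothetical and chosen only for this illustration.

```python
import hashlib
from itertools import groupby


class BloomFilter:
    """Plain Bloom filter (assumption: NOT the pattern-blocked variant)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive all bit positions from one digest via double hashing.
        digest = hashlib.blake2b(item.encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add_and_check(self, item):
        """Insert item; return True if it was (probably) present already."""
        seen = True
        for pos in self._positions(item):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def frequent_kmers(reads, k, bloom=None):
    """Return {kmer: count} for k-mers that occur at least twice."""
    if bloom is None:
        bloom = BloomFilter()
    candidates = []
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # The first sighting only sets bits; repeats are kept as candidates.
            if bloom.add_and_check(kmer):
                candidates.append(kmer)
    # Sort-and-compact: sorting makes equal k-mers adjacent, and each run
    # collapses to one entry. Add 1 for the sighting the filter absorbed.
    candidates.sort()
    return {kmer: len(list(run)) + 1 for kmer, run in groupby(candidates)}


if __name__ == "__main__":
    print(frequent_kmers(["ACGTACGT", "CGTA"], k=4))
```

In this toy input, `ACGT` and `CGTA` each occur twice and are reported, while `GTAC` and `TACG` occur once and never reach the candidate list. A Bloom-filter false positive would let a singleton through with an inflated count, which is the accuracy trade-off the description mentions.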
Provided by:
Rajat Roy; Debashish Bhattacharya
Created:
2013-09-07