Minimizer collision statistics (BLEND: A Fast, Memory-Efficient, and Accurate Mechanism to Find Fuzzy Seed Matches in Genome Analysis)
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7317895
Description:
This dataset contains statistics on minimizers that produce the same hash value (i.e., collisions). The hash values are generated using a low-collision hash function and the SimHash technique in BLEND.
The *collision_stats.txt files give the overall collision statistics for one tool in one configuration: the number of colliding minimizer pairs at each edit distance, and the ratio of that number to the total number of collisions. For example, blend_n3_collision_stats.txt holds the statistics for BLEND run with the number of neighbors set to 3.
The files ending in _sim.csv list every pair of minimizers that share a hash value, together with the edit distance between the two minimizers.
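The relationship between the two file types can be sketched in Python: given the pair list, count the pairs at each edit distance and divide by the total number of collisions. This is only an illustration, not the script used to produce the dataset. The column layout (first minimizer, second minimizer, edit distance, no header row) is an assumption, the function name collision_stats is hypothetical, and the sample rows are made up.

```python
import csv
import io
from collections import Counter


def collision_stats(rows):
    """Map each edit distance to (pair count, share of all collisions).

    Assumes each row is [minimizer_a, minimizer_b, edit_distance];
    this layout is a guess, so check it against the real files.
    """
    counts = Counter(int(row[2]) for row in rows)
    total = sum(counts.values())
    return {d: (n, n / total) for d, n in sorted(counts.items())}


# Made-up sample rows, for illustration only (not taken from the dataset).
sample = io.StringIO(
    "ACGTACGT,ACGTACGT,0\n"
    "ACGTACGT,ACGTACGA,1\n"
    "ACGTACGT,ACGAACGA,2\n"
    "TTGCAAGC,TTGCAAGG,1\n"
)

stats = collision_stats(csv.reader(sample))
for distance, (count, ratio) in stats.items():
    print(f"edit distance {distance}: {count} pairs ({ratio:.2%} of collisions)")
```

To run this on a real file, replace the sample with an open file handle, for example csv.reader(open(path, newline="")), and adjust the column index if the layout differs.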
Created:
2022-11-14



