A benchmark of computational CRISPR-Cas9 guide design methods
Source: Figshare | Updated: 2019-08-29 | Indexed: 2026-04-29
Download link:
https://figshare.com/articles/dataset/A_benchmark_of_computational_CRISPR-Cas9_guide_design_methods/9750209
Description:
The popularity of CRISPR-based gene editing has resulted in an abundance of tools to design CRISPR-Cas9 guides. This is also driven by the fact that designing highly specific and efficient guides is a crucial, but not trivial, task in using CRISPR for gene editing. Here, we thoroughly analyse the performance of 18 design tools. They are evaluated based on runtime performance, compute requirements, and guides generated. To achieve this, we implemented a method for auditing system resources while a given tool executes, and tested each tool on datasets of increasing size, derived from the mouse genome. We found that only five tools had a computational performance that would allow them to analyse an entire genome in a reasonable time, and without exhausting computing resources. There was wide variation in the guides identified, with some tools reporting every possible guide while others filtered for predicted efficiency. Some tools also failed to exclude guides that would target multiple positions in the genome. We also considered two collections with over a thousand guides each, for which experimental data is available. There is a lot of variation in performance between the datasets, but the relative order of the tools is partially conserved. Importantly, the most striking result is a lack of consensus between the tools. Our results show that CRISPR-Cas9 guide design tools need further work in order to achieve rapid whole-genome analysis and that improvements in guide design will likely require combining multiple approaches.
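The description mentions auditing system resources while a tool executes but does not say how this was done. As a rough illustration only, and not the authors' implementation, the sketch below times a child process and reads its CPU time and peak resident memory from the operating system. The function name `audit_command` and the returned field names are hypothetical, and the code assumes a Unix-like system where Python's `resource` module is available.

```python
import resource
import subprocess
import sys
import time


def audit_command(cmd):
    """Run cmd to completion and report its exit code, wall time, CPU time and peak memory.

    Hypothetical helper for illustration; not taken from the benchmark itself.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    completed = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    wall_seconds = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU time (user + system) consumed by children waited for during this call.
    cpu_seconds = (after.ru_utime - before.ru_utime) + (
        after.ru_stime - before.ru_stime
    )
    return {
        "exit_code": completed.returncode,
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        # High-water mark across all waited-for children so far;
        # reported in KiB on Linux and in bytes on macOS.
        "peak_rss": after.ru_maxrss,
    }


if __name__ == "__main__":
    # Stand-in workload; a real run would pass a guide design tool's command line.
    print(audit_command([sys.executable, "-c", "x = bytearray(10**7)"]))
```

Because `ru_maxrss` is a running maximum over every child the calling process has waited for, each tool would need to be audited from a fresh Python process for the peak-memory figure to be attributable to that tool alone. Tracking usage over time, rather than only the peak, would require periodic sampling of the running process instead.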
Created:
2019-08-29



