
Prediction and diversity of tracrRNAs from type II CRISPR-Cas systems

Source: NIAID Data Ecosystem; indexed 2026-03-11
Download link:
https://figshare.com/articles/dataset/Prediction_and_diversity_of_tracrRNAs_from_type_II_CRISPR-Cas_systems/6807203
Description:
Type II CRISPR-Cas9 systems require a small RNA, the trans-activating CRISPR RNA (tracrRNA), in order to function. Predicting these non-coding RNAs in prokaryotic genomes is challenging because their structures are dissimilar, with short stems (3–6 bp) and non-canonical base pairs (e.g. G-A). Much of the tracrRNA is involved in base-pairing interactions with the CRISPR RNA or with itself, or in RNA-protein interactions with Cas9. Here we develop a new bioinformatic tool to predict tracrRNAs. On an experimentally verified test set, the algorithm achieved high sensitivity and specificity, and it showed a low false discovery rate (FDR) in genome analysis. Analysis of 5462 representative RefSeq genomes detected 275 tracrRNAs from 165 genera. These tracrRNAs could be grouped into 15 clusters, which were used to build covariance models. The clusters included the Streptococci and Staphylococci tracrRNAs from the CRISPR-Cas9 systems currently used for gene editing. Compensating base changes observed in the models were consistent with the experimental structures of single guide RNAs (sgRNAs). Other clusters, for which no structures are yet available, were predicted to form novel tracrRNA folds. These clusters included a large and divergent tracrRNA set from Bacteroidetes. These computational models contribute to the understanding of CRISPR-Cas biology and will assist in the design of further engineered CRISPR-Cas9 systems. The tracrRNA prediction software is available through a Galaxy web server.
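The description reports sensitivity, specificity, and FDR without restating how they are defined. As a minimal sketch using the standard definitions, the three metrics can be computed from a confusion matrix as shown here. The `Confusion` class, the function names, and the counts in the usage note are illustrative assumptions; they are not taken from the dataset or from the prediction tool.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from comparing predictions against a verified test set (hypothetical container)."""
    tp: int  # true tracrRNAs that were predicted
    fp: int  # predictions that are not true tracrRNAs
    tn: int  # negatives correctly left unpredicted
    fn: int  # true tracrRNAs that were missed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else math.nan


def sensitivity(c: Confusion) -> float:
    """Fraction of true tracrRNAs that were recovered: TP / (TP + FN)."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Confusion) -> float:
    """Fraction of negatives correctly rejected: TN / (TN + FP)."""
    return _ratio(c.tn, c.tn + c.fp)


def false_discovery_rate(c: Confusion) -> float:
    """Fraction of predictions that are wrong: FP / (FP + TP)."""
    return _ratio(c.fp, c.fp + c.tp)
```

For example, with made-up counts `Confusion(tp=9, fp=1, tn=19, fn=1)`, `sensitivity` divides 9 by 10, `specificity` divides 19 by 20, and `false_discovery_rate` divides 1 by 10. Sensitivity and specificity need a labelled test set, whereas FDR depends only on the predictions made, which is why it suits whole-genome scans where true negatives are not enumerated.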

Created:
2018-08-02