
Comparative analysis of RNA 3D structure prediction methods: towards enhanced modeling of RNA-Ligand interactions

Source: doi.org; indexed 2025-03-21
Download link:
http://doi.org/10.17632/8yg88x7rdk.3
Description:
This dataset accompanies the publication "Comparative Analysis of RNA 3D Structure Prediction Methods: Towards Enhanced Modeling of RNA-Ligand Interactions." The study's primary objective was to evaluate how accurately various methods model RNA structures, with a particular focus on RNA-small molecule complexes and ligand-binding sites. We evaluated six RNA 3D structure prediction programs (DeepFoldRNA, RhoFold, BRiQ, FARFAR2, SimRNA, and Vfold2), using RNA sequences as the standard input across all methods. FARFAR2, SimRNA, and Vfold2 were run both with and without secondary structure information. BRiQ requires secondary structure restraints to operate and was therefore run only with them. The dataset is organized into sub-directories named after each method. For SimRNA, FARFAR2, and Vfold2, directories for runs without secondary structure input carry the method's name alone, whereas runs that included secondary structure information are marked with an '_ss' suffix (e.g., SimRNA_ss). Where secondary structures were used, we employed ideal secondary structures derived from the reference structure, extracted with the x3dna-dssr program v1.9.10. All secondary structures were manually inspected and refined to correct any anomalies introduced by x3dna-dssr. AlphaFold 3 was released during the final stages of preparing this publication. To benchmark the performance of all ML-based methods (AlphaFold 3, DeepFoldRNA, and RhoFold), we developed two new datasets: Blind set 1 (B1) and Blind set 2 (B2) (see Supplementary Table S2).
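The sub-directory naming convention described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`parse_run_dir`, `expected_run_dirs`) are hypothetical, the on-disk spelling of each directory is assumed to match the method names as written in the description, and the BRiQ directory is assumed to carry no suffix even though BRiQ always uses secondary structure restraints. Any directories for AlphaFold 3 or the blind sets are not covered, since the description does not say how they are named.

```python
# Method names as listed in the description; exact on-disk spelling is an assumption.
METHODS = ("DeepFoldRNA", "RhoFold", "BRiQ", "FARFAR2", "SimRNA", "Vfold2")
# Methods the description says were run both with and without secondary structure.
OPTIONAL_SS = ("FARFAR2", "SimRNA", "Vfold2")
SS_SUFFIX = "_ss"


def parse_run_dir(name: str) -> tuple[str, bool]:
    """Return (method, used_secondary_structure) for a sub-directory name."""
    if name.endswith(SS_SUFFIX):
        method = name[: -len(SS_SUFFIX)]
        if method not in OPTIONAL_SS:
            raise ValueError(f"unexpected '_ss' directory: {name}")
        return method, True
    if name not in METHODS:
        raise ValueError(f"unknown method directory: {name}")
    # BRiQ always needs secondary structure restraints, so it counts as an
    # SS run even though (by assumption) its directory has no suffix.
    return name, name == "BRiQ"


def expected_run_dirs() -> list[str]:
    """List the sub-directory names implied by the naming convention."""
    dirs = list(METHODS)
    dirs += [m + SS_SUFFIX for m in OPTIONAL_SS]
    return sorted(dirs)
```

Under these assumptions the convention implies nine method directories: six plain names plus three '_ss' variants. Comparing `expected_run_dirs()` against the actual contents of the downloaded archive would be a quick way to check whether the assumed spellings hold.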

Provider:
Mendeley Data