
High-quality RNA residues: RNA2023

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/8103013
Resource description:
Introduction
--------------------------------------------------------------------------------
This is the RNA2023 dataset by the Richardson Lab at Duke University.
These are high-quality residues from high-quality, low-redundancy RNA chains in the PDB.

For a similar set of quality-filtered protein residues, see the top2018 datasets at:
https://doi.org/10.5281/zenodo.4626149
https://doi.org/10.5281/zenodo.5115232

Corresponding authors
--------------------------------------------------------------------------------
dcrjsr at kinemage.biochem.duke.edu
christopher.sci.williams at gmail.com

Usage recommendations
--------------------------------------------------------------------------------
RNA residues that fail the filtering criteria described below have been removed from the files. As a result, these files can be considered pre-filtered and will return only results for residues of good model quality with supporting experimental data.

Files already contain hydrogens added by Reduce in the context of the original full models.

Two datasets are provided. The standard dataset is rna2023_pruned, and we recommend it as the default. The RNA backbone conformational space is highly diverse, and some real conformations fall below the statistical threshold for recognition as suites. Therefore we do not recommend excluding suite outliers from the dataset except in specialty cases.

We also provide an rna2023_nosuiteout dataset, in which no residues having "!!" outlier suite identifications are permitted. This set may be useful in specialist cases where suite outliers are undesirable or where losing some real conformations is an acceptable sacrifice for maximal filtering.

Each dataset also has an mmCIF version.

Note: Chains are named based on author chain ids, except for 8b0x, chain a. To avoid conflicts with 8b0x chain A in file systems that do not support case-sensitive file names, 8b0x chain a has been renamed to chain AB, matching its PDB/mmCIF designation.
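The chain-naming note can be illustrated with a minimal sketch that flags chain labels differing only by letter case, which is the situation that would collide on a case-insensitive file system. The function name and the (pdb_id, chain_id) tuple layout are assumptions made for this example; they are not part of the dataset or its tooling.

```python
from collections import defaultdict

def case_insensitive_collisions(chains):
    """Group (pdb_id, chain_id) pairs whose labels differ only by letter case.

    `chains` is a hypothetical list of (pdb_id, chain_id) tuples.
    """
    groups = defaultdict(list)
    for pdb_id, chain_id in chains:
        key = (pdb_id.lower(), chain_id.lower())
        pair = (pdb_id, chain_id)
        if pair not in groups[key]:  # ignore exact duplicates
            groups[key].append(pair)
    # Only groups with more than one distinct spelling are collisions.
    return [group for group in groups.values() if len(group) > 1]

# Author chain ids before renaming: "A" and "a" clash when case is ignored.
collisions = case_insensitive_collisions(
    [("8b0x", "A"), ("8b0x", "a"), ("6ugg", "A")]
)

# After renaming chain a to AB, no collision remains.
renamed = case_insensitive_collisions(
    [("8b0x", "A"), ("8b0x", "AB"), ("6ugg", "A")]
)
```

Here `collisions` holds the single clashing pair from 8b0x, and `renamed` is empty.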
Additional files
--------------------------------------------------------------------------------
rna2023_pdbmetadata.csv contains information on release date, resolution, title, authors, etc. for each source PDB entry.
rna2023_chain_list contains a list of all included chains, plus statistics on the number of residues from the original chain that passed the quality filters.
rna2023_suitename_table.csv and rna2023_suitename_table_nosuiteout.csv contain suitename identifications of rotameric RNA backbone conformations (1a, 1c, 2u, 6d, etc.) precomputed for convenience.

Filtering criteria: Chain level
--------------------------------------------------------------------------------
The chain list was derived from http://rna.bgsu.edu/rna3dhub/nrlist, version 3.150 as of 2020/10/28, with a 1.9Å resolution cutoff.
We added 6ugg chain A and two recent EM ribosome structures: 8a3d and 8b0x.
After residue-level filtering, chains with no complete suites were removed.

Filtering criteria: Residue level
--------------------------------------------------------------------------------
Even excellent structures usually contain some poorly-resolved regions. Residue-level filtering helps avoid including these regions in otherwise high-quality data.

Residues are required to meet the following validation quality criteria:
No sugar pucker outliers
No steric overlaps or "clashes" >= 0.5Å, as per Probe
No covalent bond or angle geometry outliers
Optionally, no !! suite outliers

Residues from X-ray structures are required to meet the following fit-to-map criteria:
Average of worst 2 atoms' 2Fo-Fc map values >= 1.2
Average of worst 2 atoms' RSCC scores >= 0.7
No atoms modeled at partial occupancy

Residues from EM structures are required to meet the following fit-to-map criteria:
RSCC >= 0.7
Residue inclusion fraction = 1.0 or >= 0.95, depending on structure
No atoms modeled at partial occupancy

Filtering is documented in each pruned file. See USER  DOC lines in .pdb and data_rna2023_dataset loops in .cif.

Version history
--------------------------------------------------------------------------------
Version 1.0
Jun 30, 2023
Initial version
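As a reading aid, the residue-level criteria listed under "Filtering criteria: Residue level" can be expressed as a single predicate. This is a sketch only: the `ResidueMetrics` record, its field names, and the `passes_filters` function are hypothetical and do not reflect the Richardson Lab's actual pipeline or the layout of the distributed files. The thresholds are the ones stated in the text.

```python
from dataclasses import dataclass

@dataclass
class ResidueMetrics:
    """Hypothetical per-residue record; field names are illustrative only."""
    method: str                      # "xray" or "em"
    pucker_outlier: bool             # sugar pucker outlier flag
    worst_clash: float               # largest Probe overlap in Å (0.0 if none)
    geometry_outlier: bool           # covalent bond or angle outlier flag
    suite_outlier: bool              # True if the suite is "!!"
    partial_occupancy: bool          # any atom modeled at partial occupancy
    rscc: float                      # xray: average of the worst 2 atoms' RSCC
    map_value: float = 0.0           # xray only: average of worst 2 atoms' 2Fo-Fc
    inclusion_fraction: float = 1.0  # em only

def passes_filters(r, exclude_suite_outliers=False, min_inclusion=1.0):
    """Return True if a residue meets the criteria described in the text.

    exclude_suite_outliers=True mirrors the optional "no !! suite outliers" rule.
    min_inclusion is 1.0 or 0.95, "depending on structure" (chosen by the caller).
    """
    # Validation criteria shared by all structures.
    if r.pucker_outlier or r.geometry_outlier:
        return False
    if r.worst_clash >= 0.5:
        return False
    if exclude_suite_outliers and r.suite_outlier:
        return False
    # Fit-to-map criteria shared by X-ray and EM.
    if r.partial_occupancy or r.rscc < 0.7:
        return False
    # Method-specific fit-to-map criteria.
    if r.method == "xray":
        return r.map_value >= 1.2
    if r.method == "em":
        return r.inclusion_fraction >= min_inclusion
    raise ValueError(f"unknown method: {r.method!r}")

xray_res = ResidueMetrics("xray", False, 0.1, False, True, False, 0.85, map_value=1.5)
em_res = ResidueMetrics("em", False, 0.0, False, False, False, 0.8, inclusion_fraction=0.97)
```

With these example records, `xray_res` passes the default filter but fails when `exclude_suite_outliers=True`, and `em_res` passes only when `min_inclusion=0.95`.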
Created:
2023-07-28