Molecular geometries and energies from quantum mechanical calculations and small molecule force field evaluations.
Source: Zenodo. Updated: 2020-11-06. Indexed: 2026-05-25.
Download link:
https://zenodo.org/record/4247859
Description:
Force fields are used in a wide variety of contexts for classical molecular simulation, including studies on protein-ligand binding, membrane permeation, and thermophysical property prediction.
The quality of these studies relies on the quality of the force fields used to represent the systems.
Focusing on small molecules of fewer than 50 heavy atoms, this dataset compares nine force fields: GAFF, GAFF2, MMFF94, MMFF94S, OPLS3e, SMIRNOFF99Frosst, and the Open Force Field Parsley, versions 1.0, 1.1, and 1.2.
On a dataset comprising 22,675 molecular structures of 3,271 molecules, we analyzed force field-optimized geometries and conformer energies compared to reference quantum mechanical (QM) data.

The data was created using scripts from the benchmarkff GitHub repository. A corresponding manuscript has been submitted; a preprint is available on ChemRxiv:
Lim, Victoria T.; Hahn, David F.; Tresadern, Gary; Bayly, Christopher I.; Mobley, David (2020): Benchmark Assessment of Molecular Geometries and Energies from Small Molecule Force Fields. ChemRxiv. Preprint.

Read below or see the file README.md for further information and a description of the content:

# README

Version: 04 Nov 2020

For Python scripts that are NOT found in these directories, please check the [BenchmarkFF Github repo](https://github.com/MobleyLab/benchmarkff/tree/master/tools).

## Procedure

1. Prepare the OPLS3e file for analysis: standardize the format with OpenEye in case of differences, and convert energies from kJ/mol to kcal/mol.
   ```
   cd prep
   python convert_extension.py -i opls3e_minimized.sd -o opls3e.sdf
   ```
2. Remove molecules that could not be parameterized by ALL force fields.
   ```
   python get_by_tag.py -i opls3e.sdf -s "SMILES QCArchive" -list trim3.txt -o trim3_full_opls3e.sdf
   ```
3. Run analysis.
   ```
   conda activate parsley

   # calc ddE, RMSD, and TFD distributions
   python compare_ffs.py -i match.in -t 'SMILES QCArchive' --plot > metrics.out

   # match_minima, only in 01_analysis_all and 02_analysis_all_smaller_cutoff
   python match_minima.py -i match.in --plot --cutoff 1.0 --readpickle

   # look at specific subsets, only in 01_analysis_all
   python color_by_moiety.py -i match.in -p metrics.pickle -s N-N.dat azetidine.dat octahydrotetracene.dat -o scatter_tfd_3_

   # look at outliers, only in 01_analysis_all and 02_analysis_all_smaller_cutoff
   python tailed_parameters.py -i refdata_trim_overlap_full_openff_unconstrained-1.2.0.sdf -f <offxml file> --metric 'TFD' --cutoff 0.12 --tag "TFD to trim_overlap_full_qcarchive.sdf" --tag_smiles "SMILES QCArchive" > output_tfd.dat
   ```

## Brief description of contents

* High level:
```
.
├── 00_prep
│   ├── convert_extension.py
│   ├── opls3e_minimized.sd           OPLS3e minimized structures from Schrodinger Maestro
│   ├── opls3e.sdf                    standardized through OpenEye tools
│   └── opt_openff*.sdf               OpenFF minimized conformations
├── 01_analysis_all                   compare all ffs (qm, GAFF(2), MMFF94(S), Smirnoff, OpenFF-X.X, OPLS3e)
├── 02_analysis_all_smaller_cutoff    compare all ffs (qm, GAFF(2), MMFF94(S), Smirnoff, OpenFF-X.X, OPLS3e) with a smaller cutoff of .3 for match_minima
├── 03_analysis_latest_ffs            compare only the latest versions of ffs (qm, GAFF2, MMFF94S, OpenFF-1.2, OPLS3e)
├── 04_analysis_openff_only           compare only OpenFF ffs (qm, Smirnoff, OpenFF-X.X)
└── README.md
```

* Inside an output directory:
```
YY_analysis_*             various output files of above mentioned scripts, some are listed and described below:
├── bar*.png              parameter coverage bar plots
├── ddE.dat               relative energies data
├── fig_density_*.png     scatter plots of ddE vs (RMSD or TFD) for each force field
├── match.in              input file for compare_ffs.py
├── metrics.out           output file for compare_ffs.py
├── metrics.pickle        pickle file for compare_ffs.py -- you can read this into compare_ffs instead of rerunning the full analysis
├── refdata_*.sdf         output SDF files with stored RMSD / TFD scores with reference to QM for each structure
├── relene_*.dat          relative energies of matched conformers
├── ridge_dde.png         compared energies plot
├── ridge_rmsd.svg        compared rmsds plot
├── ridge_tfd.svg         compared tfds plot
├── fig_scatter_*.png     scatter plots of ddE vs (RMSD or TFD). these are noisy; I don't use these
├── trim3_*.sdf           input SDF files for compare_ffs.py listed in match.in file
└── violin*.*             violin plot showing ddE distributions
```
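The `refdata_*.sdf` files described above store per-structure RMSD / TFD scores as SD data tags, and the outlier step selects structures by a TFD cutoff of 0.12. The following is a minimal sketch of how such tags could be read and filtered with the Python standard library alone. It is not the `tailed_parameters.py` implementation: the function names `iter_sdf_tags` and `select_above_cutoff` are hypothetical, the strict `>` comparison is an assumption, and the two records in `EXAMPLE` are made-up placeholders rather than data from this archive. Only the tag names and the cutoff value are taken from the README.

```python
from typing import Dict, Iterator, List, Tuple


def iter_sdf_tags(sdf_text: str) -> Iterator[Dict[str, str]]:
    """Yield one {tag: value} dict per SDF record (records end with '$$$$')."""
    tags: Dict[str, str] = {}
    current = None  # name of the data item currently being read, if any
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            yield tags
            tags, current = {}, None
        elif line.startswith(">") and "<" in line and ">" in line[1:]:
            # Data header, e.g.  > <TFD to trim_overlap_full_qcarchive.sdf>
            current = line[line.index("<") + 1 : line.rindex(">")]
            tags[current] = ""
        elif current is not None:
            if line.strip() == "":
                current = None  # a blank line closes the data item
            else:
                tags[current] = (tags[current] + "\n" + line).strip()


def select_above_cutoff(
    sdf_text: str, metric_tag: str, smiles_tag: str, cutoff: float
) -> List[Tuple[str, float]]:
    """Return (SMILES, score) pairs whose score exceeds the cutoff.

    Assumption: 'outlier' means score > cutoff; the real script may differ.
    """
    hits = []
    for tags in iter_sdf_tags(sdf_text):
        try:
            score = float(tags[metric_tag])
        except (KeyError, ValueError):
            continue  # skip records without a usable score
        if score > cutoff:
            hits.append((tags.get(smiles_tag, ""), score))
    return hits


# Made-up two-record example; the mol blocks are placeholders and are ignored.
EXAMPLE = """\
mol_a
  placeholder

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <SMILES QCArchive>
CCO

> <TFD to trim_overlap_full_qcarchive.sdf>
0.05

$$$$
mol_b
  placeholder

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <SMILES QCArchive>
CNN

> <TFD to trim_overlap_full_qcarchive.sdf>
0.31

$$$$
"""

if __name__ == "__main__":
    for smiles, tfd in select_above_cutoff(
        EXAMPLE, "TFD to trim_overlap_full_qcarchive.sdf", "SMILES QCArchive", 0.12
    ):
        print(smiles, tfd)
```

To try this on a real file from the archive, one would read the file contents with `open(path).read()` and pass the resulting string in place of `EXAMPLE`. A cheminformatics toolkit is the better choice when the molecular structures themselves are needed.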
Provided by:
Zenodo
Created:
2020-11-05



