
solvechemistry/catecholbenchmark

Hugging Face · Updated 2025-12-05 · Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/solvechemistry/catecholbenchmark
Resource description:
---
license: mit
language:
- en
tags:
- che
size_categories:
- 1K<n<10K
---

# Summary

Catechol dataset for solvent selection and machine learning.

**NeurIPS Paper (Datasets and Benchmarks Track): [The Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning](https://arxiv.org/abs/2506.07619).**

# Data files

## Main data files

- **catechol_full_data_yields.csv**: Full dataset, including mixture solvents.
- **catechol_single_solvent_yields.csv**: Only the single-solvent data.
- **claisen_data_clean.csv**: Allyl phenyl ether dataset from an external source.

## Lookup tables

Tables translating solvent names, as tabulated in the main data files, to various pre-computed ML-readable representations:

- **acs_pca_descriptors_lookup.csv**: The ACS Solvent Selection Guide's principal component analysis representation.
- **drfps_lookup.csv**: Fingerprint representation created from the difference between the sets of molecular substructures to the left and to the right of the reaction arrow in a SMILES string.
- **fragprints_lookup.csv**: Fragprints, a combination of molecular fingerprints (bit vectors indicating the presence of substructures in the molecule) and molecular fragments (count vectors indicating the number of times specific functional groups appear).
- **spange_descriptors_lookup.csv**: Representation based on measurable properties of solvents.
- **smiles_lookup.csv**: SMILES strings.
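The lookup tables are meant to be joined onto the main data files by solvent name. The following is a minimal sketch of that join using only the Python standard library. The column names (`solvent`, `d1`, `d2`, `yield`) and all values are hypothetical stand-ins, since the actual CSV headers are not listed here; check the real files before adapting it. The float conversion assumes a numeric lookup table, so it would not apply to `smiles_lookup.csv`.

```python
import csv
import io


def load_lookup(handle, key_column):
    """Read a numeric lookup CSV into {solvent name: {descriptor: float}}."""
    table = {}
    for row in csv.DictReader(handle):
        name = row.pop(key_column)
        table[name] = {col: float(val) for col, val in row.items()}
    return table


def attach_descriptors(rows, lookup, solvent_column):
    """Return copies of `rows` with each solvent's descriptors merged in."""
    merged = []
    for row in rows:
        # A KeyError here flags a solvent missing from the lookup table.
        descriptors = lookup[row[solvent_column]]
        merged.append({**row, **descriptors})
    return merged


# Toy stand-ins for a lookup table and a data file.
# Column names and values are invented for illustration only.
lookup_csv = io.StringIO("solvent,d1,d2\nMethanol,0.5,1.0\nWater,2.0,3.5\n")
data_csv = io.StringIO("solvent,yield\nWater,0.25\nMethanol,0.75\n")

lookup = load_lookup(lookup_csv, "solvent")
rows = list(csv.DictReader(data_csv))
featurized = attach_descriptors(rows, lookup, "solvent")
print(featurized[0])
```

With the real files, the `io.StringIO` objects would be replaced by `open(...)` handles on a lookup CSV and a main data CSV, and the key column by whatever header those files actually use for the solvent name.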
Provided by:
solvechemistry