BASF-AI/CoconutSMILES2FormulaPC
Source: Hugging Face; updated 2024-10-04; indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/BASF-AI/CoconutSMILES2FormulaPC
Description:
---
language:
- en
size_categories:
- 1M<n<10M
task_categories:
- text-classification
pretty_name: CoconutDB SMILES to Formula Pair Classification
dataset_info:
  features:
  - name: formula
    dtype: string
  - name: smiles
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 457537
    num_examples: 4000
  download_size: 210648
  dataset_size: 457537
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
tags:
- chemistry
- coconutdb
- SMILES
- chemteb
---
# CoconutDB SMILES to Formula Pair Classification
This dataset contains pairs of SMILES strings (both isomeric and canonical) and molecular formulas, each with a binary label indicating whether the two refer to the same chemical entity. A label of 1 means the SMILES string and the molecular formula correspond to the same entity; a label of 0 means they do not. Each record has three fields (`formula`, `smiles`, and `label`), and the single `test` split holds 4,000 examples. The data is sourced from [CoconutDB](https://coconut.naturalproducts.net/) and is intended for pair-classification tasks such as chemical entity matching and molecular formula analysis.
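As a rough illustration of the formula side of the matching task, the sketch below parses a molecular formula into element counts and compares two formulas by composition. It is not part of the dataset or its tooling: the helper names `parse_formula` and `same_composition` are hypothetical, and the parser assumes plain formulas made of element symbols with optional integer counts (no charges, parentheses, or isotope labels). The example rows are made up for illustration and are not taken from the dataset.

```python
import re
from collections import Counter

# One element symbol (capital letter plus optional lowercase letter)
# followed by an optional integer count, e.g. "C6", "H", "Cl2".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Return element counts for a simple formula such as 'C6H12O6'.

    Assumption: no charges, parentheses, or isotopes. Anything else
    raises ValueError instead of being silently skipped.
    """
    if not formula:
        raise ValueError("empty formula")
    counts = Counter()
    pos = 0
    for match in _TOKEN.finditer(formula):
        if match.start() != pos:  # an unparsed character was skipped
            raise ValueError(f"unsupported formula: {formula!r}")
        symbol, digits = match.groups()
        counts[symbol] += int(digits) if digits else 1
        pos = match.end()
    if pos != len(formula):  # trailing unparsed characters
        raise ValueError(f"unsupported formula: {formula!r}")
    return counts


def same_composition(formula_a: str, formula_b: str) -> bool:
    """True if both formulas have identical element counts, in any order."""
    return parse_formula(formula_a) == parse_formula(formula_b)


# Hypothetical rows using the dataset's schema (formula, smiles, label).
rows = [
    {"formula": "C2H6O", "smiles": "CCO", "label": 1},   # ethanol, matching pair
    {"formula": "C2H4O2", "smiles": "CCO", "label": 0},  # mismatched pair
]

for row in rows:
    print(row["smiles"], dict(parse_formula(row["formula"])), row["label"])
```

To turn this into a rule-based baseline, one would still need a formula derived from each SMILES string, for example with a cheminformatics toolkit such as RDKit, and could then compare it to the `formula` field with `same_composition`.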
Provider:
BASF-AI



