hheiden/USPTO_OCSR_benchmark
Source: Hugging Face. Updated 2025-10-10; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/hheiden/USPTO_OCSR_benchmark
Description:
The USPTO OCSR Benchmark is a large validation and benchmark set for Optical Chemical Structure Recognition (OCSR). It contains 5,719 chemical structure images sourced from United States Patent and Trademark Office (USPTO) documents and has been curated to provide accurate ground-truth MOL files. The Hugging Face version further enriches the data with pre-computed SMILES, InChI, and SELFIES strings for each molecule.
Provider:
hheiden



