BASF-AI/CoconutRetrieval
Hugging Face | Updated 2024-10-03 | Indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/BASF-AI/CoconutRetrieval
Description:
This retrieval dataset pairs molecular formulas with their corresponding SMILES strings (both isomeric and canonical). The molecular formulas serve as queries and the SMILES strings as documents. The dataset is sourced from CoconutDB and is designed for retrieval tasks in which the goal is to find the correct SMILES string(s) for a given molecular formula. It supports chemical entity retrieval and molecular formula analysis, with applications in cheminformatics research, natural language processing (NLP), and the development of specialized information retrieval systems for chemistry.
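The query/document structure described above can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the IDs (`q1`, `d1`, ...), the `queries`/`corpus`/`qrels` layout, and the toy molecules are not taken from the dataset, and the actual file schema may differ. The sketch shows why one formula can have several correct SMILES documents (isomers share a formula) and how a ranking could be scored with recall@k.

```python
from typing import Dict, List, Set

# Toy data, NOT taken from the dataset: hypothetical IDs, well-known molecules.
queries: Dict[str, str] = {"q1": "C2H6O", "q2": "C6H6"}
corpus: Dict[str, str] = {
    "d1": "CCO",       # ethanol, C2H6O
    "d2": "COC",       # dimethyl ether, C2H6O (same formula as ethanol)
    "d3": "c1ccccc1",  # benzene, C6H6
    "d4": "CC(=O)O",   # acetic acid, C2H4O2 (distractor)
}
# Relevance judgments: which documents are correct for each query.
qrels: Dict[str, Set[str]] = {"q1": {"d1", "d2"}, "q2": {"d3"}}


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents found in the top-k of a ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def mean_recall_at_k(
    rankings: Dict[str, List[str]], qrels: Dict[str, Set[str]], k: int
) -> float:
    """Average recall@k over all judged queries (missing rankings score 0)."""
    scores = [recall_at_k(rankings.get(qid, []), rel, k) for qid, rel in qrels.items()]
    return sum(scores) / len(scores)


# Hypothetical output of some retrieval model, best match first.
rankings: Dict[str, List[str]] = {
    "q1": ["d1", "d4", "d2"],
    "q2": ["d3", "d1", "d2"],
}

if __name__ == "__main__":
    for k in (2, 3):
        print(f"recall@{k} = {mean_recall_at_k(rankings, qrels, k):.2f}")
```

Recall@k is only one possible metric here. It was chosen for the sketch because a single molecular formula may map to multiple valid SMILES strings, so a metric that counts all relevant documents fits the task better than one that rewards only the first hit.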
Provider:
BASF-AI



