
Manually curated dataset for evaluation and complete HBDB snapshot for relationship network

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link: https://zenodo.org/record/14958490
Description:
The hbdb2.sql file is a snapshot of the Human Breathomics Database (HBDB). In this snapshot, all literature information was collected and organized in a MySQL database, along with all recognized biomedical terms and sentences. We proposed a novel workflow and conducted relationship scoring using this snapshot with Large Language Models.

The full text of some literature in HBDB was retrieved using Elsevier's Text and Data Mining (TDM) service. Redistribution is limited to 200 characters due to the terms of use of the TDM service; therefore, this file is made available for review only.

The 'eval_dataset' is a manually curated dataset that includes various terms associated with four chemicals. The folder structure is organized into four layers as follows:

1. Term A: The target volatile organic compound (VOC), such as acetone.
2. Category of Term B: This could be a classification like chemical or molecular function.
3. Reference ID in HBDB: Please refer to the HBDB snapshot for the URL (DOI or URL) and PubMed ID. For example, Reference ID 15878 corresponds to PubMed ID 21871718 or this link.
4. JSON file attributes:
   - term_A: The target VOC.
   - term_B: Related term.
   - context: Sentences.
   - category: Category of term B, matching the second layer.
   - score: Relationship score.
   - verified: Indicates manual curation, done twice.
   - table: The corresponding table in the HBDB database snapshot.
   - compound_id: Compound ID in HBDB (e.g., 28 for acetone).
   - reference_id: Reference ID, corresponding to the third layer.
   - paragraph: Section containing the extracted sentences.

Notice from Elsevier TDM service: Some rights reserved. This work permits non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited.
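The four-layer folder structure and the JSON attributes listed in the description could be read with a short script along the following lines. This is only a sketch under stated assumptions: the description does not give file names or say whether each JSON file holds a single object or a list, so the path pattern `<Term A>/<Category of Term B>/<Reference ID>/<file>.json`, the helper names `iter_records` and `missing_keys`, and the handling of both shapes are all hypothetical.

```python
import json
from pathlib import Path

# Attribute names as listed in the dataset description.
EXPECTED_KEYS = {
    "term_A", "term_B", "context", "category", "score",
    "verified", "table", "compound_id", "reference_id", "paragraph",
}


def iter_records(root):
    """Yield (term_a, category, reference_id, record) for every JSON file.

    Assumption: files live at root/<Term A>/<Category>/<Reference ID>/*.json
    and each file contains either one JSON object or a list of objects.
    """
    root = Path(root)
    for path in sorted(root.glob("*/*/*/*.json")):
        # The first three path components mirror the first three layers.
        term_a, category, reference_id = path.relative_to(root).parts[:3]
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        records = data if isinstance(data, list) else [data]
        for record in records:
            yield term_a, category, reference_id, record


def missing_keys(record):
    """Return the documented attributes that are absent from a record."""
    return sorted(EXPECTED_KEYS - set(record))
```

A possible use would be to call `iter_records("eval_dataset")` and compare each record's `category` and `reference_id` fields with the folder names they were found under, since the description states that these attributes match the second and third layers.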
Created: 2025-03-03