bing-yan/askchem
Source: Hugging Face. Updated 2026-04-24. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/bing-yan/askchem
Description:
AskChem is a structured, hierarchical, multi-view knowledge index for chemistry research. This is the full version, covering both abstracts and full text. Each entry is an atomic knowledge claim extracted from either a paper abstract (using gpt-5-mini) or a full-text PDF (using Gemini 3.1 Pro via Vertex AI Batch) and classified into 7 simultaneous hierarchical views. The dataset contains 2,337,403 claims from 140,913 source papers. There are 13 claim types: reaction, property, method, mechanism, comparison, computational_result, limitation, hypothesis, surprising_finding, scope_entry, future_direction, experimental_design, and structure. The 7 views are by_reaction_type, by_substance_class, by_application, by_technique, by_mechanism, by_claim_type, and by_time_period. Dataset files include a SQLite database and JSONL files, among others.
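As a rough illustration of working with the JSONL files, the sketch below tallies claims per claim type. The key name `claim_type`, the `text` field, and the sample records are assumptions made for illustration and are not taken from the dataset's documented schema; only the 13 claim type names come from the description above.

```python
import json
from collections import Counter

# The 13 claim types named in the dataset description.
CLAIM_TYPES = {
    "reaction", "property", "method", "mechanism", "comparison",
    "computational_result", "limitation", "hypothesis",
    "surprising_finding", "scope_entry", "future_direction",
    "experimental_design", "structure",
}


def count_claim_types(lines, type_field="claim_type"):
    """Tally records per claim type from an iterable of JSONL lines.

    `type_field` is an assumed key name; check the real schema first.
    Records with a missing or unrecognized type are counted as "unknown".
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        record = json.loads(line)
        claim_type = record.get(type_field)
        if claim_type not in CLAIM_TYPES:
            claim_type = "unknown"
        counts[claim_type] += 1
    return counts


# Made-up placeholder records, not real dataset entries.
sample = [
    '{"claim_type": "reaction", "text": "placeholder claim A"}',
    '{"claim_type": "property", "text": "placeholder claim B"}',
    '{"claim_type": "reaction", "text": "placeholder claim C"}',
    '{"text": "placeholder claim with no type"}',
]

counts = count_claim_types(sample)
```

For a real file, an open text file handle can be passed in place of `sample`, since the function accepts any iterable of lines.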
Provider:
bing-yan



