MOF-ChemUnity: Literature-Informed Large Language Models for Metal–Organic Framework Research
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/MOF-ChemUnity_Literature-Informed_Large_Language_Models_for_Metal_Organic_Framework_Research/30583732
Description:
Artificial intelligence (AI) is transforming research in metal–organic
frameworks (MOFs), where models trained on structured computational
data routinely predict new materials and optimize their properties.
This raises a central question: What if we could leverage the full
breadth of MOF knowledge, not just structured data sets, but also
the scientific literature? For researchers, the literature remains
the primary source of knowledge, yet much of its content, including
experimental data and expert insight, remains underutilized by AI
systems. We introduce MOF-ChemUnity, a structured, extensible, and
scalable knowledge graph that unifies MOF data by linking literature-derived
insights to crystal structures and computational data sets. By disambiguating
MOF names in the literature and connecting them to crystal structures
in the Cambridge Structural Database, MOF-ChemUnity unifies experimental
and computational sources and enables cross-document knowledge extraction
and linking. We showcase how this enables multiproperty machine learning
across simulated and experimental data, compilation of complete synthesis
records for individual compounds by aggregating information across
multiple publications, and expert-guided materials recommendations
via structure-based machine learning descriptors for pore geometry
and chemistry. When used as a knowledge source to augment large language
models (LLMs), MOF-ChemUnity enables a literature-informed AI assistant
that operates over the full scope of MOF knowledge. Expert evaluations
show improved accuracy, interpretability, and trustworthiness across
tasks such as retrieval, inference of structure–property relationships,
and materials recommendation, outperforming standard LLMs. This work
lays the foundation for literature-informed materials discovery, enabling
both scientists and AI systems to reason over the full body of existing
knowledge in a new way.
Created:
2025-11-10



