
MatMech: A Multimodal Dataset for Exploring Causal Mechanisms in Materials Science Literature

Source: Figshare. Updated 2025-12-11; indexed 2026-04-08.
Download link:
https://figshare.com/articles/dataset/MatMech_A_Multimodal_Dataset_for_Exploring_Casual_Mechanisms_in_Materials_Science_Literature/29815979/5
Description:

Overview

The MatMech Dataset is a large-scale collection of 61,200+ materials science papers, each represented by a structured JSON file and its associated figure images. The JSON captures more than the paper's text: it maps the scientific findings onto the Materials Science Tetrahedron (Processing --> Structure --> Properties --> Performance). The dataset is provided both as the full raw collection (compressed) and as a set of sample cases for quick exploration.

Data Composition

The dataset contains:

• matmech.zip: Full dataset (~19 GB) containing all papers, organized by DOI folders. Each folder includes:
    • paper.json: parsed paper metadata and text content
    • images/: figures extracted from the PDF
• case/: A curated set of small example cases for quick testing and inspection
• dataset_summary.json: Machine-readable dataset metadata
• README.md: Human-readable dataset description and usage instructions

The directory structure is as follows:

matshare_dataset/
│
├── README.md
├── matshare_full.zip
├── case/
│   ├── doi_001/
│   ├── doi_002/
│   └── ...
└── data_summary.json

Intended Use Cases

This dataset supports a wide range of materials informatics and scientific text mining tasks, including:

• Materials mechanism extraction
• Structure–property reasoning and knowledge graph construction
• Multimodal learning from scientific images
• LLM evaluation on scientific domain understanding
• Automatic captioning and figure–text alignment
• Causal chain prediction and reasoning
• Research trend analysis based on materials categories

For more information, please refer to README.md.
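The per-paper layout described above (one folder per DOI holding a paper.json and an images/ directory) can be walked with a few lines of Python. The following is a minimal sketch, not an official loader: `iter_papers` is a hypothetical helper name, the root directory is whatever location the archive or the case/ samples were extracted to, and the internal schema of paper.json is not documented on this page, so the code treats it as opaque JSON.

```python
import json
from pathlib import Path


def iter_papers(root):
    """Yield (folder_name, paper, image_paths) for each paper folder under root.

    Assumes each paper folder holds a paper.json and, optionally, an
    images/ directory, as in the description. Anything else is skipped.
    """
    for folder in sorted(Path(root).iterdir()):
        paper_file = folder / "paper.json"
        if not folder.is_dir() or not paper_file.is_file():
            continue  # skips README.md, summary files, incomplete folders
        with paper_file.open(encoding="utf-8") as fh:
            paper = json.load(fh)  # schema unknown here; kept as parsed JSON
        images_dir = folder / "images"
        if images_dir.is_dir():
            images = sorted(p for p in images_dir.iterdir() if p.is_file())
        else:
            images = []
        yield folder.name, paper, images
```

For a first look at the sample cases, calling `iter_papers("case")` and printing each folder name together with the number of images is usually enough to confirm the layout before unpacking the full archive. The README.md shipped with the dataset remains the authoritative reference for the actual field names inside paper.json.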
Provider:
Liu, Yinpeng
Created:
2025-12-11