MatCha
ModelScope community. Updated: 2025-12-05. Indexed: 2025-09-27.
Download link:
https://modelscope.cn/datasets/FreedomIntelligence/MatCha
Resource overview:
## Dataset Description
Materials characterization plays a key role in understanding the processing–microstructure–property relationships that guide material design and optimization. While multimodal large language models (MLLMs) have shown promise in generative and predictive tasks, their ability to interpret **real-world characterization imaging data** remains underexplored.
MatCha is the first benchmark designed specifically for materials characterization image understanding. It provides a comprehensive evaluation framework that reflects real challenges faced by materials scientists.
## Dataset Features
- **1,500 expert-level questions** focused on materials characterization.
- Covers **21 distinct tasks** spanning 4 stages of materials research.
- Tasks designed to mimic **real-world scientific challenges**.
- Provides **the first systematic evaluation** of MLLMs on materials characterization.
## Dataset Structure
Each **MatCha** record contains the following fields; an example record is shown after the list:
- `id`: Question ID.
- `vqa`: List of visual question answering items.
  - `question`: Question text, including the options.
  - `answer`: Correct answer choice (a single letter).
  - `options`: Answer choices, keyed by letter.
  - `topic`: Sub-task label.
- `images`: List of image information. The image files are provided in `images.zip`.
  - `classification`: Category of the image.
  - `image_path`: Path of the image.
  - `geometry`: Corner points of the bounding box of the region in the image.
- `article_info`: Metadata of the article the image comes from (if applicable).
  - `article_name`: Identification code of the article.
  - `title`: Title of the article.
  - `authors`: Authors of the article.
  - `article_url`: Link to the article.
  - `license`: License of the article.
```json
{
  "id": "0-0-ncomms9157_fig2.jpg",
  "vqa": [
    {
      "question": "What does the red circle in the 230 \u00b0C frame indicate regarding the nanorods' crystallization? (A) The maximum diffraction intensity (B) Onset of the first diffraction spot (C) Completion of crystallization (D) Absence of any crystallization",
      "answer": "B",
      "options": {
        "A": "The maximum diffraction intensity",
        "B": "Onset of the first diffraction spot",
        "C": "Completion of crystallization",
        "D": "Absence of any crystallization"
      },
      "topic": "Physical and Chemical Properties Inference"
    }
  ],
  "images": [
    {
      "classification": "microscopy",
      "image_path": "ncomms9157_fig2.jpg",
      "geometry": [
        {"x": 43, "y": 133},
        {"x": 43, "y": 250},
        {"x": 591, "y": 133},
        {"x": 591, "y": 250}
      ]
    }
  ],
  "article_info": {
    "article_name": "ncomms9157",
    "title": "Nanoscale size effects in crystallization of metallic glass nanorods | Nature Communications",
    "authors": "Sungwoo Sohn, Yeonwoong Jung, Yujun Xie, Chinedum Osuji, Jan Schroers &, Judy J. Cha",
    "article_url": "https://www.nature.com/articles/ncomms9157",
    "license": "http://creativecommons.org/licenses/by/4.0/"
  }
}
```
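As a rough illustration of how a record with this structure might be consumed, the sketch below parses an abbreviated copy of the example, collapses the four `geometry` corner points into one box, and checks a predicted option letter against `answer`. The helper names `bounding_box` and `is_correct` are hypothetical and not part of any MatCha tooling. Treating the points as pixel coordinates with the origin at the top-left of the image is also an assumption; the dataset card does not state the coordinate convention.

```python
import json

# Abbreviated copy of the example record (only the fields used below).
RECORD_JSON = """
{
  "id": "0-0-ncomms9157_fig2.jpg",
  "vqa": [
    {
      "answer": "B",
      "options": {
        "A": "The maximum diffraction intensity",
        "B": "Onset of the first diffraction spot",
        "C": "Completion of crystallization",
        "D": "Absence of any crystallization"
      },
      "topic": "Physical and Chemical Properties Inference"
    }
  ],
  "images": [
    {
      "classification": "microscopy",
      "image_path": "ncomms9157_fig2.jpg",
      "geometry": [
        {"x": 43, "y": 133},
        {"x": 43, "y": 250},
        {"x": 591, "y": 133},
        {"x": 591, "y": 250}
      ]
    }
  ]
}
"""


def bounding_box(geometry):
    """Collapse corner points into (left, top, right, bottom).

    Assumes pixel coordinates with y increasing downward (hypothetical helper).
    Using min/max avoids relying on the order of the corner points.
    """
    xs = [point["x"] for point in geometry]
    ys = [point["y"] for point in geometry]
    return min(xs), min(ys), max(xs), max(ys)


def is_correct(vqa_item, predicted_letter):
    """Compare a predicted option letter with the reference answer."""
    letter = predicted_letter.strip().upper()
    if letter not in vqa_item["options"]:
        raise ValueError(f"unknown option letter: {predicted_letter!r}")
    return letter == vqa_item["answer"]


record = json.loads(RECORD_JSON)
box = bounding_box(record["images"][0]["geometry"])
item = record["vqa"][0]

print(box)                                   # region of the image
print(item["options"][item["answer"]])       # text of the correct option
print(is_correct(item, "b"), is_correct(item, "C"))
```

Averaging `is_correct` over all `vqa` items would give a simple accuracy figure, and grouping by `topic` would give per-task scores. How a model's free-form output is mapped to a single letter is left open here.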
## Citation
If you find our work helpful, please use the following citation.
```bibtex
@misc{lai2025matcha,
title={Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization},
author={Zhengzhao Lai and Youbin Zheng and Zhenyang Cai and Haonan Lyu and Jinpu Yang and Hongqing Liang and Yan Hu and Benyou Wang},
year={2025},
eprint={2509.09307},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2509.09307},
}
```
Provider:
maas
Created:
2025-09-20



