InnovatorLab/EMVista

Name: InnovatorLab/EMVista
Creator: InnovatorLab
Published: 2026-02-06 06:20:23
License: MIT

Source: Hugging Face. Updated 2026-02-06; indexed 2026-04-05.

Download link:

https://hf-mirror.com/datasets/InnovatorLab/EMVista

Resource description:

---
license: mit
task_categories:
- visual-question-answering
language:
- en
tags:
- multimodal
pretty_name: EMVista
size_categories:
- 1K<n<10K
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test.parquet
---

# EMVista Dataset

<center><h1>EMVista</h1></center>

<img src="./assets/pipeline.png" alt="EMVista" style="display: block; margin: auto; max-width: 70%;">

| <a href="https://huggingface.co/datasets/EMVista/EMVista">HuggingFace</a> | <a href="https://huggingface.co/papers/2601.19325">Paper</a> | <a href="https://github.com/InnovatorLM/Innovator-VL">Code</a> |

---

## 🔥 Latest News

- **[2026/01]** EMVista v1.0 is officially released.

## Overview

**EMVista** is a benchmark for evaluating **instance-level microstructural understanding** in electron microscopy (EM) images across **three core capability dimensions**:

1. **Microstructural Perception**
   Evaluates the ability to detect, delineate, and separate individual microstructural instances in complex EM scenes.

2. **Microstructural Attribute Understanding**
   Measures the capacity to interpret key microstructural attributes, including morphology, density, spatial distribution, layering, and scale variation.

3. **Robustness in Dense Scenes**
   Assesses model stability and accuracy under extreme instance crowding, overlap, and multi-scale complexity.

EMVista contains **expert-annotated EM images** with instance-level labels and structured attribute descriptions, designed to reflect **realistic challenges** in materials microstructure analysis.
---

## Dataset Characteristics

- **Task Format**: Visual Question Answering (VQA)
- **Modalities**: Image + Text
- **Languages**: English
- **Annotation**: Expert-verified

---

### Download EMVista Dataset

You can download the EMVista dataset with the HuggingFace `datasets` library (make sure you have installed [HuggingFace Datasets](https://huggingface.co/docs/datasets/quickstart)):

```python
from datasets import load_dataset

# Downloads the dataset from the Hugging Face Hub and caches it locally.
dataset = load_dataset("InnovatorLab/EMVista")
```

## Evaluations

We use [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval) for evaluation. See [here](./evaluation/README.md) for the detailed evaluation files.

## License

EMVista is released under the MIT License. See [LICENSE](./LICENSE) for more details.

## Citation

```bibtex
@article{wen2026innovator,
  title={Innovator-VL: A Multimodal Large Language Model for Scientific Discovery},
  author={Wen, Zichen and Yang, Boxue and Chen, Shuang and Zhang, Yaojie and Han, Yuhang and Ke, Junlong and Wang, Cong and others},
  journal={arXiv preprint arXiv:2601.19325},
  year={2026}
}
```
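As a complement to the download snippet in the dataset card, the following is a minimal sketch of loading only the `test` split (the single split declared in the card's `configs` block) and inspecting what it contains. The card does not document the column names, so the sketch reports them rather than assuming any schema. The helper names `load_emvista_test` and `describe` are illustrative, not part of the dataset or of any library.

```python
from typing import Any

REPO_ID = "InnovatorLab/EMVista"  # dataset id taken from the card's download snippet
SPLIT = "test"                    # the only split declared in the card's `configs` block


def load_emvista_test() -> Any:
    """Load the test split; requires the `datasets` package and network access."""
    # Imported lazily so that defining this helper does not require the package.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=SPLIT)


def describe(ds: Any) -> str:
    """Summarize row count and column names without assuming a schema."""
    return f"{ds.num_rows} rows; columns: {', '.join(ds.column_names)}"
```

Calling `print(describe(load_emvista_test()))` should then show the actual size and fields of the split, which is a reasonable first check before wiring the data into an evaluation harness.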

Provided by:

InnovatorLab
