MCIF

Name: MCIF
Creator: maas
Published: 2025-12-05 16:55:11
License: No description (the dataset card below lists CC-BY-4.0)

ModelScope community, updated 2025-12-05, indexed 2025-11-03

Download link:

https://modelscope.cn/datasets/FBK-MT/MCIF

Resource description:

<p align="center"> <img src="./mcif_logo.png" width="600"> </p>

### Dataset Description, Collection, and Source

MCIF (Multimodal Crosslingual Instruction Following) is a multilingual, human-annotated benchmark based on scientific talks. It is designed to evaluate instruction following in crosslingual, multimodal settings over both short- and long-form inputs. MCIF spans three core modalities (speech, vision, and text) and four diverse languages (English, German, Italian, and Chinese), enabling a comprehensive evaluation of MLLMs' abilities to interpret instructions across languages and combine them with multimodal contextual information.

### License

- CC-BY-4.0

### Dataset Sources

- **Repository:** [MCIF](https://github.com/hlt-mt/mcif)
- **Paper:** [MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks](https://arxiv.org/abs/2507.19634)

## Dataset Structure

### Data Config

This dataset contains **4 splits** organized by two dimensions, following the split naming convention `{track}_{prompt_type}`.

Track - Input duration:
* `long`: Full-length, unsegmented inputs
* `short`: Pre-segmented inputs

Prompt Type - Prompt variation:
* `fixed`: Standardized prompts across all examples
* `mixed`: Includes prompt variations

Please note that all splits share the same set of original input audio and video files. The splits are meant to facilitate testing various use cases.

### Dataset Fields

| **Field** | **Type** | **Description** |
|-----------------|------------|-----------------------------------------------|
| `id` | `string` | Unique identifier for the sample; it starts with `QA` (question answering), `SUM` (summarization), `ASR` (transcription), or `TRANS` (translation). |
| `audio` | `string` | In the `long` track: path to full talk-level audio. In the `short` track: path to pre-segmented audio. |
| `video` | `string` | In the `long` track: path to full talk-level video. In the `short` track: path to pre-segmented video. |
| `text` | `string` | Transcript of the input. Only present in the `long` track. |
| `prompt_{en, de, it, zh}` | `string` | Instruction in English, German, Italian, or Chinese. |
| `metadata` | `string` | Metadata for question answering samples, in the format {qa_type={`A` (audio), `V` (visual), `AV` (audio-visual), `NA` (not answerable)} qa_origin={`Transcript`, `Abstract`, `General`}} |

The audio/video paths are relative within this repo.

You can download the data by cloning this repo:

```
git clone https://huggingface.co/datasets/FBK-MT/MCIF
```

### References

The references are available in `MCIF.{short,long}.{en,de,it,zh}.ref.xml.gz` (navigate to the "Files and versions" tab or clone this repo).

### IWSLT 2025 Version

Part of MCIF was used in the [IWSLT 2025 instruction-following track](https://iwslt.org/2025/instruction-following).

This test data is available under the branch `IWSLT2025`. You can access it as follows, replacing the `{en,de,it,zh}_{long,short}` pattern with one language and track combination:

```
from datasets import load_dataset
dataset = load_dataset("FBK-MT/MCIF", "{en,de,it,zh}_{long,short}", revision="IWSLT2025")
```

## Evaluation

Please use the official evaluation scripts from the [MCIF GitHub Repo](https://github.com/hlt-mt/mcif). The references are also available there.

## Changelog

### Version 1.1

- Fixed the German summarization prompt
- Renamed files so that filenames no longer include the version name

## Citation

```
@misc{papi2025mcifmultimodalcrosslingualinstructionfollowing,
      title={MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks},
      author={Sara Papi and Maike Züfle and Marco Gaido and Beatrice Savoldi and Danni Liu and Ioannis Douros and Luisa Bentivogli and Jan Niehues},
      year={2025},
      eprint={2507.19634},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.19634},
}
```

## Dataset Card Contact

[@spapi](https://huggingface.co/spapi) and [@danniliu](https://huggingface.co/danniliu)
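
## Usage Sketch (Unofficial)

The split naming convention `{track}_{prompt_type}` and the `id` prefixes described under Dataset Structure can be illustrated with a small helper. This is a minimal sketch and not part of the official MCIF tooling: the function names `split_names` and `task_from_id` are hypothetical, and the sample ids used in the usage lines are made up for illustration, since the card only specifies how ids start.

```python
from itertools import product

# Values taken from the "Data Config" section of the card.
TRACKS = ("long", "short")
PROMPT_TYPES = ("fixed", "mixed")

# Task prefixes taken from the description of the `id` field.
TASK_BY_PREFIX = {
    "QA": "question answering",
    "SUM": "summarization",
    "ASR": "transcription",
    "TRANS": "translation",
}


def split_names():
    """Enumerate every split name of the form {track}_{prompt_type}."""
    return [f"{track}_{prompt}" for track, prompt in product(TRACKS, PROMPT_TYPES)]


def task_from_id(sample_id):
    """Map a sample id to its task using the documented id prefixes."""
    for prefix, task in TASK_BY_PREFIX.items():
        if sample_id.startswith(prefix):
            return task
    raise ValueError(f"Unrecognized id prefix: {sample_id!r}")


if __name__ == "__main__":
    print(split_names())
    # Hypothetical ids; only the prefix is taken from the card.
    print(task_from_id("QA_example"))
    print(task_from_id("TRANS_example"))
```

Enumerating the names this way yields two tracks times two prompt types, which matches the four splits the card mentions.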

Provider: maas

Created: 2025-10-22
