FCMBench-Data

Name: FCMBench-Data
Creator: maas
Published: 2026-05-09 10:41:18
License: No description available

ModelScope community. Updated 2026-05-09; indexed 2026-05-10.

Download link:

https://modelscope.cn/datasets/QFIN/FCMBench-Data

Resource description:

![](assets/FCMBench_logo.jpg)

**FCMBench** is a multimodal benchmark for credit-risk-oriented workflows. It aims to be a standard playground that promotes collaboration between academia and industry, and it provides standardized datasets, prompts, and evaluation scripts across multiple tracks (image, video, speech, agents, etc.).

💻 <a href="https://github.com/QFIN-tech/FCMBench/tree/main">GitHub</a>   |   🤗 <a href="https://huggingface.co/datasets/QFIN/FCMBench-Data">Hugging Face</a>   |   📑 <a href="https://arxiv.org/abs/2601.00150">FCMBench Paper</a>   |   📑 <a href="https://arxiv.org/abs/2604.25186">FCMBench-Video Paper</a>   |   🏆 <a href="https://qfin-tech.github.io/FCMBench">Leaderboard</a>   |   🌐 <a href="./README_cn.md">简体中文</a>

## 🔥 News

- 【**2026. 04. 29**】🎬 We released **FCMBench-Video**, a benchmark for document-video intelligence. It is built from 495 captured atomic videos composed into 1,200 long-form videos, with 11,322 QA instances across 28 document types (bilingual CN/EN). Paper: [arXiv 2604.25186](https://arxiv.org/abs/2604.25186).
- 【**2026. 03. 16**】✨ We released **FCMBench-V1.1**. This version adds English document images and corresponding QA pairs, expands the covered document types to 26, and increases the dataset to 5,198 images and 13,806 QA samples.
- 【**2026. 01. 01**】We are proud to launch **FCMBench-V1.0**, which covers 18 core certificate types and includes 4,043 privacy-compliant images and 8,446 QA samples. It involves 3 types of Perception tasks and 4 types of Reasoning tasks, which are cross-referenced with 10 categories of robustness inferences. All the tasks and inferences are derived from real-world critical scenarios.

> **Status:** Public release (v1.1).
>
> **Maintainers:** [奇富科技 / Qfin Holdings](https://github.com/QFIN-tech)
>
> **Contact:** [yangyehuisw@126.com]

---

## Tracks Overview

| Entry | Inputs | Outputs | Evaluation Script | Leaderboard | Paper | Sample Data |
|---|---|---|---|---|---|---|
| [Vision-Language Track](vision_language) | document images + text prompts (JSONL, one sample per line) | text responses (JSONL, one sample per line) | [evaluation.py](vision_language/evaluation.py) | [Leaderboard](https://qfin-tech.github.io/FCMBench) | [arXiv 2601.00150](https://arxiv.org/abs/2601.00150) | [Examples](https://qfin-tech.github.io/FCMBench/Examples.html) |
| [Video Understanding Track](video_understanding) | document videos + text prompts (JSONL) | text responses (JSONL) | [benchmark_eval.py](video_understanding/benchmark_eval.py) | via [submission](video_understanding/README.md#leaderboard) | [arXiv 2604.25186](https://arxiv.org/abs/2604.25186) | see [README](video_understanding/README.md) |

---

### 1) Vision-Language Track (✅ Available)

Image-based financial document understanding.

#### Sample Data

Preview sample images and QA examples on the [Examples page](https://qfin-tech.github.io/FCMBench/Examples.html).

#### Reference Model Demo

We also provide access to an interactive demo of our Qfin-VL-Instruct model, which achieves strong performance on FCMBench. If you are interested in trying the Gradio demo, please contact [yangyehuisw@126.com] with the following information:

- Name
- Affiliation / Organization
- Intended use (e.g., research exploration, benchmarking reference)
- Contact email

Access will be granted on a case-by-case basis.

---

### 2) Video Understanding Track (🎬 Available)

A document-video intelligence benchmark covering document perception, temporal grounding, and evidence-grounded reasoning under realistic handheld capture conditions. It is built from 495 captured atomic videos composed into 1,200 long-form videos (20s/40s/60s duration tiers), with 11,322 expert-annotated QA instances across 28 document types in bilingual Chinese/English settings. See the [paper](https://arxiv.org/abs/2604.25186) for full benchmark details and evaluation results on nine Video-MLLMs.

#### Sample Data

Please refer to the [Video Understanding track README](video_understanding/README.md) for the full data composition, instruction file descriptions, and quickstart guide. A stratified 10% subset with ground truth (`FCMBench-Video_v1.0_small.jsonl`) is available for self-evaluation.

#### Reference Model Demo

*(TBD)*

---

### 3) Speech Understanding & Generation Track (🕒 Coming Soon)

### 4) Multi-step / Agentic Track (🕒 Coming Soon)

## Citation

**FCMBench (Vision-Language Track):**

```
@misc{yang2026fcmbenchlargescalefinancialcredit,
  title={FCMBench: The First Large-scale Financial Credit Multimodal Benchmark for Real-world Applications},
  author={Yehui Yang and Dalu Yang and Fangxin Shang and Wenshuo Zhou and Jie Ren and Yifan Liu and Haojun Fei and Qing Yang and Yanwu Xu and Tao Chen},
  year={2026},
  eprint={2601.00150},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2601.00150},
}
```

**FCMBench-Video (Video Understanding Track):**

```
@misc{cui2026fcmbenchvideobenchmarkingdocumentvideo,
  title={FCMBench-Video: Benchmarking Document Video Intelligence},
  author={Runze Cui and Fangxin Shang and Yehui Yang and Qing Yang and Tao Chen},
  year={2026},
  eprint={2604.25186},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2604.25186},
}
```

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=QFIN-tech/FCMBench&type=date&legend=top-left)](https://www.star-history.com/#QFIN-tech/FCMBench&type=date&legend=top-left)
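## Appendix: JSONL I/O sketch

The tracks overview describes both inputs and outputs as JSONL with one sample per line. The following is a minimal sketch of that read/answer/write loop using only the Python standard library. The field names `id`, `prompt`, and `response`, and the helper names `read_jsonl` and `write_responses`, are assumptions made for illustration; the actual schema is defined by each track's README and evaluation script and may differ.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, Iterator


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one JSON object per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines between samples
                yield json.loads(line)


def write_responses(
    samples: Iterable[dict],
    answer_fn: Callable[[dict], str],
    out_path: Path,
) -> int:
    """Write one JSON response per line and return the number of lines written.

    The "id" and "response" keys are hypothetical, not the benchmark's schema.
    """
    count = 0
    with out_path.open("w", encoding="utf-8") as f:
        for sample in samples:
            record = {"id": sample["id"], "response": answer_fn(sample)}
            # ensure_ascii=False keeps Chinese text readable in the output file
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

In use, `answer_fn` would wrap a call to the model under evaluation, and the resulting file would be passed to the track's evaluation script after its keys are adapted to the documented format.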

Provider: maas

Created: 2026-03-16
