FCMBench-Data

Source: ModelScope (魔搭社区). Updated 2026-05-09; indexed 2026-05-10.
Download link: https://modelscope.cn/datasets/QFIN/FCMBench-Data
![](assets/FCMBench_logo.jpg)

**FCMBench** is a multimodal benchmark for credit-risk-oriented workflows. It aims to be a standard playground that promotes collaborative development between academia and industry, and it provides standardized datasets, prompts, and evaluation scripts across multiple tracks (image, video, speech, agents, etc.).

<p align="center">
💻 <a href="https://github.com/QFIN-tech/FCMBench/tree/main"><b>GitHub</b></a>&nbsp;&nbsp; | &nbsp;&nbsp;🤗 <a href="https://huggingface.co/datasets/QFIN/FCMBench-Data"><b>Hugging Face</b></a>&nbsp;&nbsp; | &nbsp;&nbsp;📑 <a href="https://arxiv.org/abs/2601.00150"><b>FCMBench Paper</b></a>&nbsp;&nbsp; | &nbsp;&nbsp;📑 <a href="https://arxiv.org/abs/2604.25186"><b>FCMBench-Video Paper</b></a>&nbsp;&nbsp; | &nbsp;&nbsp;🏆 <a href="https://qfin-tech.github.io/FCMBench"><b>Leaderboard</b></a>&nbsp;&nbsp; | &nbsp;&nbsp;🌐 <a href="./README_cn.md"><b>简体中文</b></a>
</p>

## 🔥 News

- 【**2026. 04. 29**】🎬 We released **FCMBench-Video**, a benchmark for document-video intelligence. It is built from 495 captured atomic videos, composed into 1,200 long-form videos, with 11,322 QA instances across 28 document types (bilingual CN/EN). Paper: [arXiv 2604.25186](https://arxiv.org/abs/2604.25186).
- 【**2026. 03. 16**】✨ We released **FCMBench-V1.1**. This version adds English document images and corresponding QA pairs, expands the covered document types to 26, and increases the dataset to 5,198 images and 13,806 QA samples.
- 【**2026. 01. 01**】We are proud to launch **FCMBench-V1.0**, which covers 18 core certificate types and includes 4,043 privacy-compliant images and 8,446 QA samples. It involves 3 types of Perception tasks and 4 types of Reasoning tasks, which are cross-referenced with 10 categories of robustness inferences. All the tasks and inferences are derived from real-world critical scenarios.

> **Status:** Public release (v1.1).<br>
> **Maintainers:** [奇富科技 / Qfin Holdings](https://github.com/QFIN-tech)<br>
> **Contact:** [yangyehuisw@126.com]

---

## Tracks Overview

| Entry | Inputs | Outputs | Evaluation Script | Leaderboard | Paper | Sample Data |
|---|---|---|---|---|---|---|
| [Vision-Language Track](vision_language) | document images + text prompts (JSONL, one sample per line) | text responses (JSONL, one sample per line) | [evaluation.py](vision_language/evaluation.py) | [Leaderboard](https://qfin-tech.github.io/FCMBench) | [arXiv 2601.00150](https://arxiv.org/abs/2601.00150) | [Examples](https://qfin-tech.github.io/FCMBench/Examples.html) |
| [Video Understanding Track](video_understanding) | document videos + text prompts (JSONL) | text responses (JSONL) | [benchmark_eval.py](video_understanding/benchmark_eval.py) | via [submission](video_understanding/README.md#leaderboard) | [arXiv 2604.25186](https://arxiv.org/abs/2604.25186) | see [README](video_understanding/README.md) |

---

### 1) Vision-Language Track (✅ Available)

Image-based financial document understanding.

#### Sample Data

Preview sample images and QA examples on the [Examples page](https://qfin-tech.github.io/FCMBench/Examples.html).

#### Reference Model Demo

We also provide access to an interactive demo of our Qfin-VL-Instruct model, which achieves strong performance on FCMBench. If you are interested in trying the Gradio demo, please contact [yangyehuisw@126.com] with the following information:

- Name
- Affiliation / Organization
- Intended use (e.g., research exploration, benchmarking reference)
- Contact email

Access will be granted on a case-by-case basis.

---

### 2) Video Understanding Track (🎬 Available)

A document-video intelligence benchmark covering document perception, temporal grounding, and evidence-grounded reasoning under realistic handheld capture conditions. It is built from 495 captured atomic videos, composed into 1,200 long-form videos (20s/40s/60s duration tiers), with 11,322 expert-annotated QA instances across 28 document types in bilingual Chinese/English settings. See the [paper](https://arxiv.org/abs/2604.25186) for full benchmark details and evaluation results on nine Video-MLLMs.

#### Sample Data

Please refer to the [Video Understanding track README](video_understanding/README.md) for the full data composition, instruction file descriptions, and quickstart guide. A stratified 10% subset with ground truth (`FCMBench-Video_v1.0_small.jsonl`) is available for self-evaluation.

#### Reference Model Demo

*(TBD)*

---

### 3) Speech Understanding & Generation Track (🕒 Coming Soon)

### 4) Multi-step / Agentic Track (🕒 Coming Soon)

## Citation

**FCMBench (Vision-Language Track):**

```
@misc{yang2026fcmbenchlargescalefinancialcredit,
  title={FCMBench: The First Large-scale Financial Credit Multimodal Benchmark for Real-world Applications},
  author={Yehui Yang and Dalu Yang and Fangxin Shang and Wenshuo Zhou and Jie Ren and Yifan Liu and Haojun Fei and Qing Yang and Yanwu Xu and Tao Chen},
  year={2026},
  eprint={2601.00150},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2601.00150},
}
```

**FCMBench-Video (Video Understanding Track):**

```
@misc{cui2026fcmbenchvideobenchmarkingdocumentvideo,
  title={FCMBench-Video: Benchmarking Document Video Intelligence},
  author={Runze Cui and Fangxin Shang and Yehui Yang and Qing Yang and Tao Chen},
  year={2026},
  eprint={2604.25186},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2604.25186},
}
```

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=QFIN-tech/FCMBench&type=date&legend=top-left)](https://www.star-history.com/#QFIN-tech/FCMBench&type=date&legend=top-left)
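
The Tracks Overview table describes both tracks' inputs and outputs as JSONL files with one sample per line. As a minimal sketch of that file convention only, the helpers below read and write such files using the Python standard library. The function names `read_jsonl` and `write_jsonl` are hypothetical and not part of the FCMBench repository, and no field names are assumed here; the exact schema expected by the evaluation scripts should be taken from the track READMEs.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, List


def read_jsonl(path: Path) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with path.open("r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the file and line so a malformed sample is easy to find.
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc


def write_jsonl(path: Path, records: List[Dict[str, Any]]) -> None:
    """Write each record as a single JSON line (hypothetical helper)."""
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            # ensure_ascii=False keeps Chinese text readable in the output file.
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Writing with `ensure_ascii=False` is a design choice for the bilingual CN/EN data: the file stays human-readable, and `json.loads` parses it identically either way.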

Provider: maas
Created: 2026-03-16