Amian/FinLongDocQA

Name: Amian/FinLongDocQA
Creator: Amian
Published: 2026-03-25 05:42:25
License: No description provided

Source: Hugging Face. Updated 2026-03-25; indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/Amian/FinLongDocQA


Resource description:

---
license: other
task_categories:
- question-answering
language:
- en
tags:
- financial
- numerical-reasoning
- long-document
- table-qa
- multi-table
- annual-reports
pretty_name: FinLongDocQA
size_categories:
- 1K<n<10K
configs:
- config_name: default
  data_files:
  - split: test
    path: dataset_qa.jsonl
---

# FinLongDocQA

**Numerical Reasoning across Multiple Tables for Document-Level Financial Question Answering**

[![Dataset on HuggingFace](https://img.shields.io/badge/HuggingFace-FinLongDocQA-yellow?logo=huggingface)](https://huggingface.co/datasets/Amian/FinLongDocQA)

## Dataset Description

![An example QA instance from FinLongDocQA](assets/example.jpg)

*An example QA instance from FinLongDocQA. The figure shows only the relevant tables and text for presentation; in practice, the model must retrieve them from the full annual report before computing the answer.*

FinLongDocQA is a benchmark for financial numerical reasoning over long, structured annual reports. It covers both **single-table** and **cross-table** settings where answering a question requires integrating evidence scattered across multiple tables and narrative text.

Financial annual reports commonly exceed 129k tokens, making it challenging for LLMs to (1) locate the relevant tables (*context rot*) and (2) perform accurate multi-step arithmetic once the evidence is found. FinLongDocQA is designed to stress-test both capabilities.

### Dataset Summary

| Field | Value |
|---|---|
| Examples | 7,527 |
| Companies | 489 |
| Fiscal years | 2022, 2023, 2024 |
| Question types | `mixed` (5,951), `table` (1,319), `text` (257) |

### Question Types

| Type | Description |
|---|---|
| `table` | Evidence comes entirely from one or more financial tables |
| `text` | Evidence comes entirely from narrative text |
| `mixed` | Evidence spans both tables and narrative text |

## Dataset Structure

Each record in `dataset_qa.jsonl` contains:

```json
{
  "id": "1",
  "company": "A",
  "year": "2022",
  "question": "On average, how many manufacturing facilities does each business segment have?",
  "type": "mixed",
  "thoughts": "Thought: Page 4 cites 3 segments. Page 11 lists 4 U.S. and 4 non-U.S. manufacturing facilities = 8 total. Average = 8/3.",
  "page_numbers": [4, 11],
  "python_code": "total_facilities=8\nsegments=3\navg=total_facilities/segments\nround(avg,2)",
  "answer": 2.67
}
```

### Fields

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique example identifier |
| `company` | string | Anonymized company ticker |
| `year` | string | Fiscal year of the annual report |
| `question` | string | Natural-language financial question |
| `type` | string | Question type: `table`, `text`, or `mixed` |
| `thoughts` | string | Chain-of-thought reasoning trace with page references |
| `page_numbers` | list[int] | Pages in the annual report that contain the relevant evidence |
| `python_code` | string | Executable Python snippet that computes the answer |
| `answer` | float | Ground-truth numerical answer |

## Usage

```python
from datasets import load_dataset

ds = load_dataset("Amian/FinLongDocQA")
print(ds["test"][0])
```

## License

This dataset is released under the **AI²Lab Source Code License (National Taiwan University)**. See the full license [here](LICENSE).
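
Because each record pairs a `python_code` snippet with a numeric `answer`, the two can be cross-checked. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `run_snippet` and `matches` are hypothetical, and it assumes every snippet ends in a bare expression, as the example record does. It runs arbitrary Python, so it should only be used on data you trust.

```python
import ast
import math


def run_snippet(code: str):
    """Run a snippet and return the value of its final expression.

    Assumption: the snippet ends in a bare expression (as in the
    example record). Hypothetical helper, not shipped with the dataset.
    """
    tree = ast.parse(code)
    if not tree.body or not isinstance(tree.body[-1], ast.Expr):
        raise ValueError("snippet does not end in an expression")
    last = tree.body.pop()  # split off the final expression
    namespace = {}
    # Execute every statement before the final expression.
    exec(compile(tree, "<snippet>", "exec"), namespace)
    # Evaluate the final expression in the same namespace.
    return eval(compile(ast.Expression(last.value), "<snippet>", "eval"), namespace)


def matches(record: dict, tol: float = 1e-6) -> bool:
    """Check that the snippet's result agrees with the stored answer."""
    result = run_snippet(record["python_code"])
    return math.isclose(result, record["answer"], abs_tol=tol)


# The example record from the Dataset Structure section (fields trimmed).
record = {
    "python_code": "total_facilities=8\nsegments=3\navg=total_facilities/segments\nround(avg,2)",
    "answer": 2.67,
}
print(matches(record))  # True
```

The same check could be applied to each row of `ds["test"]` after loading; the absolute tolerance is an arbitrary choice for illustration.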

