
Amian/FinLongDocQA

Hugging Face · Updated 2026-03-25 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/Amian/FinLongDocQA
Resource description:
---
license: other
task_categories:
- question-answering
language:
- en
tags:
- financial
- numerical-reasoning
- long-document
- table-qa
- multi-table
- annual-reports
pretty_name: FinLongDocQA
size_categories:
- 1K<n<10K
configs:
- config_name: default
  data_files:
  - split: test
    path: dataset_qa.jsonl
---

# FinLongDocQA

**Numerical Reasoning across Multiple Tables for Document-Level Financial Question Answering**

[![Dataset on HuggingFace](https://img.shields.io/badge/HuggingFace-FinLongDocQA-yellow?logo=huggingface)](https://huggingface.co/datasets/Amian/FinLongDocQA)

## Dataset Description

![An example QA instance from FinLongDocQA](assets/example.jpg)

*An example QA instance from FinLongDocQA. The figure shows only the relevant tables and text for presentation; in practice, the model must retrieve them from the full annual report before computing the answer.*

FinLongDocQA is a benchmark for financial numerical reasoning over long, structured annual reports. It covers both **single-table** and **cross-table** settings where answering a question requires integrating evidence scattered across multiple tables and narrative text.

Financial annual reports commonly exceed 129k tokens, making it challenging for LLMs to (1) locate the relevant tables (*context rot*) and (2) perform accurate multi-step arithmetic once the evidence is found. FinLongDocQA is designed to stress-test both capabilities.

### Dataset Summary

| Field | Value |
|---|---|
| Examples | 7,527 |
| Companies | 489 |
| Fiscal years | 2022, 2023, 2024 |
| Question types | `mixed` (5,951), `table` (1,319), `text` (257) |

### Question Types

| Type | Description |
|---|---|
| `table` | Evidence comes entirely from one or more financial tables |
| `text` | Evidence comes entirely from narrative text |
| `mixed` | Evidence spans both tables and narrative text |

## Dataset Structure

Each record in `dataset_qa.jsonl` contains:

```json
{
  "id": "1",
  "company": "A",
  "year": "2022",
  "question": "On average, how many manufacturing facilities does each business segment have?",
  "type": "mixed",
  "thoughts": "Thought: Page 4 cites 3 segments. Page 11 lists 4 U.S. and 4 non-U.S. manufacturing facilities = 8 total. Average = 8/3.",
  "page_numbers": [4, 11],
  "python_code": "total_facilities=8\nsegments=3\navg=total_facilities/segments\nround(avg,2)",
  "answer": 2.67
}
```

### Fields

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique example identifier |
| `company` | string | Anonymized company ticker |
| `year` | string | Fiscal year of the annual report |
| `question` | string | Natural-language financial question |
| `type` | string | Question type: `table`, `text`, or `mixed` |
| `thoughts` | string | Chain-of-thought reasoning trace with page references |
| `page_numbers` | list[int] | Pages in the annual report that contain the relevant evidence |
| `python_code` | string | Executable Python snippet that computes the answer |
| `answer` | float | Ground-truth numerical answer |

## Usage

```python
from datasets import load_dataset

ds = load_dataset("Amian/FinLongDocQA")
print(ds["test"][0])
```

## License

This dataset is released under the **AI²Lab Source Code License (National Taiwan University)**. See the full license [here](LICENSE).
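
## Example: checking `python_code` against `answer`

The Fields table describes `python_code` as an executable snippet and `answer` as the ground-truth value, so one plausible sanity check is to run the snippet and compare the two. The sketch below is not part of the dataset's tooling. It assumes that the snippet's last line is a bare expression holding the result (as in the sample record above), and the helper name `run_snippet` and the tolerance of 0.005 are illustrative choices. The record is copied from the sample so that the block runs without downloading anything.

```python
import ast
import math


def run_snippet(code: str):
    """Run a snippet and return the value of its final expression.

    Assumption: the last statement is a bare expression (e.g. `round(avg,2)`).
    Returns None when the snippet does not end with an expression.
    """
    tree = ast.parse(code, mode="exec")
    namespace = {}
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Execute every statement except the last one.
        setup = ast.Module(body=tree.body[:-1], type_ignores=[])
        exec(compile(setup, "<python_code>", "exec"), namespace)
        # Evaluate the last statement to obtain its value.
        final = ast.Expression(body=tree.body[-1].value)
        return eval(compile(final, "<python_code>", "eval"), namespace)
    exec(compile(tree, "<python_code>", "exec"), namespace)
    return None


# Sample record copied from the Dataset Structure section.
record = {
    "id": "1",
    "type": "mixed",
    "python_code": "total_facilities=8\nsegments=3\navg=total_facilities/segments\nround(avg,2)",
    "answer": 2.67,
}

result = run_snippet(record["python_code"])
# Hypothetical tolerance: treat values within 0.005 of the label as a match.
matches = result is not None and math.isclose(result, record["answer"], abs_tol=0.005)
print(result, matches)
```

Because this approach relies on `exec` and `eval`, it is only appropriate for snippets from a source you trust; a sandboxed subprocess would be a safer choice for running the full test split.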

Provider:
Amian