
thinkwee/DDRBench_10K

Source: Hugging Face. Updated 2026-02-03; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/thinkwee/DDRBench_10K
Resource description:
---
license: cc-by-sa-4.0
task_categories:
- table-question-answering
language:
- en
tags:
- finance
- 10k
- edgar
- agent
- benchmark
size_categories:
- 1M<n<10M
configs:
- config_name: column_documentation
  data_files:
  - split: train
    path: data/column_documentation/column_documentation.parquet
- config_name: company_addresses
  data_files:
  - split: train
    path: data/company_addresses/company_addresses.parquet
- config_name: column_comments
  data_files:
  - split: train
    path: data/column_comments/column_comments.parquet
- config_name: sqlite_sequence
  data_files:
  - split: train
    path: data/sqlite_sequence/sqlite_sequence.parquet
- config_name: table_documentation
  data_files:
  - split: train
    path: data/table_documentation/table_documentation.parquet
- config_name: companies
  data_files:
  - split: train
    path: data/companies/companies.parquet
- config_name: filings
  data_files:
  - split: train
    path: data/filings/filings.parquet
- config_name: financial_facts
  data_files:
  - split: train
    path: data/financial_facts/financial_facts.parquet
- config_name: company_tickers
  data_files:
  - split: train
    path: data/company_tickers/company_tickers.parquet
- config_name: table_comments
  data_files:
  - split: train
    path: data/table_comments/table_comments.parquet
---

# DDRBench: Deep Data Research Benchmark

[**📊 Leaderboard & Demo**](https://huggingface.co/spaces/thinkwee/DDR_Bench) | [**📄 Paper (arXiv)**](https://arxiv.org/abs/2602.02039)

## Overview

**DDRBench (Deep Data Research Benchmark)** is an evaluation framework for assessing how well Large Language Model (LLM) agents perform complex, multi-turn data research and reasoning tasks. Unlike traditional Q&A benchmarks, DDRBench focuses on scenarios that require deep interaction with structured databases, tool usage, and long-context reasoning.

This dataset repository hosts the **10-K Financial Database**, a core component of the DDRBench suite. It contains structured financial data extracted from SEC 10-K filings, enabling agents to answer intricate financial questions that mimic real-world analyst workflows.

## Dataset Structure

The dataset is organized into multiple configurations (subsets), each representing a table from the underlying SQLite database:

* **`financial_facts`**: The primary table, containing over 5 million financial metrics (US-GAAP, IFRS) with values, units, and fiscal periods.
* **`companies`**: Registry of companies with CIK, names, and SIC codes.
* **`filings`**: Metadata for the SEC filings used as source documents.
* **`company_addresses`** & **`company_tickers`**: Geographic and market identification data.
* **`table_documentation`** & **`column_documentation`**: Meta-information that describes the database schema to the agents.

## Usage

### Data Inspection

Load specific tables using the `datasets` library:

```python
from datasets import load_dataset

# Load the main financial facts table
financial_facts = load_dataset("thinkwee/DDRBench_10K", "financial_facts")

# Load company information
companies = load_dataset("thinkwee/DDRBench_10K", "companies")
```

For agent trajectories and evaluation logs, refer to the [DDRBench Trajectory Dataset](https://huggingface.co/datasets/thinkwee/DDRBench_10K_trajectory).

### Run Deep Data Research

Use the database file under the `/raw` path, and refer to https://github.com/thinkwee/DDR_Bench for instructions on running the agent.
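
Before running an agent, it can help to check which tables and columns the SQLite database actually exposes. The following is a minimal sketch using only Python's standard `sqlite3` module. The helper names (`list_tables`, `list_columns`, `build_demo_db`) and the demo columns are illustrative assumptions, not part of DDRBench, and the name of the database file under `/raw` is not given here, so the example runs against a small in-memory stand-in.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def list_columns(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return the column names of one table, in declaration order."""
    # PRAGMA statements do not accept bound parameters, so only pass
    # table names that came from list_tables().
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [row[1] for row in rows]  # row[1] holds the column name


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory stand-in; these columns are invented."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE companies (cik TEXT, name TEXT)")
    conn.execute("CREATE TABLE filings (accession TEXT, cik TEXT)")
    return conn


if __name__ == "__main__":
    # For the real data, replace build_demo_db() with
    # sqlite3.connect(<path to the database file downloaded from /raw>).
    conn = build_demo_db()
    for table in list_tables(conn):
        print(table, list_columns(conn, table))
```

Against the real database, the same two helpers would list whatever tables and columns the file contains; the `table_documentation` and `column_documentation` tables described above are the intended source for what each one means.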
Provider:
thinkwee