ARSynopsis/Combined_ROO_Liquidity_Dataset

Name: ARSynopsis/Combined_ROO_Liquidity_Dataset
Creator: ARSynopsis
Published: 2024-09-30 12:21:54
License: No description available

Source: Hugging Face (updated 2024-09-30, indexed 2025-11-01)

Download link:

https://hf-mirror.com/datasets/ARSynopsis/Combined_ROO_Liquidity_Dataset

Resource overview:

---
dataset_info:
  features:
  - name: document
    dtype: string
  - name: summary
    dtype: string
  - name: source
    dtype: string
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 1143746144
    num_examples: 83254
  - name: validation
    num_bytes: 142815263
    num_examples: 10405
  - name: test
    num_bytes: 143020108
    num_examples: 10405
  download_size: 637677002
  dataset_size: 1429581515
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
task_categories:
- summarization
language:
- en
tags:
- finance
size_categories:
- 100K<n<1M
---

# Dataset Card for Combined_ROO_Liquidity_Dataset

This dataset is designed for text summarization tasks, specifically focusing on financial and liquidity data. It combines structured text from different segments of financial reports, allowing for both automatic and human evaluation in text summarization tasks.

## Dataset Details

This dataset was built from the dataset presented in the research paper "**Long Text and Multi-Table Summarization: Dataset and Method**". It consists of detailed financial reports and their corresponding summaries, which condense lengthy documents into shorter, coherent text.

Paper Reference: [Long Text and Multi-Table Summarization: Dataset and Method](https://arxiv.org/abs/2302.03815)

### Dataset Description

**Dataset Structure**

The dataset is divided into:

- Train: The primary split for model training (83,254 examples).
- Validation: Used for validation during training (10,405 examples).
- Test: Used for final evaluation of the summarization models (10,405 examples).

Each entry consists of:

- document: The full input document, around 2500 words in length. The card's prose calls this field `text`, but the feature schema names it `document`.
- summary: A condensed version of the document, around 350 words long.

The feature schema also lists two fields that the card does not describe: `source` (string) and `__index_level_0__` (int64).
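
The split sizes in the card metadata can be sanity-checked with a little arithmetic. The following is a minimal sketch, not part of the original card: the constants are copied from the `dataset_info` block, while the helper names `split_fractions` and `load_and_check` are hypothetical. `load_and_check` assumes the Hugging Face `datasets` package is installed and that the repository is reachable over the network; it is defined but not called here.

```python
# Split sizes and byte counts copied from the dataset_info block of the card.
CARD_SPLITS = {
    "train": {"num_examples": 83254, "num_bytes": 1143746144},
    "validation": {"num_examples": 10405, "num_bytes": 142815263},
    "test": {"num_examples": 10405, "num_bytes": 143020108},
}
CARD_DATASET_SIZE = 1429581515  # dataset_size from the card
REPO_ID = "ARSynopsis/Combined_ROO_Liquidity_Dataset"


def split_fractions(splits):
    """Return each split's share of the total number of examples."""
    total = sum(s["num_examples"] for s in splits.values())
    return {name: s["num_examples"] / total for name, s in splits.items()}


def load_and_check(repo_id=REPO_ID):
    """Download the dataset and compare row counts with the card.

    Hypothetical helper: needs network access and `pip install datasets`.
    """
    from datasets import load_dataset  # imported lazily on purpose

    ds = load_dataset(repo_id)
    for name, meta in CARD_SPLITS.items():
        # Each split's row count should match num_examples in the card.
        assert ds[name].num_rows == meta["num_examples"], name
    return ds


if __name__ == "__main__":
    # Offline check only: no download happens here.
    for name, frac in split_fractions(CARD_SPLITS).items():
        print(f"{name}: {frac:.1%}")
    total_bytes = sum(s["num_bytes"] for s in CARD_SPLITS.values())
    print("bytes match dataset_size:", total_bytes == CARD_DATASET_SIZE)
```

Under these numbers the splits come out at roughly 80/10/10, and the per-split byte counts sum to the stated `dataset_size`. After calling `load_and_check()`, a record's fields would be accessed by the schema names, for example `ds["train"][0]["document"]` and `ds["train"][0]["summary"]`.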

Provider:

ARSynopsis
