Isotonic/DialogSumm
Hugging Face · Updated 2024-02-10 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/Isotonic/DialogSumm
Resource description:
---
language:
- en
license: cc-by-nc-sa-4.0
size_categories:
- 10K<n<100K
task_categories:
- summarization
- text-generation
- text2text-generation
dataset_info:
  features:
  - name: dialogue
    dtype: string
  - name: summary
    dtype: string
  splits:
  - name: train
    num_bytes: 48177311.0
    num_examples: 52480
  download_size: 29232356
  dataset_size: 48177311.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
## DialogSumm
DialogSumm is a mixture of the following dialog datasets:
- [dialogsum](https://huggingface.co/datasets/knkarthick/dialogsum)
- [samsum](https://huggingface.co/datasets/samsum)
- [MocktaiLEngineer/qmsum-processed](https://huggingface.co/datasets/MocktaiLEngineer/qmsum-processed)
- [npc-engine/light-batch-summarize-dialogue](https://huggingface.co/datasets/npc-engine/light-batch-summarize-dialogue)
## 💻 Usage
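```python
from datasets import load_dataset

# Downloads the default config; the only split is "train",
# with two string columns: "dialogue" and "summary".
dataset = load_dataset("Isotonic/DialogSumm")
```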
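As a rough sketch of what might come next, the snippet below turns a row into a prompt/target pair and carves a held-out split out of `train`, since the card lists no validation split. The helper names `format_example` and `load_splits`, the `"summarize: "` prefix, the 10% split size, and the sample row are all illustrative assumptions, not part of the dataset.

```python
def format_example(example: dict) -> dict:
    """Turn one row into a prompt/target pair for a seq2seq summarizer."""
    # The "summarize: " prefix is an illustrative choice, not part of the dataset.
    return {
        "prompt": "summarize: " + example["dialogue"].strip(),
        "target": example["summary"].strip(),
    }


def load_splits(test_size: float = 0.1, seed: int = 42):
    """Download the dataset and hold out part of `train` (needs network access)."""
    from datasets import load_dataset  # imported lazily so the helper above works offline

    train = load_dataset("Isotonic/DialogSumm", split="train")
    # Only a `train` split is published, so a held-out set has to be made locally.
    splits = train.train_test_split(test_size=test_size, seed=seed)
    # Adds "prompt" and "target" columns to both the "train" and "test" splits.
    return splits.map(format_example)


# Hypothetical row using the two documented string columns.
sample = {
    "dialogue": "A: Are we still on for lunch?\nB: Yes, noon at the cafe.\n",
    "summary": "A and B confirm lunch at noon at the cafe. ",
}
formatted = format_example(sample)
print(formatted["prompt"])
print(formatted["target"])
```

Calling `load_splits()` would return a `DatasetDict` with `train` and `test` keys; fixing `seed` keeps the split reproducible across runs.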
🚀🚀 Next: DialogSumm + [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) + [mediasum](https://huggingface.co/datasets/ccdv/mediasum) + [EdinburghNLP/xsum](https://huggingface.co/datasets/EdinburghNLP/xsum)
Provider:
Isotonic
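Summary of original information
Dataset overview
Basic information
- Language: English
- License: CC BY-NC-SA 4.0
- Size: 10K<n<100K
Task categories
- Summarization
- Text generation
- Text-to-text generation
Dataset structure
- Features:
  - dialogue: string
  - summary: string
- Splits:
  - train:
    - Bytes: 48177311.0
    - Examples: 52480
- Download size: 29232356
- Dataset size: 48177311.0
Configuration
- Default config:
  - Data files:
    - train: data/train-*
Data sources
- DialogSumm is a mixture of the following dialog datasets:
  - dialogsum
  - samsum
  - MocktaiLEngineer/qmsum-processed
  - npc-engine/light-batch-summarize-dialogue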



