yuyang/distil_cnndm
Source: Hugging Face | Updated: 2023-05-14 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/yuyang/distil_cnndm
Resource description:
# Distilled CNN/DailyMail Dataset
This folder contains the distilled data and the dataset loading script used to build a dataset from it.
- `cnn_bart_pl` is downloaded from [Saved Pseudo-Labels](https://github.com/huggingface/transformers/blob/main/examples/research_projects/seq2seq-distillation/precomputed_pseudo_labels.md). It is generated by facebook/bart-large-cnn, and it corresponds to version "1.0.0". It contains train/validation/test splits.
- `pegasus_cnn_cnn_pls` is also downloaded from [Saved Pseudo-Labels](https://github.com/huggingface/transformers/blob/main/examples/research_projects/seq2seq-distillation/precomputed_pseudo_labels.md). It is generated by sshleifer/pegasus-cnn-ft-v2, and it corresponds to version "2.0.0". It only includes the train split.
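
The version-to-split mapping above can be sketched as a small lookup that checks a request before attempting a download. This is a minimal sketch, not the repository's own loading script: the helper names `available_splits` and `load_distilled` are hypothetical, and it assumes the version string ("1.0.0" or "2.0.0") is also the configuration name accepted by `datasets.load_dataset`. Because the dataset relies on a loading script, recent releases of the `datasets` library may require extra opt-in or may not support it at all.

```python
from typing import Dict, Tuple

REPO_ID = "yuyang/distil_cnndm"

# version -> (source folder, teacher model, available splits), as listed above
CONFIGS: Dict[str, Tuple[str, str, Tuple[str, ...]]] = {
    "1.0.0": ("cnn_bart_pl", "facebook/bart-large-cnn",
              ("train", "validation", "test")),
    "2.0.0": ("pegasus_cnn_cnn_pls", "sshleifer/pegasus-cnn-ft-v2",
              ("train",)),
}


def available_splits(version: str) -> Tuple[str, ...]:
    """Return the splits documented for a given version (hypothetical helper)."""
    if version not in CONFIGS:
        raise ValueError(f"unknown version {version!r}; expected one of {sorted(CONFIGS)}")
    return CONFIGS[version][2]


def load_distilled(version: str, split: str = "train"):
    """Load one split, failing early if the version does not provide it.

    ASSUMPTION: the version string doubles as the config name of the
    loading script; verify this against the repository before relying on it.
    """
    if split not in available_splits(version):
        raise ValueError(f"version {version!r} has no {split!r} split")
    # Imported lazily so the split check works without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, name=version, split=split)
```

For example, `load_distilled("2.0.0", "test")` raises `ValueError` before any network access, since the Pegasus pseudo-labels only ship a train split.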
## Updates
- 03/16/2023
1. Removed "(CNN)" from the beginning of articles.
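
The cleanup described in this update could look roughly like the following. It is only an illustrative sketch: the function name `strip_cnn_prefix` is hypothetical, and the actual preprocessing used for the dataset is not shown in this card. It removes a leading "(CNN)" marker and leaves occurrences elsewhere in the article untouched.

```python
import re

# Matches "(CNN)" only at the very start of the text, plus surrounding whitespace.
_CNN_PREFIX = re.compile(r"^\s*\(CNN\)\s*")


def strip_cnn_prefix(article: str) -> str:
    """Drop a leading "(CNN)" marker; hypothetical helper for illustration."""
    return _CNN_PREFIX.sub("", article, count=1)


print(strip_cnn_prefix("(CNN) A storm hit the coast."))  # A storm hit the coast.
```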
Provided by:
yuyang
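Summary of original information
Distilled CNN/DailyMail Dataset overview
Dataset composition
- cnn_bart_pl
  - Source: Saved Pseudo-Labels
  - Generating model: facebook/bart-large-cnn
  - Version: "1.0.0"
  - Contents: train, validation, and test splits
- pegasus_cnn_cnn_pls
  - Source: Saved Pseudo-Labels
  - Generating model: sshleifer/pegasus-cnn-ft-v2
  - Version: "2.0.0"
  - Contents: train split
Update log
- March 16, 2023
  - Removed "(CNN)" from the beginning of articles.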



