hamishivi/ROCStories
Source: Hugging Face. Updated 2026-04-19; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/hamishivi/ROCStories
Resource description:
---
license: cc-by-4.0
task_categories:
- text-generation
language:
- en
size_categories:
- 10K<n<100K
---
# ROCStories (prompt / continuation)
A reformatted version of the [ROCStories](https://cs.rochester.edu/nlp/rocstories/)
corpus, suitable for open-ended story-generation exercises.
Each example is a 5-sentence ROCStory, split into:
| field | description |
|---------------|------------------------------------------------------|
| `prompt` | the first sentence of the story |
| `continuation`| the remaining four sentences |
| `text` | the full (unmodified) 5-sentence story |
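To make the field layout concrete, here is a minimal sketch of how one story could be split into the three fields. The card does not say how sentence boundaries were found, so the regex splitter, the `split_story` helper, and the sample story are all assumptions made up for illustration; they are not the author's preprocessing code.

```python
import re


def split_story(text: str) -> dict:
    """Split one story into prompt / continuation / text (hypothetical helper)."""
    # Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
    # This naive rule breaks on abbreviations such as "Mr. Smith".
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    prompt, rest = sentences[0], sentences[1:]
    return {
        "prompt": prompt,                # first sentence
        "continuation": " ".join(rest),  # remaining sentences
        "text": text,                    # full story, unmodified
    }


# Invented five-sentence story, not taken from the corpus.
example = split_story(
    "Tom bought a kite. He took it to the park. The wind was strong. "
    "The kite flew very high. Tom went home happy."
)
print(example["prompt"])
print(example["continuation"])
```

For real use, a proper sentence tokenizer would likely be safer than this regex.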
## Splits
| split | rows |
|--------------|---------|
| `train` | 70,676 |
| `validation` | 7,852 |
| `test` | 19,633 |
The `train` / `validation` split is a 90/10 shuffle (seed = 42) of the train
split from [mintujupally/ROCStories](https://huggingface.co/datasets/mintujupally/ROCStories);
`test` is mintujupally's test split unchanged.
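The 90/10 shuffle can be sketched in plain Python as below. This is an illustration under assumptions, not the script used to build the dataset: the card does not name the tool, so `split_90_10` is a hypothetical helper, and a different shuffler would assign different rows to each split even with the same seed. Flooring the validation size is also an assumption, though it reproduces the row counts in the table (70,676 + 7,852 = 78,528).

```python
import random


def split_90_10(rows, seed=42):
    """Shuffle a copy of rows and hold out one tenth for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_val = len(shuffled) // 10            # assumption: validation size is floored
    return shuffled[n_val:], shuffled[:n_val]


# Stand-in for the 78,528 rows of the upstream train split.
train, validation = split_90_10(range(78_528))
print(len(train), len(validation))  # 70676 7852
```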
## Provenance
This dataset was reconstructed to replace
`Ximing/ROCStories`, which is no longer available on the Hub, as the data
source for the CSED503 / CSE447 Natural Language Generation assignment.
The underlying stories come from the ROC Stories 2016 and 2017 releases
(Mostafazadeh et al.). Please cite the original corpus when using this data:
```
@inproceedings{mostafazadeh2016corpus,
title = {A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories},
author = {Mostafazadeh, Nasrin and Chambers, Nathanael and He, Xiaodong and Parikh, Devi and Batra, Dhruv and Vanderwende, Lucy and Kohli, Pushmeet and Allen, James},
booktitle = {Proceedings of NAACL-HLT},
year = {2016}
}
```
Provider:
hamishivi



