
hamishivi/ROCStories

Source: Hugging Face. Updated: 2026-04-19. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/hamishivi/ROCStories
Description:
---
license: cc-by-4.0
task_categories:
- text-generation
language:
- en
size_categories:
- 10K<n<100K
---

# ROCStories (prompt / continuation)

A reformatted version of the [ROCStories](https://cs.rochester.edu/nlp/rocstories/) corpus, suitable for open-ended story-generation exercises.

Each example is a 5-sentence ROCStory, split into:

| field          | description                            |
|----------------|----------------------------------------|
| `prompt`       | the first sentence of the story        |
| `continuation` | the remaining four sentences           |
| `text`         | the full (unmodified) 5-sentence story |

## Splits

| split        | rows   |
|--------------|--------|
| `train`      | 70,676 |
| `validation` | 7,852  |
| `test`       | 19,633 |

The `train` / `validation` split is a 90/10 shuffle (seed = 42) of the train split from [mintujupally/ROCStories](https://huggingface.co/datasets/mintujupally/ROCStories); `test` is mintujupally's test split unchanged.

## Provenance

This dataset was reconstructed to replace `Ximing/ROCStories`, which is no longer available on the Hub, as the data source for the CSED503 / CSE447 Natural Language Generation assignment. The underlying stories come from the ROC Stories 2016 and 2017 releases (Mostafazadeh et al.).

Please cite the original corpus when using this data:

```
@inproceedings{mostafazadeh2016corpus,
  title     = {A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories},
  author    = {Mostafazadeh, Nasrin and Chambers, Nathanael and He, Xiaodong and Parikh, Devi and Batra, Dhruv and Vanderwende, Lucy and Kohli, Pushmeet and Allen, James},
  booktitle = {Proceedings of NAACL-HLT},
  year      = {2016}
}
```
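The field layout and the 90/10 split described above can be illustrated with a small standard-library sketch. It is not the script used to build this dataset. The helper names `make_record` and `split_90_10`, the single-space sentence join, and the example story are all assumptions made for illustration. The card does not say which shuffling routine or rounding rule was used, so this sketch will not reproduce the actual row assignment; holding out the floor of 10% is simply one rounding choice that matches the reported row counts.

```python
import random


def make_record(sentences):
    """Build a prompt/continuation/text record from a five-sentence story.

    Joining sentences with a single space is an assumption, not a
    documented property of the dataset.
    """
    if len(sentences) != 5:
        raise ValueError("expected exactly five sentences")
    return {
        "prompt": sentences[0],                  # first sentence
        "continuation": " ".join(sentences[1:]),  # remaining four sentences
        "text": " ".join(sentences),             # full story
    }


def split_90_10(rows, seed=42):
    """Shuffle with a fixed seed and hold out floor(10%) for validation.

    Uses Python's `random` module as a stand-in; the original shuffling
    routine is not stated in the dataset card.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = len(shuffled) // 10
    n_train = len(shuffled) - n_val
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical story written for this example, not taken from the corpus.
story = [
    "Mia wanted to learn to bake.",
    "She bought flour and sugar.",
    "Her first cake burned.",
    "She tried again the next day.",
    "The second cake was delicious.",
]
record = make_record(story)
print(record["prompt"])
print(record["continuation"])

# 70,676 + 7,852 = 78,528 rows before splitting; integers stand in for rows.
train, validation = split_90_10(range(78528))
print(len(train), len(validation))  # 70676 7852
```

With 78,528 source rows, the floor of 10% is 7,852, leaving 70,676 for `train`, which agrees with the Splits table.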
Provider:
hamishivi