Reddit TIFU
Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/ctr4si/mmn
Description:
Reddit TIFU is a dataset of about 120,000 (120K) posts drawn from online conversations on Reddit, built for abstractive text summarization: the goal is to generate a "tldr" summary for each post. The examples are relatively short, averaging 432 words per post and 23 words per summary.
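The reported averages imply that a summary is roughly one nineteenth the length of its post. The sketch below shows how such statistics could be computed over (post, summary) pairs. It is a minimal illustration that assumes whitespace tokenization. The helper names and the toy pairs are hypothetical and are not taken from the dataset or its repository.

```python
def average_word_count(texts):
    """Mean number of whitespace-separated tokens per text."""
    return sum(len(t.split()) for t in texts) / len(texts)


def compression_ratio(posts, summaries):
    """Average post length divided by average summary length."""
    return average_word_count(posts) / average_word_count(summaries)


# Made-up (post, summary) pairs for illustration only; not real dataset rows.
toy_pairs = [
    ("i left my coffee on the roof of my car and drove off this morning",
     "left coffee on car roof"),
    ("i replied all to a company wide email with a joke meant for one friend",
     "replied all with a joke"),
]
toy_posts = [post for post, _ in toy_pairs]
toy_summaries = [summary for _, summary in toy_pairs]

# Ratio implied by the averages stated in the description (432 and 23 words).
reported_ratio = 432 / 23

print(f"toy compression ratio: {compression_ratio(toy_posts, toy_summaries):.1f}")
print(f"reported compression ratio: {reported_ratio:.1f}")
```

The actual figures depend on the tokenizer used, so numbers recomputed this way may differ slightly from the reported 432 and 23.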
Provider:
Reddit



