ReJ dataset
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/Humor-Research/hri_tools
Description:
The ReJ dataset is a curated collection of Reddit jokes selected by user ratings, intended as a benchmark for humor generation. Only jokes judged high quality by their upvote counts are retained, and a cleaning step removes distracting elements that could hinder humor comprehension. The dataset contains 156 jokes, subsampled from Reddit. Its intended task is evaluating the quality of generated humor.
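The description names three steps: filtering by upvotes, cleaning, and subsampling. A minimal Python sketch of such a pipeline might look like the following. It is not the authors' code: the `Joke` record, the `clean` rules, the `min_upvotes` threshold, and the `build_subsample` helper are all hypothetical, since the listing does not state the actual threshold or cleaning rules.

```python
import random
import re
from dataclasses import dataclass


@dataclass
class Joke:
    text: str
    upvotes: int


def clean(text: str) -> str:
    """Strip elements assumed to distract from the joke (hypothetical rules)."""
    text = re.sub(r"https?://\S+", "", text)          # drop URLs
    text = re.sub(r"(?im)^\s*edit\s*:.*$", "", text)  # drop "Edit: ..." lines
    return re.sub(r"\s+", " ", text).strip()          # normalize whitespace


def build_subsample(jokes, min_upvotes, size, seed=0):
    """Keep highly upvoted jokes, clean them, and draw a fixed-size sample."""
    # Filter on the upvote threshold (assumed; the real cutoff is not given).
    kept = [Joke(clean(j.text), j.upvotes) for j in jokes if j.upvotes >= min_upvotes]
    # Discard jokes that became empty after cleaning.
    kept = [j for j in kept if j.text]
    # Seeded sampling keeps the subsample reproducible.
    rng = random.Random(seed)
    return rng.sample(kept, min(size, len(kept)))
```

Calling `build_subsample(jokes, min_upvotes=100, size=156)` on a list of `Joke` records would return at most 156 cleaned jokes; the value 100 is only a placeholder for illustration.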
Provider:
ScaleAI



