Humicroedit

Indexed from arXiv on 2025-09-30
Download link:
https://zenodo.org/record/3969509#.XyWh6fhKh24
Overview:
Humicroedit contains approximately 5,000 original news headlines, each paired with three edited, potentially humorous versions, for a total of 15,095 edited headlines. The original headlines were collected from Reddit and come from major English-language news outlets. Five judges rated the humor of each edited headline, and the dataset is split into a training set (64%), a development set (16%), and a test set (20%). The task is to assess how funny an edited headline is.
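As a rough illustration of the scale described above, the following sketch estimates how many edited headlines fall in each split if the stated percentages are applied to the 15,095 total. The function name `approximate_split_sizes` and the rule of assigning the rounding remainder to the first split are assumptions made for this example; the split sizes in the released files may differ slightly from these estimates.

```python
# Approximate split sizes from the percentages quoted in the overview.
# These are estimates only; the released files may use slightly different counts.

TOTAL_EDITED = 15_095  # total edited headlines, as stated in the overview
SPLIT_PERCENT = {"train": 64, "dev": 16, "test": 20}  # stated split percentages


def approximate_split_sizes(total, percentages):
    """Return integer split sizes that sum exactly to `total`.

    Integer arithmetic avoids floating-point rounding surprises.
    Any leftover items (from flooring) go to the first split, which is
    an arbitrary choice made for this sketch.
    """
    sizes = {name: total * pct // 100 for name, pct in percentages.items()}
    remainder = total - sum(sizes.values())
    first_split = next(iter(sizes))
    sizes[first_split] += remainder
    return sizes


sizes = approximate_split_sizes(TOTAL_EDITED, SPLIT_PERCENT)
for name, count in sizes.items():
    print(f"{name}: about {count} edited headlines")
```

The estimates sum to the stated total by construction, so they are mainly useful as a sanity check when loading the data from the download link.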
Provider:
Amazon Mechanical Turk