Crowd-Annotated Spanish Corpus for Humor Analysis

Source: arXiv. Updated: 2018-07-19. Indexed: 2024-06-21.
Download link:
https://pln-fing-udelar.github.io/humor
Description:
This dataset, the *Crowd-Annotated Spanish Corpus for Humor Analysis*, was created by the Natural Language Processing Group at the Universidad de la República, Uruguay. It contains 27,282 Spanish-language tweets drawn from both humorous and non-humorous accounts. Each tweet received about four annotations on average, covering its humor value (whether it is humorous) and its humor rating. The tweets were extracted from selected accounts and from real-time samples, then annotated through a crowdsourced web task. The corpus is intended mainly for building Spanish-language humor classifiers, and as a first step toward research on humor and its subjectivity.
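Because each tweet carries several crowd annotations, a consumer of the corpus typically needs to collapse them into one label per tweet. The following is a minimal sketch of one way to do that, using majority vote for the humor value and the mean for the rating. The record layout, the tweet IDs, the rating values, and the tie-breaking rule are all assumptions made for illustration; they do not describe the corpus's actual file format or the authors' aggregation method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotation records: (tweet_id, is_humorous, rating or None).
# Field names and rating values are illustrative, not the corpus schema.
annotations = [
    ("t1", True, 4),
    ("t1", True, 3),
    ("t1", False, None),
    ("t1", True, 5),
    ("t2", False, None),
    ("t2", False, None),
    ("t2", True, 1),
]


def aggregate(records):
    """Majority-vote the humor label and average the ratings per tweet."""
    by_tweet = defaultdict(list)
    for tweet_id, is_humorous, rating in records:
        by_tweet[tweet_id].append((is_humorous, rating))

    summary = {}
    for tweet_id, votes in by_tweet.items():
        positive = sum(1 for is_humorous, _ in votes if is_humorous)
        # Only annotators who marked the tweet as humorous contribute a rating.
        ratings = [r for is_humorous, r in votes if is_humorous and r is not None]
        summary[tweet_id] = {
            "n_annotations": len(votes),
            # Assumed rule: a strict majority is required; ties count as not humorous.
            "is_humorous": positive > len(votes) / 2,
            "mean_rating": mean(ratings) if ratings else None,
        }
    return summary


result = aggregate(annotations)
```

Keeping `n_annotations` alongside the aggregated label makes it possible to filter out tweets with too few votes, which matters when studying how much annotators disagree.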
Provided by:
Natural Language Processing Group, School of Engineering, Universidad de la República, Uruguay
Created:
2017-10-02