Chinese Humor Dataset

Name: Chinese Humor Dataset
Creator: Authors of the paper
License: No description available

Source: arXiv, indexed 2025-09-30

Download link:

https://github.com/liuhuanyong/ChineseHumorSentiment


Description:

This dataset is designed to comprehensively evaluate how well pre-trained language models (PLMs) understand humor. It consists of four sub-datasets, one for each humor-related task: humor recognition, humor type classification, humor intensity classification, and punchline detection. The sub-datasets combine humorous texts drawn from various sources with non-humorous texts collected from online platforms, and all data has been manually annotated to ensure accuracy. The dataset is described as large-scale.
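The four-task layout described above could be organized for evaluation roughly as follows. This is only a minimal sketch: the `HumorExample` record, its field names, and the label encoding are hypothetical, since the page does not document the actual file format of the sub-datasets.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Union


class HumorTask(Enum):
    """The four tasks named in the description, one per sub-dataset."""
    RECOGNITION = "humor recognition"
    TYPE_CLASSIFICATION = "humor type classification"
    INTENSITY_CLASSIFICATION = "humor intensity classification"
    PUNCHLINE_DETECTION = "punchline detection"


@dataclass
class HumorExample:
    # Hypothetical record layout; the real files may use different fields.
    task: HumorTask
    text: str
    label: Union[int, str]  # label encoding is an assumption


def group_by_task(examples: List[HumorExample]) -> Dict[HumorTask, List[HumorExample]]:
    """Bucket examples by task so each sub-dataset can be scored separately."""
    buckets: Dict[HumorTask, List[HumorExample]] = {task: [] for task in HumorTask}
    for ex in examples:
        buckets[ex.task].append(ex)
    return buckets
```

Keeping the tasks in separate buckets reflects the description's point that each sub-dataset targets a distinct task, so scores would normally be reported per task rather than pooled.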

Provided by:

Authors of the paper
