Chinese Humor Dataset
Source: arXiv. Indexed: 2025-09-30.
Download link:
https://github.com/liuhuanyong/ChineseHumorSentiment
Description:
This dataset is designed for a comprehensive evaluation of how well pre-trained language models (PLMs) understand humor. It consists of four sub-datasets, each built for a distinct humor-related task: humor recognition, humor type classification, humor intensity classification, and punchline detection. The sub-datasets combine humorous texts drawn from various sources with non-humorous texts collected from online platforms, and all data has been manually annotated to ensure accuracy. The authors describe it as a large-scale dataset.
Provided by:
Authors of the paper



