SPR
OpenDataLab | Updated: 2026-05-17 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/SPR
Resource description:
Source: The 2nd "Calf Cup" (小牛杯) Humor Computing evaluation: Sitcom Joke Recognition task
Humor is a distinctive form of linguistic expression that helps defuse awkward situations, liven up the atmosphere, and ease communication in daily life. Humor computing has become an active research topic in natural language processing in recent years. It studies how to recognize, classify, and generate humor computationally, and it has both theoretical and practical value.
Humor usually depends on context. Humor in dialogue, for instance, typically needs a setup before it lands, which makes it harder to recognize than single-sentence humor. In sitcoms, the line that draws the audience's laughter is commonly called the punchline. This task uses a sitcom as its material and requires identifying the jokes in its dialogue.
The evaluation uses *I Love My Family* (《我爱我家》) as its data source. The sitcom is divided into multiple dialogues according to changes in scene and plot. Within each dialogue, different characters interact, producing a continuous series of utterances. The utterances in a dialogue appear in order and are contextually related. Unlike single-sentence humor, the humor of an utterance may come from its context rather than from its own content, so deciding whether an utterance is humorous, and thereby identifying the sitcom's jokes, requires taking the surrounding context into account.
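The dialogue/utterance structure described above can be sketched in Python as follows. This is only an illustrative sketch: the listing does not document the dataset's actual file format, so the class names, the field names (`speaker`, `text`, `is_punchline`), and the `with_context` helper are all assumptions made here, not part of the SPR release. The helper pairs each utterance with a few preceding utterances, which is one simple way to give a classifier the context that the task description says humor depends on.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    speaker: str        # hypothetical field: name of the character speaking
    text: str           # hypothetical field: what the character says
    is_punchline: bool  # hypothetical label: True if the line is a joke


@dataclass
class Dialogue:
    dialogue_id: int
    utterances: List[Utterance]  # kept in the order they are spoken


def with_context(dialogue: Dialogue, window: int = 2) -> List[Tuple[str, str, bool]]:
    """Pair every utterance with up to `window` preceding utterances.

    Returns (context, target, label) triples; the context is an empty
    string for the first utterance of a dialogue.
    """
    examples = []
    for i, utt in enumerate(dialogue.utterances):
        start = max(0, i - window)
        context = " ".join(
            f"{u.speaker}: {u.text}" for u in dialogue.utterances[start:i]
        )
        target = f"{utt.speaker}: {utt.text}"
        examples.append((context, target, utt.is_punchline))
    return examples


if __name__ == "__main__":
    # Made-up placeholder lines, not taken from the dataset.
    toy = Dialogue(
        dialogue_id=0,
        utterances=[
            Utterance("A", "setup line", False),
            Utterance("B", "reply line", False),
            Utterance("A", "punchline", True),
        ],
    )
    for context, target, label in with_context(toy, window=2):
        print(repr(context), "->", repr(target), label)
```

A fixed-size window is just one possible choice; a model could equally take the whole preceding dialogue as context.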
Provider:
OpenDataLab
Created:
2023-05-15
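Collected Summary
Dataset Introduction

Background and Challenges
Background Overview
The SPR dataset supports humor computing research, focusing on joke recognition in sitcoms. It is built from dialogue in *I Love My Family* (《我爱我家》) and emphasizes the role of context in recognizing humor.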