WITS Dataset
Indexed from paperswithcode.com on 2025-03-25
Download link:
https://paperswithcode.com/dataset/wits
Description:
This dataset is an extension of MASAC, a multimodal, multi-party, Hindi-English code-mixed dialogue dataset compiled from the popular Indian TV show ‘Sarabhai v/s Sarabhai’. WITS was created by augmenting MASAC with a natural language explanation for each sarcastic dialogue. The dataset consists of the transcribed sarcastic dialogues from 55 episodes of the show, along with the corresponding audio and video signals. It was designed to facilitate Sarcasm Explanation in Dialogue (SED), a novel task that aims to generate a natural language explanation spelling out the intended irony of a given sarcastic dialogue. Each data instance in WITS is associated with a video, an audio track, and a textual transcript in which the last utterance is sarcastic. Every final selected explanation contains the following attributes:
• Sarcasm source: The speaker in the dialogue who is being sarcastic.
• Sarcasm target: The person or thing at which the sarcasm is directed.
• Action word: The verb or action that describes how the sarcasm takes place, e.g. mocking, insults, taunts.
• Description: A description of the scene to help contextualize the sarcasm.
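The instance layout described above can be sketched as a small set of Python dataclasses. This is only an illustration: the class names, field names, file paths, and the render template are assumptions made for this example and do not reflect the actual file format or schema of the WITS release. The sample values are placeholders, not data from the dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    speaker: str  # hypothetical field: who is speaking
    text: str     # hypothetical field: code-mixed Hindi-English transcript


@dataclass
class Explanation:
    source: str       # sarcasm source: the speaker being sarcastic
    target: str       # sarcasm target: person or thing the sarcasm is aimed at
    action_word: str  # e.g. "mocks", "insults", "taunts"
    description: str  # scene description that contextualizes the sarcasm

    def render(self) -> str:
        # Illustrative template only; not the wording used in the dataset.
        return f"{self.source} {self.action_word} {self.target}: {self.description}"


@dataclass
class WitsInstance:
    dialogue: List[Utterance]  # context turns; the last one is sarcastic
    video_path: str            # hypothetical path to the video clip
    audio_path: str            # hypothetical path to the audio clip
    explanation: Explanation

    def sarcastic_utterance(self) -> Utterance:
        # By the dataset's construction, the final utterance is the sarcastic one.
        return self.dialogue[-1]


# Placeholder instance, invented for illustration.
example = WitsInstance(
    dialogue=[
        Utterance("Speaker A", "placeholder context utterance"),
        Utterance("Speaker B", "placeholder sarcastic utterance"),
    ],
    video_path="clips/example.mp4",
    audio_path="clips/example.wav",
    explanation=Explanation(
        source="Speaker B",
        target="Speaker A",
        action_word="mocks",
        description="placeholder scene description",
    ),
)

print(example.sarcastic_utterance().speaker)
print(example.explanation.render())
```

Keeping the four explanation attributes as separate fields, rather than a single free-text string, makes it straightforward to check that a generated explanation names the right source and target.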
Provided by:
paperswithcode.com



