CRD3 (Critical Role Dungeons and Dragons Dataset)
Source: OpenDataLab | Updated: 2026-05-31 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/CRD3
Description:
This paper describes the Critical Role Dungeons and Dragons Dataset (CRD3) and related analyses. Critical Role is an unscripted, live-streamed show in which a fixed group of people play Dungeons and Dragons, an open-ended role-playing game. The dataset is collected from 159 Critical Role episodes transcribed to text dialogues, consisting of 398,682 turns. It also includes corresponding abstractive summaries collected from the Fandom wiki. The dataset is linguistically unique in that the narratives are generated entirely through player collaboration and spoken interaction. For each dialogue, there are a large number of turns, multiple abstractive summaries with varying levels of detail, and semantic ties to the previous dialogues. In addition, the authors provide a data augmentation method that produces 34,243 summary-dialogue chunk pairs to support current neural machine learning (ML) approaches, along with an abstractive summarization benchmark and evaluation.
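To make the idea of a "summary-dialogue chunk pair" concrete, the sketch below pairs each summary sentence with a contiguous block of dialogue turns. It is only an illustration: the `Turn` and `ChunkPair` types, the `pair_chunks` function, and the toy dialogue are all hypothetical, and the naive even split used here is an assumption for demonstration, not the augmentation method described by the dataset's authors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One dialogue turn: who spoke and what was said (hypothetical schema)."""
    speaker: str
    text: str


@dataclass
class ChunkPair:
    """A summary sentence paired with a contiguous run of turns."""
    summary: str
    turns: List[Turn]


def pair_chunks(turns: List[Turn], summary_sentences: List[str]) -> List[ChunkPair]:
    """Naively split turns into contiguous, near-equal chunks, one per sentence.

    This even split is a stand-in for a real alignment step; it is NOT the
    method used to build CRD3.
    """
    n = len(summary_sentences)
    if n == 0:
        return []
    pairs = []
    for i, sentence in enumerate(summary_sentences):
        # Integer arithmetic keeps chunks contiguous and covers every turn once.
        start = i * len(turns) // n
        end = (i + 1) * len(turns) // n
        pairs.append(ChunkPair(summary=sentence, turns=turns[start:end]))
    return pairs


# Toy, made-up dialogue (not taken from the dataset).
turns = [
    Turn("DM", "You enter the tavern."),
    Turn("PLAYER_A", "I look around for the innkeeper."),
    Turn("DM", "She waves you over."),
    Turn("PLAYER_B", "I order a drink."),
]
summary = ["The party enters a tavern.", "They meet the innkeeper."]
pairs = pair_chunks(turns, summary)
```

Each resulting `ChunkPair` can serve as one (source, target) training example for a summarization model, which is the general purpose such pairs are meant to serve; a real pipeline would align chunks by content rather than by position.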
Provided by:
OpenDataLab
Created:
2022-06-23
Collected summary
Background and challenges
Background overview
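The CRD3 dataset is built from transcripts of 159 episodes of the live-streamed show Critical Role. It contains roughly 398,000 dialogue turns with corresponding abstractive summaries, and its narrative is generated through player collaboration. The dataset comes with a data augmentation method to support neural models and establishes an evaluation benchmark for abstractive summarization.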
The summary above was collected and generated by 遇见数据集.



