STAC Corpus
Source: arXiv. Indexed 2025-09-30.
Download link:
https://www.irit.fr/STAC/corpus.html
Description:
This dataset is a multi-party dialogue corpus collected from online games and annotated following Segmented Discourse Representation Theory (SDRT). The training split contains 1,062 dialogues, 11,711 elementary discourse units (EDUs), and 11,350 discourse relations; the test split contains 111 dialogues, 1,156 EDUs, and 1,126 discourse relations. Complex discourse units (CDUs) are excluded, so the task reduces to discourse dependency parsing over EDUs.
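To put the reported split sizes in perspective, the figures above can be turned into per-dialogue averages. The following is a minimal sketch using only the counts stated in the description; the `SplitStats` class and its method names are hypothetical helpers introduced for illustration and are not part of any STAC tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitStats:
    """Counts for one corpus split (hypothetical helper, not a STAC API)."""
    dialogues: int
    edus: int
    relations: int

    def edus_per_dialogue(self) -> float:
        # Average number of elementary discourse units per dialogue.
        return self.edus / self.dialogues

    def relations_per_dialogue(self) -> float:
        # Average number of annotated discourse relations per dialogue.
        return self.relations / self.dialogues


# Counts copied from the dataset description above.
TRAIN = SplitStats(dialogues=1062, edus=11711, relations=11350)
TEST = SplitStats(dialogues=111, edus=1156, relations=1126)

for name, split in (("train", TRAIN), ("test", TEST)):
    print(
        f"{name}: {split.edus_per_dialogue():.2f} EDUs/dialogue, "
        f"{split.relations_per_dialogue():.2f} relations/dialogue"
    )
```

Both splits average roughly ten to eleven EDUs per dialogue, with slightly fewer relations than EDUs in each.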



