ConvSumX
Source: arXiv · Updated: 2023-07-09 · Indexed: 2024-06-21
Download link:
https://github.com/cylnlp/ConvSumX
Resource overview:
ConvSumX is a cross-lingual dialogue summarization benchmark proposed by Sichuan Lanqiao Information Technology Co., Ltd. It aims to improve the faithfulness of cross-lingual summaries through a new annotation scheme that takes the context of the source input into account. The dataset contains two subtasks, one based on daily dialogue summarization and one on query-based summarization, and covers three language directions: English to Chinese, English to French, and English to Ukrainian. During dataset creation, annotators consider both the source-language text and the pre-annotated summaries to ensure cross-lingual consistency, appropriate language style, and accurate terminology. ConvSumX is intended for both academic research and industrial practice, and is particularly useful for dialogue summarization involving multilingual and low-resource languages.
Provider:
Sichuan Lanqiao Information Technology Co., Ltd.
Created:
2023-07-09



