SEAHORSE
Source: arXiv | Updated: 2023-11-02 | Indexed: 2024-06-21
Download link:
https://goo.gle/seahorse
Resource description:
SEAHORSE is a dataset created by Google DeepMind for multilingual, multifaceted evaluation of summarization systems. It contains 96,645 summaries in 6 languages, produced by 9 systems (including the reference texts) and drawn from 4 summarization datasets. Each summary carries human ratings along 6 dimensions: comprehensibility, repetition, grammar, attribution, main ideas, and conciseness. SEAHORSE serves both as a benchmark for evaluating learned metrics and as a large-scale resource for training them, addressing the difficulty of evaluating automatic summarization systems, particularly in multilingual settings.
Provider:
Google DeepMind
Created:
2023-05-23



