Dolphin

Source: arXiv | Updated: 2023-10-25 | Indexed: 2024-06-21
Download link:
https://dolphin.dlnlp.ai/
Overview:
The Dolphin dataset was created by the Deep Learning and Natural Language Processing Group at the University of British Columbia to provide a comprehensive evaluation framework for Arabic natural language generation (NLG). It brings together 40 public datasets covering 13 distinct NLG tasks, such as dialogue generation, question answering, machine translation, and summarization, for a total of 50 test splits. Dolphin is curated to reflect real-world scenarios and the linguistic diversity of Arabic, and is intended to set a new standard for evaluating the performance and generalization of Arabic and multilingual models. Its application domains include entertainment, education, and healthcare, and it aims to push the boundaries of current methods, foster healthy competition, and promote research transparency.
Provided by:
Deep Learning and Natural Language Processing Group, University of British Columbia
Created:
2023-05-24