XQuAD-ca

Indexed from: NIAID Data Ecosystem, 2026-03-13
Download link:
https://zenodo.org/record/4526223
Description:
If you use this resource in your work, please cite our latest paper:

@inproceedings{armengol-estape-etal-2021-multilingual,
    title = "Are Multilingual Models the Best Choice for Moderately Under-resourced Languages? {A} Comprehensive Assessment for {C}atalan",
    author = "Armengol-Estap{\'e}, Jordi and
      Carrino, Casimiro Pio and
      Rodriguez-Penagos, Carlos and
      de Gibert Bonet, Ona and
      Armentano-Oller, Carme and
      Gonzalez-Agirre, Aitor and
      Melero, Maite and
      Villegas, Marta",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.437",
    doi = "10.18653/v1/2021.findings-acl.437",
    pages = "4933--4946",
}

XQuAD-ca is a professional translation into Catalan of the XQuAD dataset (https://github.com/deepmind/xquad).

XQuAD (Cross-lingual Question Answering Dataset) is a benchmark dataset for evaluating cross-lingual question answering performance. It consists of a subset of 240 paragraphs and 1190 question-answer pairs from the development set of SQuAD v1.1 (Rajpurkar et al., 2016), together with their professional translations into ten languages: Spanish, German, Greek, Russian, Turkish, Arabic, Vietnamese, Thai, Chinese, and Hindi. Romanian was added later. We added Catalan as the 13th language of the corpus, also using professional native Catalan translators.

The XQuAD and XQuAD-ca datasets are released under the CC BY-SA licence.
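As an illustration of the figures quoted above, the following is a minimal sketch of how a downloaded file could be checked against the stated 240 paragraphs and 1190 question-answer pairs. It assumes the release follows the SQuAD v1.1 JSON layout (data, paragraphs, context, qas, answers, answer_start), which this description does not confirm. The file name in the usage note and the helper names are hypothetical, and the tiny in-memory sample is invented purely to show the structure.

```python
import json
from pathlib import Path
from typing import Tuple


def load_squad_file(path: str) -> dict:
    """Load a SQuAD-style JSON file (assumed layout, see lead-in)."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def count_squad_items(dataset: dict) -> Tuple[int, int]:
    """Return (number of paragraphs, number of question-answer pairs)."""
    paragraphs = 0
    qa_pairs = 0
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            paragraphs += 1
            qa_pairs += len(paragraph["qas"])
    return paragraphs, qa_pairs


def answers_aligned(dataset: dict) -> bool:
    """Check that each answer text sits at its stated character offset."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa["answers"]:
                    start = answer["answer_start"]
                    end = start + len(answer["text"])
                    if context[start:end] != answer["text"]:
                        return False
    return True


# Invented sample in the assumed SQuAD v1.1 layout (not taken from XQuAD-ca).
_context = "Barcelona és la capital de Catalunya."
sample = {
    "version": "1.1",
    "data": [
        {
            "title": "Exemple",
            "paragraphs": [
                {
                    "context": _context,
                    "qas": [
                        {
                            "id": "q1",
                            "question": "Quina ciutat és la capital de Catalunya?",
                            "answers": [
                                {"text": "Barcelona",
                                 "answer_start": _context.index("Barcelona")}
                            ],
                        },
                        {
                            "id": "q2",
                            "question": "De què és capital Barcelona?",
                            "answers": [
                                {"text": "Catalunya",
                                 "answer_start": _context.index("Catalunya")}
                            ],
                        },
                    ],
                }
            ],
        }
    ],
}

sample_counts = count_squad_items(sample)
sample_ok = answers_aligned(sample)
```

With the real file, one would call count_squad_items(load_squad_file("xquad-ca.json")), where "xquad-ca.json" is a placeholder for whatever the Zenodo archive actually contains, and compare the result with (240, 1190).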
Created:
2022-06-20