CoQCat
Source: Zenodo. Updated: 2024-02-01. Indexed: 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.10362295
Description:
CoQCat is a dataset for Conversational Question Answering in Catalan. It is based on the CoQA dataset.
CoQCat comprises 89,364 question-answer pairs, sourced from conversations related to 6,000 text passages from six different domains.
The questions and responses are designed to maintain a conversational tone.
The answers are presented in a free-form text format, with evidence highlighted from the passage.
For the development and test sets, two additional responses were collected for each question.
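As a rough illustration of the structure described above, the following Python sketch models one conversational turn with a free-form answer and its highlighted evidence span. The class name `Turn`, its field names, and both helper functions are hypothetical and are not taken from the published dataset files; the passage and question are an invented toy example, not CoQCat data. Only the totals (89,364 pairs, 6,000 passages) come from the description.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One conversational turn (hypothetical field names, not the real schema)."""
    question: str
    answer: str    # free-form answer text
    evidence: str  # span highlighted in the passage as support


def evidence_is_grounded(passage: str, turn: Turn) -> bool:
    """Return True if the evidence span occurs verbatim in the passage."""
    return turn.evidence in passage


def avg_pairs_per_passage(pairs: int, passages: int) -> float:
    """Average number of question-answer pairs per passage."""
    return pairs / passages


# Invented toy example for illustration only; not taken from CoQCat.
passage = "La Maria viu a Girona amb el seu gat."
turn = Turn(question="On viu la Maria?", answer="A Girona", evidence="viu a Girona")

print(evidence_is_grounded(passage, turn))
print(round(avg_pairs_per_passage(89_364, 6_000), 2))
```

With the figures quoted in the description, the average works out to a little under 15 question-answer pairs per passage.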
This work is licensed under a CC BY-NC-ND 4.0 International License.
In this repository you'll find the following items:
dataset: folder with the dataset as published in HuggingFace
reports: folder with the reports given as feedback to the annotators during the annotation process
stats: folder with some statistics about each batch, taken into account to write the reports
coqcat_guidelines.pdf: the guidelines provided to the annotation team
This work was funded by the Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya within the framework of Projecte AINA.
Thanks to M. Carme Marí and the VilaWeb team for allowing us to use their texts, and to all the Catalan Wikipedia and Project Gutenberg volunteers for all their work.
Provider:
Zenodo
Created:
2023-12-12



