
finbenchv2-squad_v2-fi-mt

ModelScope community · Updated 2025-08-15 · Indexed 2025-08-09
Download link:
https://modelscope.cn/datasets/TurkuNLP/finbenchv2-squad_v2-fi-mt
Resource description:
### Dataset Summary

This is a Finnish SQuAD question answering dataset used in Finbench version 2. It is a DeepL-based machine translation of the English SQuAD2.0 dataset, which combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.

### Considerations for Using the Data

Due to DeepL terms and conditions, this dataset **must not be used for any machine translation work**, namely machine translation system development and evaluation of any kind. In general, we ask that you not pair the original English data with the translations, except when working on research unrelated to machine translation, so as not to infringe on the terms and conditions.

### Licensing Information

Contents of this repository are distributed under the [Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0)](https://creativecommons.org/licenses/by-sa/4.0/). Copyright of the dataset contents belongs to the original copyright holders.
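### Usage Sketch (illustrative)

The Dataset Summary notes that systems must abstain when the paragraph supports no answer. The following is a minimal sketch of how that requirement is commonly checked, assuming records follow the usual SQuAD2.0 layout in which an unanswerable question has an empty `answers["text"]` list. The field names, the toy Finnish records, and the helpers `is_answerable` and `exact_match` are assumptions for illustration; they are not confirmed against this dataset's actual schema.

```python
from typing import Optional

# Hypothetical toy records in the common SQuAD2.0 layout.
# Field names are assumed; verify them against the real dataset before use.
EXAMPLES = [
    {
        "id": "a1",
        "context": "Helsinki on Suomen pääkaupunki.",
        "question": "Mikä on Suomen pääkaupunki?",
        "answers": {"text": ["Helsinki"], "answer_start": [0]},
    },
    {
        "id": "u1",
        "context": "Helsinki on Suomen pääkaupunki.",
        "question": "Mikä on Ruotsin pääkaupunki?",
        # Unanswerable: no gold answer is supported by the context.
        "answers": {"text": [], "answer_start": []},
    },
]


def is_answerable(example: dict) -> bool:
    """Return True if the record carries at least one gold answer."""
    return len(example["answers"]["text"]) > 0


def exact_match(example: dict, prediction: Optional[str]) -> bool:
    """Score a prediction; None means the system abstained."""
    gold = example["answers"]["text"]
    if not gold:
        # For unanswerable questions, only abstaining counts as correct.
        return prediction is None
    if prediction is None:
        return False
    normalized = prediction.strip().lower()
    return normalized in {g.strip().lower() for g in gold}
```

In this sketch a system that always produces a span is marked wrong on every unanswerable question, which is the behavior the adversarial questions are meant to expose.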

Provider:
maas
Created:
2025-08-08