finbenchv2-squad_v2-fi-mt
Source: ModelScope community. Updated 2025-08-15; indexed 2025-08-09.
Download link:
https://modelscope.cn/datasets/TurkuNLP/finbenchv2-squad_v2-fi-mt
Resource description:
### Dataset Summary
This is the Finnish SQuAD question-answering dataset used in Finbench version 2. It is a DeepL-based machine translation of the English SQuAD2.0 dataset, which combines the 100,000 questions in
SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.
To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported
by the paragraph and abstain from answering.
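
The abstention requirement can be illustrated with a minimal sketch. It assumes the records follow the common SQuAD2.0 layout, in which an unanswerable question carries an empty `answers["text"]` list; the field names of this particular translation are not confirmed by the card, and the two Finnish records below are hypothetical examples, not taken from the dataset. The scoring function is a simplified exact match without the text normalization that the official SQuAD evaluation applies.

```python
from typing import Dict, List

# Assumed record layout (not confirmed by the dataset card):
# {"question": str, "context": str,
#  "answers": {"text": List[str], "answer_start": List[int]}}
Example = Dict[str, object]


def is_answerable(example: Example) -> bool:
    """Return True when the example carries at least one gold answer."""
    return len(example["answers"]["text"]) > 0


def exact_match(prediction: str, example: Example) -> bool:
    """Simplified exact match; an empty prediction means the system abstained."""
    gold: List[str] = example["answers"]["text"]
    if not gold:
        # Unanswerable question: only abstaining counts as correct.
        return prediction == ""
    return prediction in gold


# Hypothetical toy records, for illustration only.
examples = [
    {
        "question": "Mikä on Suomen pääkaupunki?",
        "context": "Helsinki on Suomen pääkaupunki.",
        "answers": {"text": ["Helsinki"], "answer_start": [0]},
    },
    {
        "question": "Mikä on Ruotsin pääkaupunki?",
        "context": "Helsinki on Suomen pääkaupunki.",
        "answers": {"text": [], "answer_start": []},
    },
]

answerable = [ex for ex in examples if is_answerable(ex)]
unanswerable = [ex for ex in examples if not is_answerable(ex)]
```

Under these assumptions, a system that always returns some span from the context scores zero on the unanswerable subset, which is why the two subsets are usually reported separately.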
### Considerations for Using the Data
Due to DeepL's terms and conditions, this dataset **must not be used for any machine translation work**, namely the development or
evaluation of machine translation systems of any kind. More generally, we ask that you not pair the original English data with the translations
except when working on research unrelated to machine translation, so as not to infringe on those terms and conditions.
### Licensing Information
Contents of this repository are distributed under the
[Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0)](https://creativecommons.org/licenses/by-sa/4.0/).
Copyright of the dataset contents belongs to the original copyright holders.
Provider: maas
Created: 2025-08-08



