TUKE-DeutscheTelekom/skquad
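Dataset Card for skquad
Dataset Description
Dataset Summary
SK-QuAD is the first question-answering dataset for the Slovak language. It is manually annotated, so it has no distortion caused by machine translation. The dataset is thematically diverse and does not overlap with SQuAD, so it brings new knowledge. It passed a second round of annotation: each question and answer was reviewed by at least two annotators.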
Supported Tasks and Leaderboards
- Question answering
- Document retrieval
Languages
- Slovak
Dataset Structure
Data Instances
squad_v2
- Size of downloaded dataset files: 44.34 MB
- Size of the generated dataset: 122.57 MB
- Total amount of disk used: 166.91 MB
An example from the validation set:
```json
{
  "answers": {
    "answer_start": [94, 87, 94, 94],
    "text": ["10th and 11th centuries", "in the 10th and 11th centuries", "10th and 11th centuries", "10th and 11th centuries"]
  },
  "context": "The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th and 11th centuries gave their name to Normandy, a region in France. They were descended from Norse (\"Norman\" comes from \"Norseman\") raiders and pirates from Denmark, Iceland, and Norway who, under their leader Rollo, agreed to swear fealty to King Charles III of West Francia. Through generations, they became descendants of both the Norse raiders and the Gallo-Romance inhabitants of the region they settled. Their chief stronghold was the fortified town of Rouen, which allowed them to control the lower Seine valley economically and militarily. They consolidated their French territories as the Duchy of Normandy in the 10th and 11th centuries, and adopted the Roman Catholic faith from the Frankish rulers.",
  "id": "56ddde6b9a695914005b9629",
  "question": "When were the Normans in Normandy?",
  "title": "Normans"
}
```
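Data Fields
squad_v2
- id: a string feature.
- title: a string feature.
- context: a string feature.
- question: a string feature.
- answers: a dictionary feature containing:
  - text: a string feature.
  - answer_start: an integer feature.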
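The relationship between `context`, `answers.text`, and `answers.answer_start` can be checked with a few lines of Python. The sketch below assumes that `answer_start` is a character offset into `context`, as in the SQuAD format; the record and the helper name `misaligned_answers` are hypothetical and used only for illustration.

```python
from typing import Any, Dict, List


def misaligned_answers(record: Dict[str, Any]) -> List[int]:
    """Return the indices of answers whose text is not found at answer_start.

    Assumes answer_start is a character offset into record["context"].
    """
    context = record["context"]
    answers = record["answers"]
    bad = []
    for i, (text, start) in enumerate(zip(answers["text"], answers["answer_start"])):
        # Slice the context at the stated offset and compare with the answer text.
        if context[start:start + len(text)] != text:
            bad.append(i)
    return bad


# Hypothetical record that follows the field layout listed above.
example = {
    "id": "example-0",
    "title": "Normans",
    "context": "The Normans gave their name to Normandy, a region in France.",
    "question": "Where is Normandy?",
    "answers": {
        "text": ["France", "a region in France"],
        "answer_start": [53, 41],
    },
}

print(misaligned_answers(example))
```

An empty list means every answer span lines up with the context. A check like this is a cheap way to catch offset drift after any text normalization.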
Data Splits
| | Train | Dev | Translated |
|---|---|---|---|
| Documents | 8,377 | 940 | 442 |
| Paragraphs | 22,062 | 2,568 | 18,931 |
| Questions | 81,582 | 9,583 | 120,239 |
| Answers | 65,839 | 7,822 | 79,978 |
| Unanswerable | 15,877 | 1,784 | 40,261 |
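To compare the columns, it can help to look at the share of unanswerable questions in each one. The following is a small sketch that uses only the counts from the table above; the variable and function names are illustrative.

```python
# Question and unanswerable counts copied from the table above.
splits = {
    "Train": {"questions": 81_582, "unanswerable": 15_877},
    "Dev": {"questions": 9_583, "unanswerable": 1_784},
    "Translated": {"questions": 120_239, "unanswerable": 40_261},
}


def unanswerable_share(counts: dict) -> float:
    """Fraction of questions in a column that are marked unanswerable."""
    return counts["unanswerable"] / counts["questions"]


shares = {name: unanswerable_share(counts) for name, counts in splits.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} unanswerable")
```

By these counts, slightly under a fifth of the questions in the Train and Dev columns are unanswerable, against roughly a third in the Translated column.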
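Dataset Creation
Curation Rationale
[More Information Needed]
Source Data
Initial Data Collection and Normalization
[More Information Needed]
Who are the source language producers?
[More Information Needed]
Annotations
Annotation process
[More Information Needed]
Who are the annotators?
[More Information Needed]
Personal and Sensitive Information
[More Information Needed]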
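Considerations for Using the Data
Social Impact of Dataset
[More Information Needed]
Discussion of Biases
[More Information Needed]
Other Known Limitations
[More Information Needed]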
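Additional Information
Dataset Curators
- Deutsche Telekom Systems Solutions Slovakia
- Technical University of Košice
Licensing Information
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Citation Information
[More Information Needed]
Contributions
Thanks to @github-username for adding this dataset.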



