community-datasets/doqa
Dataset Card for "doqa"
Dataset Structure
Data Instances
cooking
- Size of downloaded dataset files: 4.19 MB
- Size of the generated dataset: 11.31 MB
- Total amount of disk used: 15.51 MB
An example from the train split (JSON):
{
  "answers": { "answer_start": [852], "text": ["CANNOTANSWER"] },
  "background": "\"So, over mixing batter forms gluten, which in turn hardens the cake. Fine.The problem is that I dont want lumps in the cakes, ...",
  "context": "\"Milk wont help you - its mostly water, and gluten develops from flour (more accurately, specific proteins in flour) and water...",
  "followup": "n",
  "id": "C_64ce44d5f14347f488eb04b50387f022_q#2",
  "orig_answer": { "answer_start": [852], "text": ["CANNOTANSWER"] },
  "question": "Ok. What can I add to make it more softer and avoid hardening?",
  "title": "What to add to the batter of the cake to avoid hardening when the gluten formation cant be avoided?",
  "yesno": "x"
}
movies
- Size of downloaded dataset files: 4.19 MB
- Size of the generated dataset: 3.17 MB
- Total amount of disk used: 7.36 MB
An example from the test split (JSON):
{
  "answers": { "answer_start": [852], "text": ["CANNOTANSWER"] },
  "background": "\"So, over mixing batter forms gluten, which in turn hardens the cake. Fine.The problem is that I dont want lumps in the cakes, ...",
  "context": "\"Milk wont help you - its mostly water, and gluten develops from flour (more accurately, specific proteins in flour) and water...",
  "followup": "n",
  "id": "C_64ce44d5f14347f488eb04b50387f022_q#2",
  "orig_answer": { "answer_start": [852], "text": ["CANNOTANSWER"] },
  "question": "Ok. What can I add to make it more softer and avoid hardening?",
  "title": "What to add to the batter of the cake to avoid hardening when the gluten formation cant be avoided?",
  "yesno": "x"
}
travel
- Size of downloaded dataset files: 4.19 MB
- Size of the generated dataset: 3.22 MB
- Total amount of disk used: 7.41 MB
An example from the test split (JSON):
{
  "answers": { "answer_start": [852], "text": ["CANNOTANSWER"] },
  "background": "\"So, over mixing batter forms gluten, which in turn hardens the cake. Fine.The problem is that I dont want lumps in the cakes, ...",
  "context": "\"Milk wont help you - its mostly water, and gluten develops from flour (more accurately, specific proteins in flour) and water...",
  "followup": "n",
  "id": "C_64ce44d5f14347f488eb04b50387f022_q#2",
  "orig_answer": { "answer_start": [852], "text": ["CANNOTANSWER"] },
  "question": "Ok. What can I add to make it more softer and avoid hardening?",
  "title": "What to add to the batter of the cake to avoid hardening when the gluten formation cant be avoided?",
  "yesno": "x"
}
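The sample record above carries the string "CANNOTANSWER" as its only answer text. A minimal sketch of how such records might be flagged follows; it assumes that this string marks a question with no answer in the context, which the card itself does not state. The helper name `is_unanswerable` and the abridged `sample` dictionary are illustrative only.

```python
from typing import Any, Dict

# Sentinel string copied from the sample record; treating it as an
# "unanswerable" marker is an assumption, not something the card states.
CANNOT_ANSWER = "CANNOTANSWER"


def is_unanswerable(example: Dict[str, Any]) -> bool:
    """Return True when every reference answer text equals the sentinel."""
    texts = example["answers"]["text"]
    return len(texts) > 0 and all(t == CANNOT_ANSWER for t in texts)


# Abridged copy of the sample record (long text fields omitted).
sample = {
    "answers": {"answer_start": [852], "text": ["CANNOTANSWER"]},
    "followup": "n",
    "id": "C_64ce44d5f14347f488eb04b50387f022_q#2",
    "question": "Ok. What can I add to make it more softer and avoid hardening?",
    "yesno": "x",
}

print(is_unanswerable(sample))  # True
```

The check reads only `answers`; `orig_answer` has the same shape in the sample and could be tested the same way.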
Data Fields
The data fields are the same among all splits.
cooking
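- title: a string feature.
- background: a string feature.
- context: a string feature.
- question: a string feature.
- id: a string feature.
- answers: a dictionary feature containing:
  - text: a string feature.
  - answer_start: an integer feature.
- followup: a string feature.
- yesno: a string feature.
- orig_answer: a dictionary feature containing:
  - text: a string feature.
  - answer_start: an integer feature.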
movies
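- title: a string feature.
- background: a string feature.
- context: a string feature.
- question: a string feature.
- id: a string feature.
- answers: a dictionary feature containing:
  - text: a string feature.
  - answer_start: an integer feature.
- followup: a string feature.
- yesno: a string feature.
- orig_answer: a dictionary feature containing:
  - text: a string feature.
  - answer_start: an integer feature.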
travel
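- title: a string feature.
- background: a string feature.
- context: a string feature.
- question: a string feature.
- id: a string feature.
- answers: a dictionary feature containing:
  - text: a string feature.
  - answer_start: an integer feature.
- followup: a string feature.
- yesno: a string feature.
- orig_answer: a dictionary feature containing:
  - text: a string feature.
  - answer_start: an integer feature.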
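The field listing above can be turned into a small structural check. The sketch below is not part of the dataset tooling; `schema_problems` is a hypothetical helper. It assumes that `text` and `answer_start` arrive as lists, as they do in the sample record, even though the field list describes them as single features.

```python
from typing import Any, Dict, List

# Field names taken from the card's field listing.
STRING_FIELDS = ("title", "background", "context", "question", "id",
                 "followup", "yesno")
SPAN_FIELDS = ("answers", "orig_answer")


def schema_problems(record: Dict[str, Any]) -> List[str]:
    """Return human-readable schema problems; an empty list means none found."""
    problems: List[str] = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            problems.append(f"{name}: expected a string")
    for name in SPAN_FIELDS:
        spans = record.get(name)
        if not isinstance(spans, dict):
            problems.append(f"{name}: expected a dictionary")
            continue
        texts = spans.get("text")
        starts = spans.get("answer_start")
        # Assumption: both sub-fields are lists, as in the sample record.
        if not (isinstance(texts, list)
                and all(isinstance(t, str) for t in texts)):
            problems.append(f"{name}.text: expected a list of strings")
        if not (isinstance(starts, list)
                and all(isinstance(s, int) for s in starts)):
            problems.append(f"{name}.answer_start: expected a list of integers")
    return problems
```

Calling `schema_problems(record)` on a row returns an empty list when every listed field is present with the expected type.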
Data Splits
cooking
| name | train | validation | test |
|---|---|---|---|
| cooking | 4612 | 911 | 1797 |
movies
| name | test |
|---|---|
| movies | 1884 |
travel
| name | test |
|---|---|
| travel | 1713 |
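The split tables above can be captured in a small lookup that guards against requesting a split a configuration does not have (only cooking has train and validation splits). This is a sketch, assuming the Hugging Face `datasets` package is installed and that the repository id `community-datasets/doqa` and the configuration names shown on this card are accepted by `load_dataset`. `SPLIT_SIZES`, `config_total`, and `load_split` are illustrative names.

```python
# Example counts copied from the split tables on this card.
SPLIT_SIZES = {
    "cooking": {"train": 4612, "validation": 911, "test": 1797},
    "movies": {"test": 1884},
    "travel": {"test": 1713},
}


def config_total(config: str) -> int:
    """Total number of examples across all splits of one configuration."""
    return sum(SPLIT_SIZES[config].values())


def load_split(config: str, split: str):
    """Load one split, rejecting combinations the card does not list.

    Assumption: the `datasets` package is installed and network access
    is available when this function is actually called.
    """
    if config not in SPLIT_SIZES or split not in SPLIT_SIZES[config]:
        raise ValueError(f"no split {split!r} listed for config {config!r}")
    from datasets import load_dataset  # imported lazily; optional dependency
    return load_dataset("community-datasets/doqa", config, split=split)


print(config_total("cooking"))  # 7320
```

For example, `load_split("movies", "train")` raises `ValueError` before any download is attempted, because the card lists only a test split for movies.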




