LLukas22/nq-simplified
Source: Hugging Face · Updated 2023-04-30 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/LLukas22/nq-simplified
Resource description:
---
license: cc-by-sa-3.0
task_categories:
- question-answering
- sentence-similarity
- feature-extraction
language:
- en
---
# Dataset Card for "nq"
## Table of Contents
- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
- [Additional Information](#additional-information)
  - [Licensing Information](#licensing-information)
## Dataset Description
- **Homepage:** [https://ai.google.com/research/NaturalQuestions](https://ai.google.com/research/NaturalQuestions)
### Dataset Summary
This is a modified version of the original Natural Questions (NQ) dataset for question-answering (QA) tasks. The original is available [here](https://ai.google.com/research/NaturalQuestions).
Each sample was preprocessed into a SQuAD-like format. The context was shortened from an entire Wikipedia article to the passage containing the answer.
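The card does not publish its preprocessing code, so the following is only a minimal sketch of how such a shortening step could work. It assumes that passages are separated by blank lines and that the answer lies entirely inside one passage. The function name `shorten_context` and the toy article text are hypothetical and not taken from the dataset.

```python
from typing import Dict


def shorten_context(article: str, answer_text: str, answer_start: int) -> Dict[str, object]:
    """Keep only the passage that contains the answer and re-base its offset.

    Assumption: passages are separated by a blank line ("\n\n"). The real
    dataset may have used a different passage definition.
    """
    offset = 0  # character position of the current passage within the article
    answer_end = answer_start + len(answer_text)
    for passage in article.split("\n\n"):
        passage_end = offset + len(passage)
        if offset <= answer_start and answer_end <= passage_end:
            return {
                "context": passage,
                "answers": {
                    "text": [answer_text],
                    # The offset is now relative to the passage, not the article.
                    "answer_start": [answer_start - offset],
                },
            }
        offset = passage_end + 2  # skip the "\n\n" separator
    raise ValueError("answer span does not fall inside a single passage")


# Toy article, made up for illustration only.
article = (
    "Intro paragraph.\n\n"
    "The game was hosted at Marlins Park in Miami.\n\n"
    "Closing paragraph."
)
record = shorten_context(article, "Marlins Park", article.index("Marlins Park"))
```

After this step, `record["context"]` holds only the middle passage, and `answer_start` counts characters from the start of that passage.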
## Dataset Structure
### Data Instances
An example from the `train` split looks as follows.
```json
{
  "context": "The 2017 Major League Baseball All - Star Game was the 88th edition of the Major League Baseball All Star Game. The game was",
  "question": "where is the 2017 baseball all-star game being played",
  "answers": {
    "text": ["Marlins Park"],
    "answer_start": [171]
  }
}
```
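In a SQuAD-like record, each `answer_start` is a character offset into `context`, so slicing the context at that offset should return the answer text. The sketch below checks this on a made-up record, because the context printed in the example above is cut short. The helper name `answers_align` is hypothetical.

```python
def answers_align(record: dict) -> bool:
    """Return True if every answer text is found at its stated character offset."""
    context = record["context"]
    texts = record["answers"]["text"]
    starts = record["answers"]["answer_start"]
    if len(texts) != len(starts):
        return False
    return all(
        context[start:start + len(text)] == text
        for text, start in zip(texts, starts)
    )


# Made-up record in the same shape as the example above (not taken from the dataset).
sample = {
    "context": "The game was hosted at Marlins Park in Miami.",
    "question": "where is the 2017 baseball all-star game being played",
    "answers": {"text": ["Marlins Park"], "answer_start": [23]},
}

# Same record with a deliberately wrong offset.
shifted = {**sample, "answers": {"text": ["Marlins Park"], "answer_start": [24]}}

print(answers_align(sample), answers_align(shifted))
```

To run the same check on the real data, one option is to load it with the Hugging Face `datasets` library, for example `load_dataset("LLukas22/nq-simplified")`, and apply `answers_align` to each row. This assumes the repository id from the download link above.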
### Data Fields
The data fields are the same among all splits.
- `question`: a `string` feature.
- `context`: a `string` feature.
- `answers`: a dictionary feature containing:
  - `text`: a `string` feature.
  - `answer_start`: an `int32` feature.
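The field list above can be turned into a lightweight schema check. This sketch assumes, based on the example instance, that `text` and `answer_start` are stored as lists inside `answers`. The function name `schema_problems` is hypothetical and not part of any library.

```python
from typing import Any, List


def schema_problems(record: Any) -> List[str]:
    """Return a list of human-readable schema violations (empty if none)."""
    if not isinstance(record, dict):
        return ["record should be a dictionary"]

    problems: List[str] = []
    for key in ("question", "context"):
        if not isinstance(record.get(key), str):
            problems.append(f"'{key}' should be a string")

    answers = record.get("answers")
    if not isinstance(answers, dict):
        problems.append("'answers' should be a dictionary")
        return problems

    texts = answers.get("text")
    if not (isinstance(texts, list) and all(isinstance(t, str) for t in texts)):
        problems.append("'answers.text' should be a list of strings")

    starts = answers.get("answer_start")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not (
        isinstance(starts, list)
        and all(isinstance(s, int) and not isinstance(s, bool) for s in starts)
    ):
        problems.append("'answers.answer_start' should be a list of integers")

    return problems


# Made-up records for illustration only.
good = {
    "question": "q",
    "context": "c",
    "answers": {"text": ["c"], "answer_start": [0]},
}
bad = {
    "question": "q",
    "context": "c",
    "answers": {"text": ["c"], "answer_start": ["0"]},
}
```

Returning a list of messages instead of raising on the first error makes it easier to report every problem in a record at once.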
## Additional Information
### Licensing Information
This dataset is distributed under the cc-by-sa-3.0 license.
Provider:
LLukas22



