LLukas22/nq-simplified

Name: LLukas22/nq-simplified
Creator: LLukas22
Published: 2023-04-30 20:28:17
License: no description provided

Source: Hugging Face. Updated 2023-04-30. Indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/LLukas22/nq-simplified

Resource description:

---
license: cc-by-sa-3.0
task_categories:
- question-answering
- sentence-similarity
- feature-extraction
language:
- en
---

# Dataset Card for "nq"

## Table of Contents

- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
- [Additional Information](#additional-information)
  - [Licensing Information](#licensing-information)

## Dataset Description

- **Homepage:** [https://ai.google.com/research/NaturalQuestions](https://ai.google.com/research/NaturalQuestions)

### Dataset Summary

This is a modified version of the original Natural Questions (nq) dataset for QA tasks. The original is available [here](https://ai.google.com/research/NaturalQuestions). Each sample was preprocessed into a SQuAD-like format. The context was shortened from an entire Wikipedia article to the passage containing the answer.

## Dataset Structure

### Data Instances

An example of 'train' looks as follows.

```json
{
  "context": "The 2017 Major League Baseball All - Star Game was the 88th edition of the Major League Baseball All Star Game. The game was",
  "question": "where is the 2017 baseball all-star game being played",
  "answers": {
    "text": ["Marlins Park"],
    "answer_start": [171]
  }
}
```

### Data Fields

The data fields are the same among all splits.

- `question`: a `string` feature.
- `context`: a `string` feature.
- `answers`: a dictionary feature containing:
  - `text`: a `string` feature.
  - `answer_start`: an `int32` feature.

## Additional Information

### Licensing Information

This dataset is distributed under the cc-by-sa-3.0 license.

Provider:

LLukas22

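Summary of original information

Dataset overview

Dataset description

Dataset summary

This dataset is a modified version of the original Natural Questions (nq) dataset, intended for question-answering tasks.
Each sample has been preprocessed into a SQuAD-like format, with the passage containing the answer extracted from the full Wikipedia article.

Dataset structure

Data instances

An example record has the following structure:

```json
{ "context": "...", "question": "...", "answers": { "text": ["..."], "answer_start": [...] } }
```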

Data fields

question: string.
context: string.
answers: dictionary containing:
- text: string.
- answer_start: integer.
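
The relationship between `context`, `text`, and `answer_start` can be sketched in Python. This is a minimal illustration, not part of the dataset card: it assumes `answer_start` is a character offset into `context` (the usual SQuAD convention, which the card does not state explicitly), and the helper names `extract_answer_spans` and `is_consistent` as well as the sample record are hypothetical.

```python
from typing import Any, Dict, List


def extract_answer_spans(record: Dict[str, Any]) -> List[str]:
    """Slice each answer out of the context using its start offset.

    Assumes `answer_start` holds character offsets into `context`
    (SQuAD convention; an assumption, not confirmed by the card).
    """
    context = record["context"]
    answers = record["answers"]
    spans = []
    for text, start in zip(answers["text"], answers["answer_start"]):
        # The span has the same length as the stored answer text.
        spans.append(context[start:start + len(text)])
    return spans


def is_consistent(record: Dict[str, Any]) -> bool:
    """Return True if every stored answer matches the text at its offset."""
    return extract_answer_spans(record) == list(record["answers"]["text"])


# Made-up record in the same shape as the dataset's fields (not real data).
example = {
    "context": "The library opens at nine in the morning.",
    "question": "when does the library open",
    "answers": {"text": ["nine"], "answer_start": [21]},
}

print(extract_answer_spans(example))
print(is_consistent(example))
```

A check like `is_consistent` may be useful before training an extractive QA model, since records whose offsets do not line up with the answer text would produce wrong span labels.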

Additional information

Licensing information

This dataset is released under the cc-by-sa-3.0 license.