
LLukas22/nq-simplified

Hugging Face | Updated 2023-04-30 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/LLukas22/nq-simplified
Resource description:
---
license: cc-by-sa-3.0
task_categories:
- question-answering
- sentence-similarity
- feature-extraction
language:
- en
---

# Dataset Card for "nq"

## Table of Contents
- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
- [Additional Information](#additional-information)
  - [Licensing Information](#licensing-information)

## Dataset Description

- **Homepage:** [https://ai.google.com/research/NaturalQuestions](https://ai.google.com/research/NaturalQuestions)

### Dataset Summary

This is a modified version of the original Natural Questions (nq) dataset for question answering (QA) tasks. The original is available [here](https://ai.google.com/research/NaturalQuestions).
Each sample was preprocessed into a SQuAD-like format. The context was shortened from an entire Wikipedia article to the passage containing the answer.

## Dataset Structure

### Data Instances

An example of 'train' looks as follows.

```json
{
    "context": "The 2017 Major League Baseball All - Star Game was the 88th edition of the Major League Baseball All Star Game. The game was",
    "question": "where is the 2017 baseball all-star game being played",
    "answers":
    {
        "text":["Marlins Park"],
        "answer_start":[171]
    }
}
```

### Data Fields

The data fields are the same among all splits.

- `question`: a `string` feature.
- `context`: a `string` feature.
- `answers`: a dictionary feature containing:
  - `text`: a `string` feature.
  - `answer_start`: an `int32` feature.

## Additional Information

### Licensing Information

This dataset is distributed under the cc-by-sa-3.0 license.
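
A card like this is usually consumed through the Hugging Face `datasets` library. The following is a minimal sketch, assuming `datasets` is installed and the repository is reachable under the id shown in the download link. The helper names `load_train_split` and `iter_question_answer` are hypothetical and not part of the dataset or the library.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def load_train_split():
    """Load the 'train' split from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the helper below also works without the library installed.
    from datasets import load_dataset

    # Repository id taken from the download link; 'train' is the split named in the card.
    return load_dataset("LLukas22/nq-simplified", split="train")


def iter_question_answer(
    records: Iterable[Dict[str, Any]],
) -> Iterator[Tuple[str, str]]:
    """Yield (question, first answer text) for every record that has an answer."""
    for record in records:
        texts = record["answers"]["text"]
        if texts:  # skip records with an empty answer list
            yield record["question"], texts[0]
```

Under these assumptions, `iter_question_answer(load_train_split())` would walk the training split one record at a time. The same helper also accepts any plain list of dictionaries that follow the field layout described in the card.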
Provided by:
LLukas22
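Summary of original information

Dataset overview

Dataset description

Dataset summary

  • This dataset is a modified version of the original Natural Questions (nq) dataset, intended for question answering tasks.
  • Each sample has been preprocessed into a SQuAD-like format: the passage containing the answer is extracted from the full Wikipedia article.

Dataset structure

Data instances

  • An example record (JSON) has the following structure: { "context": "...", "question": "...", "answers": { "text": ["..."], "answer_start": [...] } }

Data fields

  • question: string.
  • context: string.
  • answers: dictionary containing:
    • text: string.
    • answer_start: integer.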
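
The relationship between the fields can be checked with a short sketch. It assumes, as in the SQuAD convention, that each `answer_start` value is a character offset into `context`; the card only says the format is SQuAD-like, so treat this as an assumption. The function name `answer_spans_match` and the sample record are hypothetical and not taken from the dataset.

```python
from typing import Any, Dict, List


def answer_spans_match(record: Dict[str, Any]) -> List[bool]:
    """For each answer, report whether context[start:start+len(text)] equals text."""
    context = record["context"]
    answers = record["answers"]
    results = []
    # 'text' and 'answer_start' are parallel lists, one entry per answer.
    for text, start in zip(answers["text"], answers["answer_start"]):
        results.append(context[start:start + len(text)] == text)
    return results


# Hypothetical record for illustration only (not from the dataset).
example = {
    "context": "Paris is the capital of France.",
    "question": "what is the capital of france",
    "answers": {"text": ["Paris"], "answer_start": [0]},
}

print(answer_spans_match(example))
```

A check like this is useful after any preprocessing that trims or re-tokenizes the context, since such steps can shift the offsets.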

Additional information

Licensing information

  • This dataset is distributed under the cc-by-sa-3.0 license.