
Aclairs/ALBERTFINALYEAR

Hugging Face | Updated 2022-03-14 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/Aclairs/ALBERTFINALYEAR
Resource overview:

# AutoNLP Dataset for project: ALBERTFINALYEAR

## Table of contents

- [Dataset Description](#dataset-description)
- [Languages](#languages)
- [Dataset Structure](#dataset-structure)
- [Data Instances](#data-instances)
- [Data Fields](#data-fields)
- [Data Splits](#data-splits)

## Dataset Description

This dataset has been automatically processed by AutoNLP for project ALBERTFINALYEAR.

### Languages

The BCP-47 code for the dataset's language is `unk`.

## Dataset Structure

### Data Instances

A sample from this dataset looks as follows:

```json
[
  {
    "context": "Hasidic or Chasidic Judaism overlaps significantly with Haredi Judaism in its engagement with the se[...]",
    "question": "What overlaps significantly with Haredi Judiasm?",
    "answers.text": [
      "Chasidic Judaism"
    ],
    "answers.answer_start": [
      11
    ]
  },
  {
    "context": "Data compression can be viewed as a special case of data differencing: Data differencing consists of[...]",
    "question": "What can classified as data differencing with empty source data?",
    "answers.text": [
      "Data compression",
      "data compression"
    ],
    "answers.answer_start": [
      0,
      400
    ]
  }
]
```

### Data Fields

The dataset has the following fields (also called "features"):

```json
{
  "context": "Value(dtype='string', id=None)",
  "question": "Value(dtype='string', id=None)",
  "answers.text": "Sequence(feature=Value(dtype='string', id=None), length=-1, id=None)",
  "answers.answer_start": "Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None)"
}
```

### Data Splits

This dataset is split into a train split and a validation split. The split sizes are as follows:

| Split name | Num samples |
| ---------- | ----------- |
| train      | 87433       |
| valid      | 10544       |
Provider:
Aclairs
Summary of original information

AutoNLP Dataset for project: ALBERTFINALYEAR

Dataset description

  • Language: the dataset's language identifier is unk.

Dataset structure

Data instances

A sample from the dataset looks as follows:

    [
      {
        "context": "Hasidic or Chasidic Judaism overlaps significantly with Haredi Judaism in its engagement with the se[...]",
        "question": "What overlaps significantly with Haredi Judiasm?",
        "answers.text": ["Chasidic Judaism"],
        "answers.answer_start": [11]
      },
      {
        "context": "Data compression can be viewed as a special case of data differencing: Data differencing consists of[...]",
        "question": "What can classified as data differencing with empty source data?",
        "answers.text": ["Data compression", "data compression"],
        "answers.answer_start": [0, 400]
      }
    ]
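Each `answers.answer_start` value appears to be a character offset into `context`: in the first sample, offset 11 is where "Chasidic Judaism" begins. The following is a minimal sketch of how that relationship could be checked. The helper name `check_answer_spans` and the status labels are hypothetical, and the context strings are shortened prefixes of the truncated previews shown above, not full records.

```python
def check_answer_spans(record):
    """Return one (answer, start, status) tuple per answer in a flattened record."""
    context = record["context"]
    results = []
    for text, start in zip(record["answers.text"], record["answers.answer_start"]):
        end = start + len(text)
        if end > len(context):
            # The offset points past the text we have, e.g. a truncated preview.
            status = "out_of_range"
        elif context[start:end] == text:
            status = "match"
        else:
            status = "mismatch"
        results.append((text, start, status))
    return results


# Shortened prefixes of the previews above (assumption: offsets count characters).
first = {
    "context": "Hasidic or Chasidic Judaism overlaps significantly with Haredi Judaism",
    "question": "What overlaps significantly with Haredi Judiasm?",
    "answers.text": ["Chasidic Judaism"],
    "answers.answer_start": [11],
}
second = {
    "context": "Data compression can be viewed as a special case of data differencing",
    "question": "What can classified as data differencing with empty source data?",
    "answers.text": ["Data compression", "data compression"],
    "answers.answer_start": [0, 400],
}

print(check_answer_spans(first))
print(check_answer_spans(second))
```

The second answer of the second sample starts at offset 400, which lies beyond the truncated preview, so it can only be verified against the full context.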

Data fields

The dataset contains the following fields:

    {
      "context": "Value(dtype='string', id=None)",
      "question": "Value(dtype='string', id=None)",
      "answers.text": "Sequence(feature=Value(dtype='string', id=None), length=-1, id=None)",
      "answers.answer_start": "Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None)"
    }
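The answer fields are stored as two flat, parallel columns (`answers.text` and `answers.answer_start`) rather than as one nested `answers` object. Some question-answering code expects the nested form, so a conversion along these lines may be useful. This is only a sketch: `to_nested` is a hypothetical helper, and the nested layout is an assumption about what downstream code wants, not something the dataset card specifies.

```python
def to_nested(record):
    """Group the flat 'answers.*' columns of one record under a single 'answers' key."""
    texts = list(record["answers.text"])
    starts = list(record["answers.answer_start"])
    if len(texts) != len(starts):
        raise ValueError("answers.text and answers.answer_start must have equal length")
    return {
        "context": record["context"],
        "question": record["question"],
        "answers": {"text": texts, "answer_start": starts},
    }


flat = {
    "context": "Data compression can be viewed as a special case of data differencing",
    "question": "What can classified as data differencing with empty source data?",
    "answers.text": ["Data compression", "data compression"],
    "answers.answer_start": [0, 400],
}
nested = to_nested(flat)
print(nested["answers"])
```

The length check guards against records where the two parallel lists have drifted apart, since each offset is only meaningful together with its answer text.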

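Data splits

The dataset is divided into a training set and a validation set, with the following sizes:

| Split name | Num samples |
| ---------- | ----------- |
| train      | 87433       |
| valid      | 10544       |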
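To put the split sizes in proportion, the short calculation below derives the total sample count and each split's share from the two numbers above. The variable names are illustrative only.

```python
# Split sizes as listed in the table above.
split_sizes = {"train": 87433, "valid": 10544}

total = sum(split_sizes.values())
fractions = {name: count / total for name, count in split_sizes.items()}

for name, count in split_sizes.items():
    print(f"{name}: {count} samples ({fractions[name]:.1%} of {total})")
```

The two splits sum to 97977 samples, with roughly nine in ten falling in the training split.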