
abaki/autotrain-data-testproject

Source: Hugging Face. Updated: 2023-07-20. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/abaki/autotrain-data-testproject
Resource description:
---
language:
- en
---

# AutoTrain Dataset for project: testproject

## Dataset Description

This dataset has been automatically processed by AutoTrain for project testproject.

### Languages

The BCP-47 code for the dataset's language is en.

## Dataset Structure

### Data Instances

A sample from this dataset looks as follows:

```json
[
  {
    "context": "",
    "question": "Identify which instrument is string or percussion: Vibraslap, Inanga",
    "answers.text": [
      "Vibraslap is percussion, Inanga is string."
    ],
    "answers.answer_start": [
      0
    ],
    "feat_category": [
      "classification"
    ]
  },
  {
    "context": "Crypto AG was a Swiss company specialising in communications and information security founded by Boris Hagelin in 1952. The company was secretly purchased for US $5.75 million and jointly owned by the American Central Intelligence Agency (CIA) and West German Federal Intelligence Service (BND) from 1970 until about 1993, with the CIA continuing as sole owner until about 2018. The mission of breaking encrypted communication using a secretly owned company was known as \"Operation Rubikon\". With headquarters in Steinhausen, the company was a long-established manufacturer of encryption machines and a wide variety of cipher devices.",
    "question": "Is data security an illusion?",
    "answers.text": [
      "The long answer is yes."
    ],
    "answers.answer_start": [
      0
    ],
    "feat_category": [
      "summarization"
    ]
  }
]
```

### Dataset Fields

The dataset has the following fields (also called "features"):

```json
{
  "context": "Value(dtype='string', id=None)",
  "question": "Value(dtype='string', id=None)",
  "answers.text": "Sequence(feature=Value(dtype='string', id=None), length=-1, id=None)",
  "answers.answer_start": "Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None)",
  "feat_category": "Sequence(feature=Value(dtype='string', id=None), length=-1, id=None)"
}
```

### Dataset Splits

This dataset is split into a train and a validation split. The split sizes are as follows:

| Split name | Num samples |
| ---------- | ----------- |
| train      | 15827       |
| valid      | 3957        |
Provider:
abaki
Summary of the original information

AutoTrain Dataset for project: testproject

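Dataset Description

This dataset has been automatically processed by AutoTrain for the project testproject.

Languages

The BCP-47 code for the dataset's language is en.

Dataset Structure

Data Instances

A sample from this dataset looks as follows: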

```json
[
  {
    "context": "",
    "question": "Identify which instrument is string or percussion: Vibraslap, Inanga",
    "answers.text": ["Vibraslap is percussion, Inanga is string."],
    "answers.answer_start": [0],
    "feat_category": ["classification"]
  },
  {
    "context": "Crypto AG was a Swiss company specialising in communications and information security founded by Boris Hagelin in 1952. The company was secretly purchased for US $5.75 million and jointly owned by the American Central Intelligence Agency (CIA) and West German Federal Intelligence Service (BND) from 1970 until about 1993, with the CIA continuing as sole owner until about 2018. The mission of breaking encrypted communication using a secretly owned company was known as \"Operation Rubikon\". With headquarters in Steinhausen, the company was a long-established manufacturer of encryption machines and a wide variety of cipher devices.",
    "question": "Is data security an illusion?",
    "answers.text": ["The long answer is yes."],
    "answers.answer_start": [0],
    "feat_category": ["summarization"]
  }
]
```
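The sample pairs each answer text with an `answers.answer_start` offset. As a minimal sketch, assuming the offset is meant as a character index into `context` (the card does not say so), the check below tests whether each answer actually appears in the context at that position. The helper name `answer_matches_context` is hypothetical, and the second context is shortened to its first sentence for brevity.

```python
def answer_matches_context(record: dict) -> list[bool]:
    """For each answer, report whether its text sits in the context at answer_start."""
    context = record["context"]
    results = []
    for text, start in zip(record["answers.text"], record["answers.answer_start"]):
        # Assumption: answer_start is a character offset into the context string.
        results.append(context[start:start + len(text)] == text)
    return results


samples = [
    {
        "context": "",
        "question": "Identify which instrument is string or percussion: Vibraslap, Inanga",
        "answers.text": ["Vibraslap is percussion, Inanga is string."],
        "answers.answer_start": [0],
        "feat_category": ["classification"],
    },
    {
        # Context abbreviated to its first sentence; the full text is in the sample.
        "context": "Crypto AG was a Swiss company specialising in communications "
                   "and information security founded by Boris Hagelin in 1952.",
        "question": "Is data security an illusion?",
        "answers.text": ["The long answer is yes."],
        "answers.answer_start": [0],
        "feat_category": ["summarization"],
    },
]

alignment = [answer_matches_context(sample) for sample in samples]
print(alignment)
```

For these two records the check fails in both cases, which suggests the answers are free-form text rather than spans copied from the context, so the offset of 0 may simply be a placeholder.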

Dataset Fields

The dataset has the following fields:

```json
{
  "context": "Value(dtype='string', id=None)",
  "question": "Value(dtype='string', id=None)",
  "answers.text": "Sequence(feature=Value(dtype='string', id=None), length=-1, id=None)",
  "answers.answer_start": "Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None)",
  "feat_category": "Sequence(feature=Value(dtype='string', id=None), length=-1, id=None)"
}
```
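The dotted names `answers.text` and `answers.answer_start` look like a nested `answers` feature that was flattened into top-level columns. That reading is an assumption, not something the card states. Under it, the following sketch regroups the dotted keys into a nested dictionary; `nest_dotted_fields` is a hypothetical helper, not part of AutoTrain or any library.

```python
def nest_dotted_fields(record: dict) -> dict:
    """Group keys of the form 'parent.child' under a nested 'parent' dict."""
    nested: dict = {}
    for key, value in record.items():
        parent, sep, child = key.partition(".")
        if sep:
            # Dotted key: place the value under its parent group.
            nested.setdefault(parent, {})[child] = value
        else:
            # Plain key: copy through unchanged.
            nested[key] = value
    return nested


flat_record = {
    "context": "",
    "question": "Identify which instrument is string or percussion: Vibraslap, Inanga",
    "answers.text": ["Vibraslap is percussion, Inanga is string."],
    "answers.answer_start": [0],
    "feat_category": ["classification"],
}

nested_record = nest_dotted_fields(flat_record)
print(nested_record["answers"])
```

The nested form keeps each answer text next to its offset, which may be more convenient for question-answering code that expects a single `answers` object.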

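Dataset Splits

The dataset is split into a train and a validation split. The split sizes are as follows:

| Split name | Num samples |
| ---------- | ----------- |
| train      | 15827       |
| valid      | 3957        |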
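To put the split sizes in proportion, the short sketch below computes each split's share of the total from the two counts listed above. The variable names are illustrative only.

```python
# Split sizes as listed in the dataset card.
split_sizes = {"train": 15827, "valid": 3957}

total = sum(split_sizes.values())
fractions = {name: count / total for name, count in split_sizes.items()}

for name, fraction in fractions.items():
    print(f"{name}: {split_sizes[name]} samples ({fraction:.1%} of {total})")
```

The counts work out to roughly an 80/20 division between training and validation data.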