pratultandon/tokenized-recipe-nlg-gpt2-ingredients-to-recipe-end
Source: Hugging Face. Updated: 2022-12-06. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/pratultandon/tokenized-recipe-nlg-gpt2-ingredients-to-recipe-end
Resource description:
---
dataset_info:
  features:
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  splits:
  - name: train
    num_bytes: 2217334238
    num_examples: 2022671
  - name: test
    num_bytes: 116785866
    num_examples: 106202
  download_size: 749380879
  dataset_size: 2334120104
---
# Dataset Card for "tokenized-recipe-nlg-gpt2-ingredients-to-recipe-end"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
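
Both features are variable-length integer sequences, so rows usually need padding before they can be stacked into a batch. The following is a minimal sketch of that step in plain Python, not something the card itself prescribes. It assumes the rows were tokenized with GPT-2 (suggested only by the dataset name) and reuses GPT-2's end-of-text id, 50256, as the padding id. `pad_batch`, `GPT2_EOS_ID`, and the sample rows are hypothetical names and values introduced for illustration.

```python
from typing import Dict, List

# Assumption: GPT-2 has no dedicated pad token, so its end-of-text id is reused.
GPT2_EOS_ID = 50256


def pad_batch(
    rows: List[Dict[str, List[int]]], pad_id: int = GPT2_EOS_ID
) -> Dict[str, List[List[int]]]:
    """Right-pad input_ids and attention_mask to the longest row in the batch."""
    batch: Dict[str, List[List[int]]] = {"input_ids": [], "attention_mask": []}
    if not rows:
        return batch
    max_len = max(len(row["input_ids"]) for row in rows)
    for row in rows:
        gap = max_len - len(row["input_ids"])
        # Padded positions get pad_id as the token and 0 in the mask.
        batch["input_ids"].append(list(row["input_ids"]) + [pad_id] * gap)
        batch["attention_mask"].append(list(row["attention_mask"]) + [0] * gap)
    return batch


# Hypothetical rows standing in for real dataset examples.
rows = [
    {"input_ids": [10, 11, 12], "attention_mask": [1, 1, 1]},
    {"input_ids": [20], "attention_mask": [1]},
]
batch = pad_batch(rows)
print(batch["input_ids"])
print(batch["attention_mask"])
```

Because the mask is 0 at padded positions, a model that honors `attention_mask` ignores the padding regardless of which id fills it.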
Provided by:
pratultandon
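Summary of original information
Dataset overview
Dataset features
- input_ids
  - Data type: int32
  - Sequence type: sequence
- attention_mask
  - Data type: int8
  - Sequence type: sequence
Dataset splits
- Training set (train)
  - Number of examples: 2022671
  - Storage size: 2217334238 bytes
- Test set (test)
  - Number of examples: 106202
  - Storage size: 116785866 bytes
Dataset size
- Download size: 749380879 bytes
- Total dataset size: 2334120104 bytes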
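
The reported sizes can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers listed above. The variable names are illustrative, and the derived ratios are descriptive figures rather than anything stated by the dataset author.

```python
# Figures copied from the dataset metadata above.
SPLITS = {
    "train": {"num_bytes": 2217334238, "num_examples": 2022671},
    "test": {"num_bytes": 116785866, "num_examples": 106202},
}
DATASET_SIZE = 2334120104
DOWNLOAD_SIZE = 749380879

total_bytes = sum(split["num_bytes"] for split in SPLITS.values())
total_examples = sum(split["num_examples"] for split in SPLITS.values())

# Share of all examples that fall in the test split.
test_fraction = SPLITS["test"]["num_examples"] / total_examples

# Average stored size of one example in each split.
bytes_per_example = {
    name: split["num_bytes"] / split["num_examples"]
    for name, split in SPLITS.items()
}

# Download size relative to the prepared dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"split bytes sum to dataset_size: {total_bytes == DATASET_SIZE}")
print(f"total examples: {total_examples}")
print(f"test fraction: {test_fraction:.4f}")
print(f"download / dataset size: {download_ratio:.3f}")
for name, value in bytes_per_example.items():
    print(f"{name}: {value:.1f} bytes per example")
```

The two split sizes add up exactly to the reported total, the test split holds roughly 5% of the examples, and the download is about a third of the prepared size.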



