yardeny/tokenized_bert_dataset
Source: Hugging Face | Updated: 2023-06-07 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/yardeny/tokenized_bert_dataset
Resource description:
---
dataset_info:
  features:
  - name: input_ids
    sequence: int32
  - name: token_type_ids
    sequence: int8
  - name: attention_mask
    sequence: int8
  - name: special_tokens_mask
    sequence: int8
  splits:
  - name: train
    num_bytes: 23534799613
    num_examples: 80462898
  download_size: 7159489349
  dataset_size: 23534799613
---
# Dataset Card for "tokenized_bert_dataset"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
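
Since the card itself gives no usage instructions, here is a minimal sketch of how the first few examples might be inspected without downloading the whole split. It assumes the third-party `datasets` library is installed and that the repository is still reachable on the Hugging Face Hub; the helper name `peek` and the constant names are hypothetical, introduced only for illustration.

```python
# Repository id and column names are taken from the metadata above.
REPO_ID = "yardeny/tokenized_bert_dataset"
EXPECTED_COLUMNS = (
    "input_ids",
    "token_type_ids",
    "attention_mask",
    "special_tokens_mask",
)


def peek(n=2):
    """Stream the first n training examples (hypothetical helper).

    Streaming avoids fetching the multi-gigabyte download up front.
    Assumes network access and that the repository still exists.
    """
    # Imported lazily so that defining the helper does not require the library.
    from datasets import load_dataset  # third-party: pip install datasets

    stream = load_dataset(REPO_ID, split="train", streaming=True)
    rows = []
    for row in stream.take(n):
        # Keep only the four columns declared in the card metadata.
        rows.append({name: row[name] for name in EXPECTED_COLUMNS})
    return rows
```

Calling `peek(2)` should return two dictionaries whose values are lists of integers. Comparing `len(row["input_ids"])` across rows is a quick way to check whether the examples were padded to a fixed length, which the card does not state.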
Provider:
yardeny
Summary of original information
Dataset overview
Dataset name
tokenized_bert_dataset
Data features
- input_ids
  Type: int32 sequence
- token_type_ids
  Type: int8 sequence
- attention_mask
  Type: int8 sequence
- special_tokens_mask
  Type: int8 sequence
Data splits
- Training set
  Size: 23534799613 bytes
  Number of examples: 80462898
Download size
7159489349 bytes
Total dataset size
23534799613 bytes
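
The raw byte counts above are easier to interpret after a little arithmetic. The sketch below derives the average stored size per example and the ratio of download size to dataset size. The tokens-per-example figure is only a rough upper bound under two assumptions that the card does not confirm: all four sequences in an example have the same length, and storage overhead (such as list offsets) is ignored.

```python
# Figures copied from the dataset metadata above.
NUM_BYTES = 23_534_799_613      # dataset_size (equals the train split num_bytes)
NUM_EXAMPLES = 80_462_898       # train split num_examples
DOWNLOAD_SIZE = 7_159_489_349   # download_size

GIB = 1024 ** 3

# Average stored bytes per example.
bytes_per_example = NUM_BYTES / NUM_EXAMPLES

# Fraction of the dataset size that must be downloaded.
download_ratio = DOWNLOAD_SIZE / NUM_BYTES

# Assumption: one int32 (4 bytes) plus three int8 (1 byte each) per token position.
BYTES_PER_TOKEN = 4 + 1 + 1 + 1

# Rough upper bound on the mean sequence length; it ignores storage overhead.
max_mean_tokens = bytes_per_example / BYTES_PER_TOKEN

print(f"dataset size:       {NUM_BYTES / GIB:.2f} GiB")
print(f"download size:      {DOWNLOAD_SIZE / GIB:.2f} GiB")
print(f"bytes per example:  {bytes_per_example:.1f}")
print(f"download ratio:     {download_ratio:.3f}")
print(f"mean tokens (max):  {max_mean_tokens:.1f}")
```

The download comes to roughly 30% of the dataset size, so disk planning should budget for both the downloaded files and the larger prepared dataset.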



