sogeeking/vqvae_token_big
Hugging Face | Updated 2023-12-05 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/sogeeking/vqvae_token_big
Resource description:
---
dataset_info:
- config_name: Burgers_Sols_Nu0.001
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82800000
    num_examples: 10000
  download_size: 16936560
  dataset_size: 82800000
- config_name: Burgers_Sols_Nu0.002
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82800000
    num_examples: 10000
  download_size: 16914568
  dataset_size: 82800000
- config_name: Burgers_Sols_Nu0.004
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82800000
    num_examples: 10000
  download_size: 16907704
  dataset_size: 82800000
- config_name: Burgers_Sols_Nu0.01
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82790000
    num_examples: 10000
  download_size: 16920292
  dataset_size: 82790000
- config_name: Burgers_Sols_Nu0.02
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82790000
    num_examples: 10000
  download_size: 16932816
  dataset_size: 82790000
- config_name: Burgers_Sols_Nu0.04
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82790000
    num_examples: 10000
  download_size: 16936363
  dataset_size: 82790000
- config_name: Burgers_Sols_Nu0.1
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82780000
    num_examples: 10000
  download_size: 16861353
  dataset_size: 82780000
- config_name: Burgers_Sols_Nu0.2
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82780000
    num_examples: 10000
  download_size: 16397471
  dataset_size: 82780000
- config_name: Burgers_Sols_Nu0.4
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82780000
    num_examples: 10000
  download_size: 13875190
  dataset_size: 82780000
- config_name: Burgers_Sols_Nu1.0
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82780000
    num_examples: 10000
  download_size: 8523808
  dataset_size: 82780000
- config_name: Burgers_Sols_Nu2.0
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82780000
    num_examples: 10000
  download_size: 5054977
  dataset_size: 82780000
- config_name: Burgers_Sols_Nu4.0
  features:
  - name: parameters
    dtype: string
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: mean
    sequence: float32
  - name: std
    sequence: float32
  splits:
  - name: train
    num_bytes: 82780000
    num_examples: 10000
  download_size: 3240425
  dataset_size: 82780000
configs:
- config_name: Burgers_Sols_Nu0.001
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.001/train-*
- config_name: Burgers_Sols_Nu0.002
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.002/train-*
- config_name: Burgers_Sols_Nu0.004
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.004/train-*
- config_name: Burgers_Sols_Nu0.01
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.01/train-*
- config_name: Burgers_Sols_Nu0.02
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.02/train-*
- config_name: Burgers_Sols_Nu0.04
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.04/train-*
- config_name: Burgers_Sols_Nu0.1
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.1/train-*
- config_name: Burgers_Sols_Nu0.2
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.2/train-*
- config_name: Burgers_Sols_Nu0.4
  data_files:
  - split: train
    path: Burgers_Sols_Nu0.4/train-*
- config_name: Burgers_Sols_Nu1.0
  data_files:
  - split: train
    path: Burgers_Sols_Nu1.0/train-*
- config_name: Burgers_Sols_Nu2.0
  data_files:
  - split: train
    path: Burgers_Sols_Nu2.0/train-*
- config_name: Burgers_Sols_Nu4.0
  data_files:
  - split: train
    path: Burgers_Sols_Nu4.0/train-*
---
Provider:
sogeeking
Summary of original information
Dataset overview
Dataset configurations
Burgers_Sols_Nu0.001
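- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82800000
    - Examples: 10000
- Download size: 16936560 bytes
- Dataset size: 82800000 bytes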
Burgers_Sols_Nu0.002
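- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82800000
    - Examples: 10000
- Download size: 16914568 bytes
- Dataset size: 82800000 bytes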
Burgers_Sols_Nu0.004
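- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82800000
    - Examples: 10000
- Download size: 16907704 bytes
- Dataset size: 82800000 bytes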
Burgers_Sols_Nu0.01
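- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82790000
    - Examples: 10000
- Download size: 16920292 bytes
- Dataset size: 82790000 bytes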
Burgers_Sols_Nu0.02
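- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82790000
    - Examples: 10000
- Download size: 16932816 bytes
- Dataset size: 82790000 bytes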
Burgers_Sols_Nu0.04
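- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82790000
    - Examples: 10000
- Download size: 16936363 bytes
- Dataset size: 82790000 bytes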
Burgers_Sols_Nu0.1
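- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82780000
    - Examples: 10000
- Download size: 16861353 bytes
- Dataset size: 82780000 bytes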
Burgers_Sols_Nu0.2
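- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82780000
    - Examples: 10000
- Download size: 16397471 bytes
- Dataset size: 82780000 bytes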
Burgers_Sols_Nu0.4
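- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82780000
    - Examples: 10000
- Download size: 13875190 bytes
- Dataset size: 82780000 bytes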
Burgers_Sols_Nu1.0
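- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82780000
    - Examples: 10000
- Download size: 8523808 bytes
- Dataset size: 82780000 bytes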
Burgers_Sols_Nu2.0
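- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82780000
    - Examples: 10000
- Download size: 5054977 bytes
- Dataset size: 82780000 bytes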
Burgers_Sols_Nu4.0
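- Features:
  - parameters: string
  - input_ids: sequence of integers (int32)
  - attention_mask: sequence of integers (int8)
  - mean: sequence of floats (float32)
  - std: sequence of floats (float32)
- Splits:
  - train:
    - Bytes: 82780000
    - Examples: 10000
- Download size: 3240425 bytes
- Dataset size: 82780000 bytes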
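The per-configuration figures listed above can be totalled to cross-check the headline numbers quoted for the dataset. The following is a minimal sketch that only does arithmetic on values copied from the metadata; the variable names are illustrative and not part of the dataset.

```python
# (Nu suffix, dataset_size in bytes, download_size in bytes),
# copied from the per-configuration metadata listed above.
SIZES = [
    ("0.001", 82_800_000, 16_936_560),
    ("0.002", 82_800_000, 16_914_568),
    ("0.004", 82_800_000, 16_907_704),
    ("0.01", 82_790_000, 16_920_292),
    ("0.02", 82_790_000, 16_932_816),
    ("0.04", 82_790_000, 16_936_363),
    ("0.1", 82_780_000, 16_861_353),
    ("0.2", 82_780_000, 16_397_471),
    ("0.4", 82_780_000, 13_875_190),
    ("1.0", 82_780_000, 8_523_808),
    ("2.0", 82_780_000, 5_054_977),
    ("4.0", 82_780_000, 3_240_425),
]
EXAMPLES_PER_CONFIG = 10_000  # every configuration reports 10000 train examples

total_rows = EXAMPLES_PER_CONFIG * len(SIZES)
total_dataset_bytes = sum(dataset for _, dataset, _ in SIZES)
total_download_bytes = sum(download for _, _, download in SIZES)

print(f"rows:          {total_rows}")            # 120000
print(f"dataset size:  {total_dataset_bytes}")   # 993450000
print(f"download size: {total_download_bytes}")  # 165501527

# Ratio of reported dataset size to download size, per configuration.
for nu, dataset, download in SIZES:
    print(f"Nu{nu}: {dataset / download:.1f}x")
```

The totals come to 120,000 rows and roughly 166 MB of downloads. The download size shrinks from about 16.9 MB to about 3.2 MB as Nu grows, while the reported dataset size stays near 82.8 MB per configuration.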
Collected summary
Dataset introduction

Background and challenges
Background overview
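This is a text and time-series dataset containing 120,000 rows in total, stored in parquet format, with a combined download size of about 166 MB (the metadata reports about 993 MB as the combined dataset_size). It consists of 12 subsets of 10,000 rows each, one per 'Burgers_Sols_Nu' value from 0.001 to 4.0. The data fields are 'parameters', 'input_ids', 'attention_mask', 'mean', and 'std'. Each subset provides a single train split, so the data is intended for training tasks.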
The above content was collected and summarized by 遇见数据集.
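Each of the twelve subsets is a separate configuration with one train split, so it can be loaded on its own. Below is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable; the helper names `config_name` and `load_config` are hypothetical and introduced only for illustration.

```python
from typing import List

REPO_ID = "sogeeking/vqvae_token_big"

# The twelve Nu values exactly as they appear in the configuration names.
NU_VALUES: List[str] = [
    "0.001", "0.002", "0.004", "0.01", "0.02", "0.04",
    "0.1", "0.2", "0.4", "1.0", "2.0", "4.0",
]


def config_name(nu: str) -> str:
    """Build a configuration name such as 'Burgers_Sols_Nu0.001'."""
    if nu not in NU_VALUES:
        raise ValueError(f"unknown Nu value: {nu!r}")
    return f"Burgers_Sols_Nu{nu}"


def load_config(nu: str):
    """Load the train split of one configuration (requires network access)."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name(nu), split="train")
```

According to the metadata, a call such as `load_config("0.01")` should yield 10,000 rows with the columns `parameters`, `input_ids`, `attention_mask`, `mean`, and `std`. This has not been verified here against the live repository.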



