blenderwang/cellxgene-40m-protein_coding-human_mouse
Source: Hugging Face. Updated 2024-04-04; indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/blenderwang/cellxgene-40m-protein_coding-human_mouse
Resource summary:
---
dataset_info:
  features:
  - name: input_ids
    sequence: uint16
  - name: exprs
    sequence: float32
  - name: length
    dtype: uint16
  - name: species
    dtype:
      class_label:
        names:
          '0': Homo_sapiens
          '1': Mus_musculus
  - name: dataset_id
    dtype: int16
  - name: cell_id
    dtype: int32
  splits:
  - name: train
    num_bytes: 492949247934
    num_examples: 40356216
  download_size: 204510051997
  dataset_size: 492949247934
configs:
- config_name: default
  data_files:
  - path: data/train-*
    split: train
- config_name: subset
  data_files:
  - path:
    - data/train-00000-of-00986.parquet
    - data/train-00001-of-00986.parquet
    - data/train-00002-of-00986.parquet
    - data/train-00003-of-00986.parquet
    - data/train-00004-of-00986.parquet
    - data/train-00005-of-00986.parquet
    - data/train-00006-of-00986.parquet
    - data/train-00007-of-00986.parquet
    - data/train-00008-of-00986.parquet
    - data/train-00009-of-00986.parquet
    - data/train-00010-of-00986.parquet
    - data/train-00011-of-00986.parquet
    - data/train-00012-of-00986.parquet
    - data/train-00013-of-00986.parquet
    - data/train-00014-of-00986.parquet
    - data/train-00015-of-00986.parquet
    - data/train-00016-of-00986.parquet
    - data/train-00017-of-00986.parquet
    - data/train-00018-of-00986.parquet
    - data/train-00019-of-00986.parquet
    - data/train-00020-of-00986.parquet
    - data/train-00021-of-00986.parquet
    - data/train-00022-of-00986.parquet
    - data/train-00023-of-00986.parquet
    - data/train-00024-of-00986.parquet
    - data/train-00025-of-00986.parquet
    - data/train-00026-of-00986.parquet
    - data/train-00027-of-00986.parquet
    - data/train-00028-of-00986.parquet
    - data/train-00029-of-00986.parquet
    - data/train-00030-of-00986.parquet
    - data/train-00031-of-00986.parquet
    - data/train-00032-of-00986.parquet
    - data/train-00033-of-00986.parquet
    - data/train-00034-of-00986.parquet
    - data/train-00035-of-00986.parquet
    - data/train-00036-of-00986.parquet
    - data/train-00037-of-00986.parquet
    - data/train-00038-of-00986.parquet
    - data/train-00039-of-00986.parquet
    - data/train-00040-of-00986.parquet
    - data/train-00041-of-00986.parquet
    - data/train-00042-of-00986.parquet
    - data/train-00043-of-00986.parquet
    - data/train-00044-of-00986.parquet
    - data/train-00045-of-00986.parquet
    - data/train-00046-of-00986.parquet
    - data/train-00047-of-00986.parquet
    - data/train-00048-of-00986.parquet
    - data/train-00049-of-00986.parquet
    split: train
---
Provider:
blenderwang
Summary of original information
Dataset information
Features
- input_ids: sequence of uint16
- exprs: sequence of float32
- length: uint16
- species: class_label, with class names Homo_sapiens and Mus_musculus
- dataset_id: int16
- cell_id: int32
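
The feature list above can be mirrored in a small record type that checks each field against its declared integer range. This is only a sketch: the card does not say how the fields relate to each other, so the check that `input_ids` and `exprs` are aligned one-to-one and that `length` counts their entries is an assumption. The `CellRecord` class and the example values are hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Class-label names exactly as listed in the card.
SPECIES_NAMES = ("Homo_sapiens", "Mus_musculus")

UINT16_MAX = 2**16 - 1
INT16_MIN, INT16_MAX = -(2**15), 2**15 - 1
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


@dataclass
class CellRecord:
    """Hypothetical in-memory view of one row of the train split."""
    input_ids: List[int]   # sequence of uint16
    exprs: List[float]     # sequence of float32
    length: int            # uint16
    species: int           # class_label index (0 or 1)
    dataset_id: int        # int16
    cell_id: int           # int32

    def validate(self) -> None:
        """Raise ValueError if a field falls outside its declared dtype range."""
        if not all(0 <= t <= UINT16_MAX for t in self.input_ids):
            raise ValueError("input_ids must fit in uint16")
        if not 0 <= self.length <= UINT16_MAX:
            raise ValueError("length must fit in uint16")
        if self.species not in range(len(SPECIES_NAMES)):
            raise ValueError("species must be 0 or 1")
        if not INT16_MIN <= self.dataset_id <= INT16_MAX:
            raise ValueError("dataset_id must fit in int16")
        if not INT32_MIN <= self.cell_id <= INT32_MAX:
            raise ValueError("cell_id must fit in int32")
        # ASSUMPTION (not stated in the card): the two sequences are aligned
        # and `length` is the number of entries in each.
        if not (len(self.input_ids) == len(self.exprs) == self.length):
            raise ValueError("input_ids, exprs and length disagree")

    def species_name(self) -> str:
        """Map the class_label index to its name."""
        return SPECIES_NAMES[self.species]

    def pairs(self) -> List[Tuple[int, float]]:
        """Zip ids with expression values (relies on the alignment assumption)."""
        return list(zip(self.input_ids, self.exprs))


# Made-up values for illustration only.
example = CellRecord(
    input_ids=[12, 345, 6789],
    exprs=[1.0, 0.5, 2.25],
    length=3,
    species=1,
    dataset_id=7,
    cell_id=42,
)
example.validate()
print(example.species_name(), example.pairs())
```

If the alignment assumption turns out not to hold for real rows, the last check in `validate` is the one to drop.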
Data splits
- train: 40,356,216 examples, 492,949,247,934 bytes in total
Dataset size
- Download size: 204,510,051,997 bytes
- Dataset size: 492,949,247,934 bytes
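
A few derived figures help when planning storage and bandwidth. The sketch below uses only the numbers reported in the card, plus the shard count of 986 that appears in the parquet file names. It suggests roughly 12.2 kB per example and a download about 41% the size of the prepared dataset. The per-shard and subset estimates assume shards are of roughly equal size, which the card does not state.

```python
# Figures copied from the card.
NUM_EXAMPLES = 40_356_216
DATASET_BYTES = 492_949_247_934
DOWNLOAD_BYTES = 204_510_051_997

# Shard counts taken from the file names (train-XXXXX-of-00986.parquet)
# and from the 50 files listed under the `subset` config.
NUM_SHARDS = 986
SUBSET_SHARDS = 50

bytes_per_example = DATASET_BYTES / NUM_EXAMPLES
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

# ASSUMPTION: shards are roughly uniform in size and row count.
download_per_shard = DOWNLOAD_BYTES / NUM_SHARDS
examples_per_shard = NUM_EXAMPLES / NUM_SHARDS
subset_examples = examples_per_shard * SUBSET_SHARDS
subset_download = download_per_shard * SUBSET_SHARDS

print(f"bytes per example:       {bytes_per_example:,.0f}")
print(f"download / dataset size: {download_ratio:.3f}")
print(f"download per shard:      {download_per_shard / 1e6:,.1f} MB")
print(f"subset examples (est.):  {subset_examples:,.0f}")
print(f"subset download (est.):  {subset_download / 1e9:,.2f} GB")
```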
Configurations
- default:
  - Data file path: data/train-*
  - Split: train
- subset:
  - Data file paths: data/train-00000-of-00986.parquet through data/train-00049-of-00986.parquet
  - Split: train
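
The two configurations differ only in which parquet shards they cover. As a minimal sketch, the helper below regenerates the 50 paths of the `subset` config from the naming pattern, and `stream_subset` shows one plausible way to read that config with the Hugging Face `datasets` library. The helper names `subset_paths` and `stream_subset` are hypothetical. Streaming is a choice made here to avoid downloading every shard up front, not something the card prescribes.

```python
from typing import List

REPO_ID = "blenderwang/cellxgene-40m-protein_coding-human_mouse"
NUM_SHARDS = 986     # from the "-of-00986" suffix in the file names
SUBSET_SHARDS = 50   # number of files listed under the `subset` config


def subset_paths(n: int = SUBSET_SHARDS, total: int = NUM_SHARDS) -> List[str]:
    """Rebuild the shard paths of the `subset` config from the naming pattern."""
    return [f"data/train-{i:05d}-of-{total:05d}.parquet" for i in range(n)]


def stream_subset():
    """Open the `subset` config lazily; needs network access and `datasets`."""
    # Imported here so the path helper works without the third-party package.
    from datasets import load_dataset

    return load_dataset(REPO_ID, "subset", split="train", streaming=True)


if __name__ == "__main__":
    paths = subset_paths()
    print(paths[0], "...", paths[-1])
```

With network access, `next(iter(stream_subset()))` should yield a single record as a dictionary keyed by the feature names. Passing `"default"` instead of `"subset"` selects all shards matched by `data/train-*`.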



