puddleglum/esm_chem_quarter
Source: Hugging Face. Updated 2023-07-11; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/puddleglum/esm_chem_quarter
Description:
---
dataset_info:
  features:
  - name: labels
    sequence: int64
  - name: reactions
    sequence: int64
  - name: highly_masked_sequences
    sequence: int64
  - name: binding_site_masked_sequences
    sequence: int64
  - name: attention_mask
    sequence: int8
  splits:
  - name: train
    num_bytes: 6572644710.888085
    num_examples: 691480
  - name: test
    num_bytes: 1642676818.3613033
    num_examples: 172850
  download_size: 41290028
  dataset_size: 8215321529.249389
---
# Dataset Card for "esm_chem_quarter"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
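
Since the card itself gives no usage instructions, here is a minimal loading sketch. It assumes the Hugging Face `datasets` library is installed and that the repository is still publicly available on the Hub; the helper name `load_esm_chem_quarter` is hypothetical and not part of the dataset.

```python
# Repository ID taken from the page title.
REPO_ID = "puddleglum/esm_chem_quarter"


def load_esm_chem_quarter(split="train", streaming=True):
    """Load one split ("train" or "test") from the Hugging Face Hub.

    Hypothetical convenience wrapper around datasets.load_dataset.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # streaming=True iterates over records without first writing the full
    # Arrow cache to local disk.
    return load_dataset(REPO_ID, split=split, streaming=streaming)
```

Calling `next(iter(load_esm_chem_quarter("test")))` should return one record as a dictionary keyed by the five feature names listed in the metadata. Streaming is chosen here because the declared `dataset_size` is roughly 8.2 GB; pass `streaming=False` if random access is needed.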
Provider:
puddleglum
Summary of original information
Dataset overview
Dataset name
- Name: esm_chem_quarter
Dataset features
- Feature list:
  - labels: sequence of int64
  - reactions: sequence of int64
  - highly_masked_sequences: sequence of int64
  - binding_site_masked_sequences: sequence of int64
  - attention_mask: sequence of int8
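
The feature list above can be turned into a simple record check. The sketch below is plain Python and only tests what the metadata states: that all five fields are present and that every element is an integer within the declared width. The function name `validate_record` and the error messages are hypothetical; the card does not document what the values mean, so no semantic checks are attempted.

```python
# Feature names and declared element types, copied from the card metadata.
SCHEMA = {
    "labels": "int64",
    "reactions": "int64",
    "highly_masked_sequences": "int64",
    "binding_site_masked_sequences": "int64",
    "attention_mask": "int8",
}

# Inclusive value ranges for the two integer widths used above.
INT_RANGES = {
    "int8": (-2**7, 2**7 - 1),
    "int64": (-2**63, 2**63 - 1),
}


def validate_record(record):
    """Return a list of schema violations; an empty list means the record conforms."""
    problems = []
    for name, dtype in SCHEMA.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        low, high = INT_RANGES[dtype]
        for value in record[name]:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"{name}: {value!r} is not an integer")
                break
            if not low <= value <= high:
                problems.append(f"{name}: {value} is outside the {dtype} range")
                break
    return problems
```

For example, a record whose `attention_mask` contains the value `200` would be reported as outside the int8 range, while the same value in any of the int64 fields would pass.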
Dataset splits
- Train:
  - Number of examples: 691480
  - Size: 6572644710.888085 bytes
- Test:
  - Number of examples: 172850
  - Size: 1642676818.3613033 bytes
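
The split proportions are not stated on the page, but they follow directly from the example counts. A short arithmetic sketch, using only the numbers listed above:

```python
# Example counts per split, copied from the card metadata.
SPLITS = {"train": 691480, "test": 172850}

total = sum(SPLITS.values())
fractions = {name: count / total for name, count in SPLITS.items()}

print(f"total: {total} examples")
for name, fraction in fractions.items():
    print(f"{name}: {SPLITS[name]} examples ({fraction:.2%})")
```

This works out to 864330 examples in total, split roughly 80/20 between train and test.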
Dataset size
- Download size: 41290028 bytes
- Total dataset size: 8215321529.249389 bytes
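
The size figures can be cross-checked against each other. The sketch below uses only the published numbers (the non-integer byte counts are reproduced as given) and derives three quantities the page does not state: whether the split sizes add up to the total, the ratio of dataset size to download size, and the average size per example.

```python
# Sizes in bytes and example counts, copied from the card metadata.
DOWNLOAD_SIZE = 41290028
DATASET_SIZE = 8215321529.249389
SPLIT_BYTES = {"train": 6572644710.888085, "test": 1642676818.3613033}
SPLIT_EXAMPLES = {"train": 691480, "test": 172850}

# The per-split sizes should sum to the declared total (up to float rounding).
split_sum = sum(SPLIT_BYTES.values())
sizes_consistent = abs(split_sum - DATASET_SIZE) < 1.0

# How many times larger the prepared dataset is than the download.
expansion_ratio = DATASET_SIZE / DOWNLOAD_SIZE

# Average size of a single example in each split.
bytes_per_example = {
    name: SPLIT_BYTES[name] / SPLIT_EXAMPLES[name] for name in SPLIT_BYTES
}

print(f"split sizes match total: {sizes_consistent}")
print(f"dataset size / download size: {expansion_ratio:.1f}x")
for name, size in bytes_per_example.items():
    print(f"{name}: about {size:.0f} bytes per example")
```

The two split sizes do sum to the declared total, the dataset is roughly 199 times larger than its download, and each example averages about 9.5 kB in both splits.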



