pp2-project4/embeddings_test_ap_fixed
Source: Hugging Face. Updated: 2025-12-12. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/pp2-project4/embeddings_test_ap_fixed
Description:
---
dataset_info:
- config_name: post_training_filtered
  features:
  - name: id
    dtype: string
  - name: embedding
    list:
      list: float64
  - name: sequence
    dtype: string
  splits:
  - name: '0_104'
    num_bytes: 252322314
    num_examples: 104
  - name: '104_208'
    num_bytes: 270306548
    num_examples: 104
  - name: '208_312'
    num_bytes: 268954023
    num_examples: 104
  - name: '312_416'
    num_bytes: 255650300
    num_examples: 104
  - name: '416_520'
    num_bytes: 248207440
    num_examples: 104
  - name: '520_624'
    num_bytes: 234764340
    num_examples: 104
  - name: '624_728'
    num_bytes: 266863808
    num_examples: 104
  - name: '728_765'
    num_bytes: 73462328
    num_examples: 37
  download_size: 441748438
  dataset_size: 1870531101
- config_name: pre_training_matched_to_post_training
  features:
  - name: id
    dtype: string
  - name: embedding
    list:
      list: float64
  - name: sequence
    dtype: string
  splits:
  - name: '0_104'
    num_bytes: 229411471
    num_examples: 104
  - name: '104_208'
    num_bytes: 225681784
    num_examples: 104
  - name: '208_312'
    num_bytes: 206099211
    num_examples: 104
  - name: '312_416'
    num_bytes: 203197481
    num_examples: 104
  - name: '416_520'
    num_bytes: 188516606
    num_examples: 104
  - name: '520_624'
    num_bytes: 199098921
    num_examples: 104
  - name: '624_728'
    num_bytes: 270052153
    num_examples: 104
  - name: '728_765'
    num_bytes: 115471825
    num_examples: 37
  download_size: 386824777
  dataset_size: 1637529452
configs:
- config_name: post_training_filtered
  data_files:
  - split: '0_104'
    path: post_training_filtered/0_104-*
  - split: '104_208'
    path: post_training_filtered/104_208-*
  - split: '208_312'
    path: post_training_filtered/208_312-*
  - split: '312_416'
    path: post_training_filtered/312_416-*
  - split: '416_520'
    path: post_training_filtered/416_520-*
  - split: '520_624'
    path: post_training_filtered/520_624-*
  - split: '624_728'
    path: post_training_filtered/624_728-*
  - split: '728_765'
    path: post_training_filtered/728_765-*
- config_name: pre_training_matched_to_post_training
  data_files:
  - split: '0_104'
    path: pre_training_matched_to_post_training/0_104-*
  - split: '104_208'
    path: pre_training_matched_to_post_training/104_208-*
  - split: '208_312'
    path: pre_training_matched_to_post_training/208_312-*
  - split: '312_416'
    path: pre_training_matched_to_post_training/312_416-*
  - split: '416_520'
    path: pre_training_matched_to_post_training/416_520-*
  - split: '520_624'
    path: pre_training_matched_to_post_training/520_624-*
  - split: '624_728'
    path: pre_training_matched_to_post_training/624_728-*
  - split: '728_765'
    path: pre_training_matched_to_post_training/728_765-*
---
Dataset information:
- Config: post_training_filtered (post-training, filtered)
  Features:
  - id: string
  - embedding: list of lists of float64 (two-dimensional)
  - sequence: string
  Splits:
  - 0_104: 252322314 bytes, 104 examples
  - 104_208: 270306548 bytes, 104 examples
  - 208_312: 268954023 bytes, 104 examples
  - 312_416: 255650300 bytes, 104 examples
  - 416_520: 248207440 bytes, 104 examples
  - 520_624: 234764340 bytes, 104 examples
  - 624_728: 266863808 bytes, 104 examples
  - 728_765: 73462328 bytes, 37 examples
  Download size: 441748438 bytes; dataset size: 1870531101 bytes
- Config: pre_training_matched_to_post_training (pre-training, matched to the post-training filtered config)
  Features:
  - id: string
  - embedding: list of lists of float64 (two-dimensional)
  - sequence: string
  Splits:
  - 0_104: 229411471 bytes, 104 examples
  - 104_208: 225681784 bytes, 104 examples
  - 208_312: 206099211 bytes, 104 examples
  - 312_416: 203197481 bytes, 104 examples
  - 416_520: 188516606 bytes, 104 examples
  - 520_624: 199098921 bytes, 104 examples
  - 624_728: 270052153 bytes, 104 examples
  - 728_765: 115471825 bytes, 37 examples
  Download size: 386824777 bytes; dataset size: 1637529452 bytes
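The split figures listed above can be cross-checked against the declared totals. The following is a minimal Python sketch of such a check, using only the numbers from the dataset card. The helper names (`totals`, `names_match_counts`) are illustrative, and reading each split name as a half-open row range `start_end` is an assumption, although it agrees with the example counts.

```python
# (split name, num_bytes, num_examples), copied from the dataset card metadata.
POST_TRAINING_FILTERED = [
    ("0_104", 252322314, 104),
    ("104_208", 270306548, 104),
    ("208_312", 268954023, 104),
    ("312_416", 255650300, 104),
    ("416_520", 248207440, 104),
    ("520_624", 234764340, 104),
    ("624_728", 266863808, 104),
    ("728_765", 73462328, 37),
]
PRE_TRAINING_MATCHED = [
    ("0_104", 229411471, 104),
    ("104_208", 225681784, 104),
    ("208_312", 206099211, 104),
    ("312_416", 203197481, 104),
    ("416_520", 188516606, 104),
    ("520_624", 199098921, 104),
    ("624_728", 270052153, 104),
    ("728_765", 115471825, 37),
]


def totals(splits):
    """Return (total bytes, total examples) summed over all splits."""
    return sum(b for _, b, _ in splits), sum(n for _, _, n in splits)


def names_match_counts(splits):
    """Check that each 'start_end' name covers exactly num_examples rows and
    that consecutive splits are contiguous (assumed half-open ranges)."""
    expected_start = 0
    for name, _, n in splits:
        start, end = (int(part) for part in name.split("_"))
        if start != expected_start or end - start != n:
            return False
        expected_start = end
    return True


if __name__ == "__main__":
    for label, splits in [
        ("post_training_filtered", POST_TRAINING_FILTERED),
        ("pre_training_matched_to_post_training", PRE_TRAINING_MATCHED),
    ]:
        print(label, totals(splits), names_match_counts(splits))
```

Both configs hold 765 examples across eight splits, and the per-split byte counts add up to the `dataset_size` value declared for each config.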
Configs:
- Config: post_training_filtered, data files:
  - split 0_104, path: post_training_filtered/0_104-*
  - split 104_208, path: post_training_filtered/104_208-*
  - split 208_312, path: post_training_filtered/208_312-*
  - split 312_416, path: post_training_filtered/312_416-*
  - split 416_520, path: post_training_filtered/416_520-*
  - split 520_624, path: post_training_filtered/520_624-*
  - split 624_728, path: post_training_filtered/624_728-*
  - split 728_765, path: post_training_filtered/728_765-*
- Config: pre_training_matched_to_post_training, data files:
  - split 0_104, path: pre_training_matched_to_post_training/0_104-*
  - split 104_208, path: pre_training_matched_to_post_training/104_208-*
  - split 208_312, path: pre_training_matched_to_post_training/208_312-*
  - split 312_416, path: pre_training_matched_to_post_training/312_416-*
  - split 416_520, path: pre_training_matched_to_post_training/416_520-*
  - split 520_624, path: pre_training_matched_to_post_training/520_624-*
  - split 624_728, path: pre_training_matched_to_post_training/624_728-*
  - split 728_765, path: pre_training_matched_to_post_training/728_765-*
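Given the config and split layout above, one plausible way to load a single split is through the Hugging Face `datasets` library. This is a sketch under assumptions, not a documented recipe from the dataset authors: it assumes the repository is publicly readable on the Hub under the ID shown in the page title and that `datasets` is installed. `load_split` and `embedding_shape` are hypothetical helper names.

```python
REPO_ID = "pp2-project4/embeddings_test_ap_fixed"  # from the page title
CONFIGS = ("post_training_filtered", "pre_training_matched_to_post_training")
SPLITS = (
    "0_104", "104_208", "208_312", "312_416",
    "416_520", "520_624", "624_728", "728_765",
)


def load_split(config, split):
    """Load one split of one config (hypothetical helper; needs network access)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    # Imported lazily so the validation above works without the library installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, config, split=split)


def embedding_shape(example):
    """Return (outer length, inner length) of the nested 'embedding' list."""
    emb = example["embedding"]
    return len(emb), (len(emb[0]) if emb else 0)


if __name__ == "__main__":
    # The last split is the smallest one listed (37 examples).
    ds = load_split("post_training_filtered", "728_765")
    first = ds[0]
    print(first["id"], len(first["sequence"]), embedding_shape(first))
```

Each split is several hundred megabytes once decoded, so loading one split at a time, as here, is likely gentler on memory than loading a whole config.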
Provider:
pp2-project4



