rajendrabaskota/hc2-wiki-perplexity-stride-32-maxlen-128
Hugging Face | Updated 2024-02-08 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/rajendrabaskota/hc2-wiki-perplexity-stride-32-maxlen-128
Resource description:
---
dataset_info:
  features:
  - name: prompt
    dtype: string
  - name: text
    dtype: string
  - name: source
    dtype: string
  - name: label
    dtype: int64
  - name: input_ids
    sequence: int32
  - name: attention_mask
    sequence: int8
  - name: perplexity_score
    dtype: float64
  splits:
  - name: train1
    num_bytes: 66550929
    num_examples: 28154
  - name: train2
    num_bytes: 118875881
    num_examples: 50000
  - name: train3
    num_bytes: 119260435
    num_examples: 50000
  - name: test
    num_bytes: 41314265
    num_examples: 17387
  download_size: 182483264
  dataset_size: 346001510
configs:
- config_name: default
  data_files:
  - split: train1
    path: data/train1-*
  - split: train2
    path: data/train2-*
  - split: train3
    path: data/train3-*
  - split: test
    path: data/test-*
---
Provider:
rajendrabaskota
Summary of original information
Dataset information
Features
- prompt: string
- text: string
- source: string
- label: int64
- input_ids: sequence of int32
- attention_mask: sequence of int8
- perplexity_score: float64
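The feature list above can be turned into a small record check. The following is a minimal sketch using only the Python standard library; the function name `check_record` and the sample values in the usage note are hypothetical and not part of the dataset or its tooling. It only checks the declared types and the integer ranges implied by int32 and int8.

```python
from typing import Any, List, Mapping

# Inclusive value ranges implied by the declared integer widths.
INT32_RANGE = (-2**31, 2**31 - 1)
INT8_RANGE = (-2**7, 2**7 - 1)

# Scalar columns and the Python type each one is expected to map to.
SCALAR_TYPES = {
    "prompt": str,
    "text": str,
    "source": str,
    "label": int,
    "perplexity_score": float,
}

# Integer sequence columns and their allowed value ranges.
SEQUENCE_RANGES = {
    "input_ids": INT32_RANGE,
    "attention_mask": INT8_RANGE,
}


def check_record(record: Mapping[str, Any]) -> List[str]:
    """Return a list of schema problems; an empty list means the record fits."""
    problems: List[str] = []
    for name, expected in SCALAR_TYPES.items():
        if name not in record:
            problems.append(f"missing column: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{name}: expected {expected.__name__}")
    for name, (low, high) in SEQUENCE_RANGES.items():
        values = record.get(name)
        if not isinstance(values, (list, tuple)):
            problems.append(f"{name}: expected a sequence of integers")
            continue
        for v in values:
            if isinstance(v, bool) or not isinstance(v, int) or not low <= v <= high:
                problems.append(f"{name}: value outside the declared integer range")
                break
    return problems
```

Calling `check_record` on a dictionary with all seven columns returns an empty list when every value matches its declared type. A record whose `attention_mask` contains 200, for example, would be flagged because 200 does not fit in int8.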
Data splits
- train1: 66550929 bytes, 28154 examples
- train2: 118875881 bytes, 50000 examples
- train3: 119260435 bytes, 50000 examples
- test: 41314265 bytes, 17387 examples
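As a quick consistency check on the split figures above, the sketch below sums the per-split counts and compares the byte total with the dataset size reported in the metadata (346001510 bytes). The names `SPLITS`, `DATASET_SIZE`, and `split_shares` are illustrative only; all numbers are copied from the listing.

```python
# Per-split figures as listed in the dataset metadata.
SPLITS = {
    "train1": {"num_bytes": 66550929, "num_examples": 28154},
    "train2": {"num_bytes": 118875881, "num_examples": 50000},
    "train3": {"num_bytes": 119260435, "num_examples": 50000},
    "test": {"num_bytes": 41314265, "num_examples": 17387},
}

# Dataset size reported in the same metadata, in bytes.
DATASET_SIZE = 346001510

total_examples = sum(s["num_examples"] for s in SPLITS.values())
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())


def split_shares(splits):
    """Return each split's fraction of the total example count."""
    total = sum(s["num_examples"] for s in splits.values())
    return {name: s["num_examples"] / total for name, s in splits.items()}


print(total_examples)               # 145541
print(total_bytes == DATASET_SIZE)  # True
for name, share in split_shares(SPLITS).items():
    print(f"{name}: {share:.1%}")
```

The four splits add up to 145541 examples, and their byte counts sum exactly to the reported dataset size. The test split holds roughly 12% of the examples.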
Data size
- Download size: 182483264 bytes
- Dataset size: 346001510 bytes
Configuration
- config_name: default
- Data files:
  - train1: path data/train1-*
  - train2: path data/train2-*
  - train3: path data/train3-*
  - test: path data/test-*
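Given the configuration above, one plausible way to load a split is through the Hugging Face `datasets` library. This is a sketch under assumptions: it assumes the `datasets` package is installed and that the repository is reachable on the Hugging Face Hub under the same identifier that appears in the mirror URL. The helper `load_split` is hypothetical.

```python
# Repository identifier, taken from the listing title and the mirror URL path.
REPO_ID = "rajendrabaskota/hc2-wiki-perplexity-stride-32-maxlen-128"

# Split names declared in the default configuration.
SPLIT_NAMES = ("train1", "train2", "train3", "test")


def load_split(split: str):
    """Load one declared split; requires the `datasets` package and network access."""
    if split not in SPLIT_NAMES:
        raise ValueError(
            f"unknown split {split!r}; expected one of {', '.join(SPLIT_NAMES)}"
        )
    # Imported lazily so the split-name check works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


# Example usage (downloads data, so it is left commented out):
# test_set = load_split("test")
# print(test_set[0]["perplexity_score"])
```

Note that the training data is divided into `train1`, `train2`, and `train3`; the configuration declares no split named `train`, so a request for `split="train"` would not match any listed split.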



