
AlignmentResearch/EnronSpam-test

Hugging Face, updated 2024-07-26, indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/AlignmentResearch/EnronSpam-test
Resource description:
---
dataset_info:
- config_name: default
  features:
  - name: clf_label
    dtype:
      class_label:
        names:
          '0': ' HAM'
          '1': ' SPAM'
  - name: instructions
    dtype: string
  - name: content
    sequence: string
  - name: answer_prompt
    dtype: string
  - name: proxy_clf_label
    dtype:
      class_label:
        names:
          '0': ' HAM'
          '1': ' SPAM'
  - name: gen_target
    dtype: string
  - name: proxy_gen_target
    dtype: string
  splits:
  - name: train
    num_bytes: 33986682.0
    num_examples: 29341
  - name: validation
    num_bytes: 2127298.0
    num_examples: 1852
  download_size: 18264192
  dataset_size: 36113980.0
- config_name: neg
  features:
  - name: clf_label
    dtype:
      class_label:
        names:
          '0': ' HAM'
          '1': ' SPAM'
  - name: instructions
    dtype: string
  - name: content
    sequence: string
  - name: answer_prompt
    dtype: string
  - name: proxy_clf_label
    dtype:
      class_label:
        names:
          '0': ' HAM'
          '1': ' SPAM'
  - name: gen_target
    dtype: string
  - name: proxy_gen_target
    dtype: string
  splits:
  - name: train
    num_bytes: 16627886.57884871
    num_examples: 14355
  - name: validation
    num_bytes: 1047567.9136069114
    num_examples: 912
  download_size: 8694638
  dataset_size: 17675454.49245562
- config_name: pos
  features:
  - name: clf_label
    dtype:
      class_label:
        names:
          '0': ' HAM'
          '1': ' SPAM'
  - name: instructions
    dtype: string
  - name: content
    sequence: string
  - name: answer_prompt
    dtype: string
  - name: proxy_clf_label
    dtype:
      class_label:
        names:
          '0': ' HAM'
          '1': ' SPAM'
  - name: gen_target
    dtype: string
  - name: proxy_gen_target
    dtype: string
  splits:
  - name: train
    num_bytes: 17358795.42115129
    num_examples: 14986
  - name: validation
    num_bytes: 1079730.0863930886
    num_examples: 940
  download_size: 9177507
  dataset_size: 18438525.50754438
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
- config_name: neg
  data_files:
  - split: train
    path: neg/train-*
  - split: validation
    path: neg/validation-*
- config_name: pos
  data_files:
  - split: train
    path: pos/train-*
  - split: validation
    path: pos/validation-*
---
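
The metadata above names three configs (default, neg, pos) and a two-class label whose names carry a leading space (' HAM', ' SPAM'). The following is a minimal Python sketch of how one might decode those labels and request a config. The helper names `decode_label` and `load_split` are hypothetical and not part of the dataset; `load_split` assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the ID shown in the title.

```python
# Label names copied from the dataset card; each name has a leading space.
CLF_LABEL_NAMES = {0: " HAM", 1: " SPAM"}

# Config names listed in the card's `configs` section.
CONFIG_NAMES = ("default", "neg", "pos")


def decode_label(label_id: int, strip: bool = True) -> str:
    """Map an integer clf_label to its name (hypothetical helper).

    With strip=True the leading space from the card is removed.
    """
    name = CLF_LABEL_NAMES[label_id]
    return name.strip() if strip else name


def load_split(config: str = "default", split: str = "validation"):
    """Load one split of one config (hypothetical helper).

    Assumption: the `datasets` package is installed and the repository
    is available online. The import is lazy so that the rest of this
    sketch works without it.
    """
    if config not in CONFIG_NAMES:
        raise ValueError(f"unknown config: {config!r}")
    from datasets import load_dataset

    return load_dataset("AlignmentResearch/EnronSpam-test", config, split=split)
```

For example, `decode_label(1)` yields the string SPAM, while `decode_label(1, strip=False)` keeps the leading space exactly as stored in the card, which may matter if the label text is compared against `gen_target` values.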
Provided by:
AlignmentResearch
Summary of original information

Dataset overview

Config name: default

  • Features:
    • clf_label: int64
    • instructions: string
    • content: sequence: string
    • answer_prompt: string
    • gen_target: string
  • Splits:
    • train: 29341 examples, 33473530 bytes
    • validation: 1852 examples, 2094902 bytes
  • Download size: 18235776 bytes
  • Dataset size: 35568432.0 bytes

Config name: neg

  • Features:
    • clf_label: int64
    • instructions: string
    • content: sequence: string
    • answer_prompt: string
    • gen_target: string
  • Splits:
    • train: 14355 examples, 16376828.436317781 bytes
    • validation: 912 examples, 1031614.807775378 bytes
  • Download size: 8684176 bytes
  • Dataset size: 17408443.244093157 bytes

Config name: pos

  • Features:
    • clf_label: int64
    • instructions: string
    • content: sequence: string
    • answer_prompt: string
    • gen_target: string
  • Splits:
    • train: 14986 examples, 17096701.56368222 bytes
    • validation: 940 examples, 1063287.192224622 bytes
  • Download size: 9167157 bytes
  • Dataset size: 18159988.755906843 bytes
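
The split figures in the three config summaries can be cross-checked against each other. The sketch below, using only numbers copied from the summary above, tests whether the neg and pos configs together account for the default config. The names `SUMMARY` and `parts_match_default` are hypothetical, introduced here for illustration.

```python
import math

# (num_examples, num_bytes) per split, copied from the summary above.
SUMMARY = {
    "default": {"train": (29341, 33473530.0), "validation": (1852, 2094902.0)},
    "neg": {"train": (14355, 16376828.436317781), "validation": (912, 1031614.807775378)},
    "pos": {"train": (14986, 17096701.56368222), "validation": (940, 1063287.192224622)},
}


def parts_match_default(summary: dict, split: str) -> bool:
    """Return True if neg + pos equals default for the given split.

    Example counts are compared exactly; byte counts are floats in the
    source, so they are compared with a small relative tolerance.
    """
    default_examples, default_bytes = summary["default"][split]
    neg_examples, neg_bytes = summary["neg"][split]
    pos_examples, pos_bytes = summary["pos"][split]
    examples_ok = neg_examples + pos_examples == default_examples
    bytes_ok = math.isclose(neg_bytes + pos_bytes, default_bytes, rel_tol=1e-9)
    return examples_ok and bytes_ok


for split_name in ("train", "validation"):
    print(split_name, parts_match_default(SUMMARY, split_name))
```

Both splits pass this check: 14355 + 14986 = 29341 training examples and 912 + 940 = 1852 validation examples. The fractional byte counts for neg and pos are consistent with sizes pro-rated by example count, though the source does not say so.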