
vwxyzjn/EleutherAI_pythia-1b-deduped__dpo_on_policy__tldr

Source: Hugging Face. Updated 2024-02-13; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/vwxyzjn/EleutherAI_pythia-1b-deduped__dpo_on_policy__tldr
Description:
---
dataset_info:
  features:
  - name: chosen
    dtype: string
  - name: rejected
    dtype: string
  - name: chosen_token
    sequence: int64
  - name: rejected_token
    sequence: int64
  - name: chosen_token_label
    sequence: int64
  - name: rejected_token_label
    sequence: int64
  splits:
  - name: dpo_on_policy__1__1707191080
    num_bytes: 5903392
    num_examples: 256
  - name: dpo_on_policy__1__1707191514
    num_bytes: 737346
    num_examples: 32
  - name: dpo_on_policy__1__1707191827
    num_bytes: 1474271
    num_examples: 64
  - name: dpo_on_policy__1__1707191954
    num_bytes: 5903392
    num_examples: 256
  - name: dpo_on_policy__1__1707192216
    num_bytes: 5903392
    num_examples: 256
  - name: dpo_on_policy__1__1707192515
    num_bytes: 5903178
    num_examples: 256
  - name: dpo_on_policy__1__1707200734
    num_bytes: 2686492433
    num_examples: 116480
  - name: dpo_on_policy__1__1707792349
    num_bytes: 2686492433
    num_examples: 116480
  - name: dpo_on_policy__1__1707792340
    num_bytes: 2686492433
    num_examples: 116480
  - name: epoch_1
    num_bytes: 2691157952
    num_examples: 116480
  - name: dpo_on_policy__1__1707795707
    num_bytes: 2686492433
    num_examples: 116480
  - name: epoch_2
    num_bytes: 2722175510
    num_examples: 116480
  - name: epoch_3
    num_bytes: 2690611469
    num_examples: 116480
  - name: dpo_on_policy__1__1707833448
    num_bytes: 2686492433
    num_examples: 116480
  - name: dpo_on_policy__1__1707833448epoch_1
    num_bytes: 2691512798
    num_examples: 116480
  download_size: 3859313963
  dataset_size: 24253744865
---

# Dataset Card for "EleutherAI_pythia-1b-deduped__dpo_on_policy__tldr"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provided by:
vwxyzjn
Summary of original information

Dataset information

Features

  • chosen: string
  • rejected: string
  • chosen_token: sequence of int64
  • rejected_token: sequence of int64
  • chosen_token_label: sequence of int64
  • rejected_token_label: sequence of int64
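
The six features above can be checked against a record with a few lines of Python. This is a minimal sketch, not part of the dataset card: the helper name `schema_errors` and the sample values in `example` are hypothetical, and the sketch assumes int64 sequences arrive as plain Python lists of `int`.

```python
from typing import Any

# Field names and types as listed in the dataset card.
STRING_FIELDS = ("chosen", "rejected")
INT_SEQUENCE_FIELDS = (
    "chosen_token",
    "rejected_token",
    "chosen_token_label",
    "rejected_token_label",
)


def schema_errors(record: dict[str, Any]) -> list[str]:
    """Return a list of schema problems; an empty list means the record conforms."""
    errors = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            errors.append(f"{name}: expected str")
    for name in INT_SEQUENCE_FIELDS:
        value = record.get(name)
        is_int_list = isinstance(value, list) and all(
            isinstance(v, int) and not isinstance(v, bool) for v in value
        )
        if not is_int_list:
            errors.append(f"{name}: expected list of int")
    return errors


# Hypothetical record: the values are invented for illustration only.
example = {
    "chosen": "prompt text plus preferred summary",
    "rejected": "prompt text plus dispreferred summary",
    "chosen_token": [101, 102, 103],
    "rejected_token": [101, 102, 104],
    "chosen_token_label": [0, 0, 103],
    "rejected_token_label": [0, 0, 104],
}

print(schema_errors(example))
```

The card does not document what the `*_token_label` columns encode, so the sketch checks only their types and makes no claim about their contents.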

Splits

  • dpo_on_policy__1__1707191080: 5903392 bytes, 256 examples
  • dpo_on_policy__1__1707191514: 737346 bytes, 32 examples
  • dpo_on_policy__1__1707191827: 1474271 bytes, 64 examples
  • dpo_on_policy__1__1707191954: 5903392 bytes, 256 examples
  • dpo_on_policy__1__1707192216: 5903392 bytes, 256 examples
  • dpo_on_policy__1__1707192515: 5903178 bytes, 256 examples
  • dpo_on_policy__1__1707200734: 2686492433 bytes, 116480 examples
  • dpo_on_policy__1__1707792349: 2686492433 bytes, 116480 examples
  • dpo_on_policy__1__1707792340: 2686492433 bytes, 116480 examples
  • epoch_1: 2691157952 bytes, 116480 examples
  • dpo_on_policy__1__1707795707: 2686492433 bytes, 116480 examples
  • epoch_2: 2722175510 bytes, 116480 examples
  • epoch_3: 2690611469 bytes, 116480 examples
  • dpo_on_policy__1__1707833448: 2686492433 bytes, 116480 examples
  • dpo_on_policy__1__1707833448epoch_1: 2691512798 bytes, 116480 examples
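
Because the full-size splits are about 2.7 GB each, it may be convenient to start with one of the small ones. The following is a sketch under assumptions: it relies on the Hugging Face `datasets` library being installed and on the repository still being available under this ID; the helper names `pick_smallest_split` and `load_split` are hypothetical.

```python
# Repository ID taken from the page title.
REPO_ID = "vwxyzjn/EleutherAI_pythia-1b-deduped__dpo_on_policy__tldr"

# Example counts for the small splits, copied from the list above.
SMALL_SPLIT_EXAMPLES = {
    "dpo_on_policy__1__1707191080": 256,
    "dpo_on_policy__1__1707191514": 32,
    "dpo_on_policy__1__1707191827": 64,
    "dpo_on_policy__1__1707191954": 256,
    "dpo_on_policy__1__1707192216": 256,
    "dpo_on_policy__1__1707192515": 256,
}


def pick_smallest_split(example_counts: dict[str, int]) -> str:
    """Return the name of the split with the fewest examples."""
    return min(example_counts, key=example_counts.get)


def load_split(name: str, streaming: bool = True):
    """Load one named split; streaming avoids fetching the whole split up front.

    The import is kept local so the rest of this sketch runs without `datasets`.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=name, streaming=streaming)


smallest = pick_smallest_split(SMALL_SPLIT_EXAMPLES)
print(smallest)
# To fetch it (requires network access and the `datasets` package):
# ds = load_split(smallest)
# first = next(iter(ds))
```

Streaming is chosen here only to keep a first look cheap; passing `streaming=False` would download and cache the split instead.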

Dataset size

  • Download size: 3859313963 bytes
  • Dataset size: 24253744865 bytes
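
As a consistency check, the per-split byte counts listed in the card can be summed and compared with the declared dataset size. The sketch below uses only figures from the card; the variable names are illustrative.

```python
# Byte counts per split, copied from the dataset card.
SPLIT_BYTES = {
    "dpo_on_policy__1__1707191080": 5903392,
    "dpo_on_policy__1__1707191514": 737346,
    "dpo_on_policy__1__1707191827": 1474271,
    "dpo_on_policy__1__1707191954": 5903392,
    "dpo_on_policy__1__1707192216": 5903392,
    "dpo_on_policy__1__1707192515": 5903178,
    "dpo_on_policy__1__1707200734": 2686492433,
    "dpo_on_policy__1__1707792349": 2686492433,
    "dpo_on_policy__1__1707792340": 2686492433,
    "epoch_1": 2691157952,
    "dpo_on_policy__1__1707795707": 2686492433,
    "epoch_2": 2722175510,
    "epoch_3": 2690611469,
    "dpo_on_policy__1__1707833448": 2686492433,
    "dpo_on_policy__1__1707833448epoch_1": 2691512798,
}

DECLARED_DATASET_SIZE = 24253744865
DECLARED_DOWNLOAD_SIZE = 3859313963

total_bytes = sum(SPLIT_BYTES.values())

# The sum of all 15 splits matches the declared dataset size exactly.
print(total_bytes == DECLARED_DATASET_SIZE)  # True

# Sizes in GiB and the ratio of in-memory size to download size.
print(f"dataset:  {DECLARED_DATASET_SIZE / 2**30:.2f} GiB")
print(f"download: {DECLARED_DOWNLOAD_SIZE / 2**30:.2f} GiB")
print(f"ratio:    {DECLARED_DATASET_SIZE / DECLARED_DOWNLOAD_SIZE:.2f}x")
```

The download size is much smaller than the dataset size, presumably because the hosted files are stored in a compressed format; the card itself does not say.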