pharaouk/ultrafeedback-binarized-preferences-cleaned

Name: pharaouk/ultrafeedback-binarized-preferences-cleaned
Creator: pharaouk
Published: 2024-04-05 18:24:00
License: 暂无描述

Hugging Face2024-04-05 更新2024-06-11 收录

下载链接：

https://hf-mirror.com/datasets/pharaouk/ultrafeedback-binarized-preferences-cleaned

下载链接

链接失效反馈

官方服务：

资源简介：

--- language: - en license: mit size_categories: - 10K<n<100K task_categories: - text-generation pretty_name: UltraFeedback Binarized Preferences Cleaned dataset_info: features: - name: source dtype: string - name: prompt dtype: string - name: chosen list: - name: content dtype: string - name: role dtype: string - name: chosen-rating dtype: float64 - name: chosen-model dtype: string - name: rejected list: - name: content dtype: string - name: role dtype: string - name: rejected-rating dtype: float64 - name: rejected-model dtype: string splits: - name: train num_bytes: 284937773 num_examples: 60917 download_size: 143257393 dataset_size: 284937773 configs: - config_name: default data_files: - split: train path: data/train-* tags: - dpo - preference - ultrafeedback --- # UltraFeedback - Binarized using the Average of Preference Ratings (Cleaned) This dataset represents a new iteration on top of [`argilla/ultrafeedback-binarized-preferences`](https://huggingface.co/argilla/ultrafeedback-binarized-preferences), and is the **recommended and preferred dataset by Argilla to use from now on when fine-tuning on UltraFeedback**. Read more about Argilla's approach towards UltraFeedback binarization at [`argilla/ultrafeedback-binarized-preferences/README.md`](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences/blob/main/README.md). ## Differences with `argilla/ultrafeedback-binarized-preferences` Thanks to the recent issue identified by [AllenAI](https://huggingface.co/allenai) related to the TruthfulQA contamination within the original UltraFeedback dataset due to some prompts being reused from the TruthfulQA dataset (used for benchmarking in the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) from HuggingFace H4), we also decided to follow AllenAI's advice and remove those from the UltraFeedback dataset that we binarized using a completely different approach, which implied using the average of the preference ratings rather than the critique overall score, as [`HuggingFaceH4/ultrafeedback_binarized`](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized) did. Besides that, we also saw that not only the rows with the `source=truthful_qa` were contamined (for obvious reasons), but also some coming from ShareGPT, so we also removed those doing a left join with both subsets from the [`truthful_qa`](https://huggingface.co/datasets/truthful_qa) dataset. Additionally, we also modified the formatting to be aligned with both [`HuggingFaceH4/ultrafeedback_binarized`](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized), and [`allenai/ultrafeedback_binarized_cleaned`](https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned) in order to ease the integration within the [`huggingface/alignment-handbook`](https://github.com/huggingface/alignment-handbook) so that the formatting is standardized. ## Reproduce <a target="_blank" href="https://colab.research.google.com/drive/1XR9P1St4yTNY0tjti_tIjm-yzP5Bfqc0?usp=sharing"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a> To reproduce the data processing combining both our approach and the suggestions from HuggingFace H4 w.r.t. the formatting and the ones from AllenAI to remove the TruthfulQA contamination, feel free to run the attached Colab Notebook or just view it at [`notebook.ipynb`](./notebook.ipynb) within this repository. From Argilla we encourage anyone out there to play around, investigate, and experiment with the data, and we firmly believe on open sourcing what we do, as ourselves, as well as the whole community, benefit a lot from open source and we also want to give back. ## Citation If you find this dataset is useful in your work, please cite the original UltraFeedback dataset: https://huggingface.co/datasets/openbmb/UltraFeedback Additionally, you may also want to cite our work with Notus 7B, which lead the curation of the UltraFeedback dataset: ```bibtex @misc{notus2023, author = {Alvaro Bartolome and Gabriel Martin and Daniel Vila}, title = {Notus}, year = {2023}, publisher = {GitHub}, journal = {GitHub Repository}, howpublished = {\url{https://github.com/argilla-io/notus}} } ``` > Alphabetically ordered by last name due to equal contribution.

提供机构：

pharaouk

原始信息汇总

数据集概述

基本信息

语言: 英语
许可证: MIT
大小分类: 10K<n<100K
任务分类: 文本生成
美观名称: UltraFeedback Binarized Preferences Cleaned

数据集特征

source: 字符串类型
prompt: 字符串类型
chosen: 列表类型，包含
- content: 字符串类型
- role: 字符串类型
chosen-rating: 浮点数类型
chosen-model: 字符串类型
rejected: 列表类型，包含
- content: 字符串类型
- role: 字符串类型
rejected-rating: 浮点数类型
rejected-model: 字符串类型

数据集分割

训练集:
- 字节数: 284937773
- 示例数: 60917
下载大小: 143257393
数据集大小: 284937773

配置

默认配置:
- 数据文件:
  - 分割: 训练
  - 路径: data/train-*

pharaouk/ultrafeedback-binarized-preferences-cleaned

数据集概述

基本信息

数据集特征

数据集分割

配置

标签