orionriker/Mistral-HelpSteer

Name: orionriker/Mistral-HelpSteer
Creator: orionriker
Published: 2024-04-25 08:03:38
License: No description provided

Source: Hugging Face. Updated 2024-04-25. Indexed 2024-06-12.

Download link:

https://hf-mirror.com/datasets/orionriker/Mistral-HelpSteer

Resource description:

---
license: cc-by-4.0
language:
- en
pretty_name: Helpfulness SteerLM Dataset
size_categories:
- 10K<n<100K
tags:
- human-feedback
---

# HelpSteer: Helpfulness SteerLM Dataset

This dataset has been refined to work with Mistral-Instruct-based models.

HelpSteer is an open-source Helpfulness Dataset (CC-BY-4.0) that supports aligning models to become more helpful, factually correct and coherent, while being adjustable in terms of the complexity and verbosity of its responses.

Leveraging this dataset and SteerLM, we train a Llama 2 70B to reach **7.54** on MT Bench, the highest among models trained on open-source datasets based on [MT Bench Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard) as of 15 Nov 2023.

This model is available on HF at [Llama2-70B-SteerLM-Chat](https://huggingface.co/nvidia/Llama2-70B-SteerLM-Chat).

Try this model instantly for free hosted by us at [NVIDIA AI Playground](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/llama2-70b-steerlm). You can use this in the provided UI or through a limited access API (up to 10,000 requests within 30 days). If you need more requests, we demonstrate how you can set up an inference server at [Llama2-70B-SteerLM-Chat model page on HF](https://huggingface.co/nvidia/Llama2-70B-SteerLM-Chat).

You can also train a model using [NeMo Aligner](https://github.com/NVIDIA/NeMo-Aligner) following [SteerLM training user guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/modelalignment/steerlm.html).

<img src="https://huggingface.co/datasets/nvidia/HelpSteer/resolve/main/mtbench_categories.png" alt="MT Bench Categories" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

HelpSteer Paper: [HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM](http://arxiv.org/abs/2311.09528)

SteerLM Paper: [SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF](https://arxiv.org/abs/2310.05344)

## Dataset Description

HelpSteer contains 37,120 samples, each containing a prompt, a response, and five human-annotated attributes of the response. Each attribute is scored from 0 to 4, where higher means better for each attribute.

These attributes are:

1. **Helpfulness**: Overall helpfulness of the response to the prompt.
2. **Correctness**: Inclusion of all pertinent facts without errors.
3. **Coherence**: Consistency and clarity of expression.
4. **Complexity**: Intellectual depth required to write the response (i.e. whether the response can be written by anyone with basic language competency or requires deep domain expertise).
5. **Verbosity**: Amount of detail included in the response, relative to what is asked for in the prompt.
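
The record layout described above can be sketched in Python as follows. This is a minimal illustration, not the dataset's official schema: the field names simply mirror the five attribute names and are an assumption, and the example record is made up. The only constraint taken from the description is that each attribute is scored from 0 to 4.

```python
from dataclasses import dataclass

# Assumed field names, mirroring the five attributes in the description;
# the actual column names in the dataset files may differ.
ATTRIBUTES = ("helpfulness", "correctness", "coherence", "complexity", "verbosity")


@dataclass(frozen=True)
class HelpSteerSample:
    """One sample: a prompt, a response, and five human-annotated scores."""
    prompt: str
    response: str
    helpfulness: int
    correctness: int
    coherence: int
    complexity: int
    verbosity: int


def validate(sample: HelpSteerSample) -> None:
    """Raise ValueError if any attribute score falls outside the 0-4 range."""
    for name in ATTRIBUTES:
        score = getattr(sample, name)
        if not isinstance(score, int) or not 0 <= score <= 4:
            raise ValueError(f"{name} must be an integer in [0, 4], got {score!r}")


# Made-up record, for illustration only (not taken from the dataset).
example = HelpSteerSample(
    prompt="What is the capital of France?",
    response="The capital of France is Paris.",
    helpfulness=4, correctness=4, coherence=4, complexity=1, verbosity=1,
)
validate(example)  # passes silently; an out-of-range score would raise ValueError
```

A check like this is useful mainly as a sanity test after loading or converting the data, since every downstream use of the scores relies on the 0 to 4 range.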

Provided by:

orionriker

Original information summary

Dataset overview

Dataset name

HelpSteer: Helpfulness SteerLM Dataset

License

CC-BY-4.0

Language

English

Size category

10K<n<100K

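Dataset contents

Contains 37,120 samples. Each sample includes a prompt, a response, and five human-annotated attribute scores ranging from 0 to 4, where a higher score means better performance on that attribute.

Attribute details

Helpfulness: Overall helpfulness of the response to the prompt.
Correctness: Inclusion of all pertinent facts without errors.
Coherence: Consistency and clarity of expression.
Complexity: Depth of knowledge required to write the response.
Verbosity: Amount of detail included in the response, relative to what the prompt asks for.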
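
One way the five scores might be turned into a conditioning string for attribute-conditioned training is sketched below. The `name:score` comma-separated layout and the function name `format_attributes` are assumptions made for illustration. The exact template used by SteerLM or NeMo Aligner is not given in this description and may differ.

```python
# Attribute order follows the list above.
ATTRIBUTE_ORDER = ("helpfulness", "correctness", "coherence", "complexity", "verbosity")


def format_attributes(scores: dict[str, int]) -> str:
    """Serialize the five scores as 'name:score' pairs in a fixed order.

    Hypothetical format for illustration; not confirmed by the source.
    """
    missing = [name for name in ATTRIBUTE_ORDER if name not in scores]
    if missing:
        raise KeyError(f"missing attribute scores: {missing}")
    for name in ATTRIBUTE_ORDER:
        if not 0 <= scores[name] <= 4:
            raise ValueError(f"{name} must be in [0, 4], got {scores[name]}")
    return ",".join(f"{name}:{scores[name]}" for name in ATTRIBUTE_ORDER)


# Request a maximally helpful, correct and coherent answer
# with moderate complexity and verbosity.
target = {
    "helpfulness": 4,
    "correctness": 4,
    "coherence": 4,
    "complexity": 2,
    "verbosity": 2,
}
print(format_attributes(target))
# helpfulness:4,correctness:4,coherence:4,complexity:2,verbosity:2
```

Keeping the attribute order fixed means the same target scores always produce the same string, which matters if the string is matched against what a model saw during training.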
