HelpSteer-filtered
ModelScope community · updated 2026-01-07 · indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/Weyaxi/HelpSteer-filtered
Resource description:
# HelpSteer-filtered
This dataset is a highly filtered version of the [nvidia/HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer) dataset.
# ❓ How this dataset was filtered:
1. I calculated the sum of the columns `["helpfulness", "correctness", "coherence", "complexity", "verbosity"]` and stored it in a new column named `sum`.
2. I renamed some columns and added an **empty column** to match the Alpaca format.
3. The dataset was then filtered to include only those entries with a sum greater than or equal to 16.
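
The three steps above can be sketched in plain Python. This is only an illustration, not the author's actual script: the card does not say which columns were renamed, so the mapping `prompt` → `instruction` and `response` → `output`, the empty `input` column, and keeping `sum` in the output are all assumptions, and the sample rows are made up.

```python
# Score columns named in the card; their values are summed into `sum`.
SCORE_COLUMNS = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]
MIN_SUM = 16  # threshold stated in the card (sum >= 16 is kept)


def to_filtered_alpaca(rows, min_sum=MIN_SUM):
    """Sum the score columns, keep rows at or above min_sum, emit Alpaca-style dicts."""
    kept = []
    for row in rows:
        total = sum(row[col] for col in SCORE_COLUMNS)
        if total < min_sum:
            continue  # drop low-scoring entries
        kept.append({
            # Assumed renames: the card does not list the actual mapping.
            "instruction": row["prompt"],
            "input": "",  # the empty column added for the Alpaca format
            "output": row["response"],
            "sum": total,
        })
    return kept


# Hypothetical rows shaped like HelpSteer records (not real data).
sample_rows = [
    {"prompt": "Explain recursion.", "response": "A function that calls itself...",
     "helpfulness": 4, "correctness": 4, "coherence": 4, "complexity": 2, "verbosity": 2},
    {"prompt": "What is a heap?", "response": "It is a thing.",
     "helpfulness": 3, "correctness": 3, "coherence": 4, "complexity": 2, "verbosity": 2},
]

filtered = to_filtered_alpaca(sample_rows)
```

Filtering before renaming gives the same result as the order listed above, because the sum depends only on the score columns.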
# 🧐 More Information
You can find more information about the unfiltered dataset here:
- [nvidia/HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer)
Provider:
maas
Created:
2025-08-29



