
HelpSteer-filtered

ModelScope community · Updated 2026-01-07 · Indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/Weyaxi/HelpSteer-filtered
Description:
# HelpSteer-filtered

This dataset is a highly filtered version of the [nvidia/HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer) dataset.

# ❓ How this dataset was filtered:

1. I calculated the sum of the columns `["helpfulness", "correctness", "coherence", "complexity", "verbosity"]` and created a new column named `sum`.
2. I changed some column names and added an **empty column** to match the Alpaca format.
3. The dataset was then filtered to include only those entries with a sum greater than or equal to 16.

# 🧐 More Information

You can find more information about the unfiltered dataset here:

- [nvidia/HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer)
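
The filtering steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the author's actual script. The function name `filter_helpsteer`, the toy rows, and the rename mapping (`prompt` to `instruction`, `response` to `output`, with an empty `input` column) are assumptions based on the usual Alpaca field names; the card only says that some columns were renamed and an empty column was added. Whether the published dataset keeps the `sum` column is also an assumption.

```python
# The five score columns named in the dataset card.
SCORE_COLUMNS = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]


def filter_helpsteer(rows, threshold=16):
    """Return Alpaca-style records whose score sum is >= threshold.

    `rows` is an iterable of dicts with `prompt`, `response`, and the
    five score columns (hypothetical input layout for this sketch).
    """
    kept = []
    for row in rows:
        # Step 1: sum the five score columns.
        total = sum(row[col] for col in SCORE_COLUMNS)
        # Step 3: keep only entries with sum >= threshold.
        if total < threshold:
            continue
        # Step 2: rename columns and add an empty column (assumed mapping).
        kept.append({
            "instruction": row["prompt"],
            "input": "",
            "output": row["response"],
            "sum": total,
        })
    return kept


# Toy rows for illustration only; these are not taken from the dataset.
sample_rows = [
    {"prompt": "What is 2 + 2?", "response": "4",
     "helpfulness": 4, "correctness": 4, "coherence": 4,
     "complexity": 2, "verbosity": 2},   # sum = 16, kept
    {"prompt": "Name a color.", "response": "Blue.",
     "helpfulness": 3, "correctness": 4, "coherence": 4,
     "complexity": 1, "verbosity": 1},   # sum = 13, dropped
]

filtered = filter_helpsteer(sample_rows)
print(len(filtered))
```

The same logic would apply to a DataFrame or a Hugging Face `Dataset`; the list-of-dicts form is used here only to keep the sketch free of dependencies.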

Provider: maas
Created: 2025-08-29