five

cornfieldrm/datamix-v6.0

收藏
Hugging Face2024-04-24 更新2025-04-26 收录
下载链接:
https://hf-mirror.com/datasets/cornfieldrm/datamix-v6.0
下载链接
链接失效反馈
官方服务:
资源简介:
--- dataset_info: features: - name: dataset dtype: string - name: prompt_source dtype: string - name: response_model dtype: string - name: messages list: - name: content dtype: string - name: role dtype: string - name: helpsteer-helpfulness dtype: float64 - name: helpsteer-correctness dtype: float64 - name: helpsteer-coherence dtype: float64 - name: helpsteer-complexity dtype: float64 - name: helpsteer-verbosity dtype: float64 - name: ultrafeedback-overall_score dtype: float64 - name: ultrafeedback-instruction_following dtype: float64 - name: ultrafeedback-truthfulness dtype: float64 - name: ultrafeedback-honesty dtype: float64 - name: ultrafeedback-helpfulness dtype: float64 - name: argilla-overall_quality dtype: float64 - name: code-complexity dtype: float64 - name: code-style dtype: float64 - name: code-explanation dtype: float64 - name: code-instruction-following dtype: float64 - name: code-readability dtype: float64 - name: llama_guard2-is_safe dtype: float64 splits: - name: train num_bytes: 863839310 num_examples: 366190 download_size: 277791978 dataset_size: 863839310 configs: - config_name: default data_files: - split: train path: data/train-* ---

数据集信息: 特征字段: - 字段名:数据集(dataset),数据类型:字符串 - 字段名:提示词来源(prompt_source),数据类型:字符串 - 字段名:响应生成模型(response_model),数据类型:字符串 - 字段名:对话消息列表(messages),列表类型,列表项包含: - 字段名:内容(content),数据类型:字符串 - 字段名:角色(role),数据类型:字符串 - 字段名:HelpSteer-有用性(helpsteer-helpfulness),数据类型:64位浮点数(float64) - 字段名:HelpSteer-正确性(helpsteer-correctness),数据类型:64位浮点数(float64) - 字段名:HelpSteer-连贯性(helpsteer-coherence),数据类型:64位浮点数(float64) - 字段名:HelpSteer-复杂度(helpsteer-complexity),数据类型:64位浮点数(float64) - 字段名:HelpSteer-冗长度(helpsteer-verbosity),数据类型:64位浮点数(float64) - 字段名:UltraFeedback-整体评分(ultrafeedback-overall_score),数据类型:64位浮点数(float64) - 字段名:UltraFeedback-指令遵循度(ultrafeedback-instruction_following),数据类型:64位浮点数(float64) - 字段名:UltraFeedback-真实性(ultrafeedback-truthfulness),数据类型:64位浮点数(float64) - 字段名:UltraFeedback-诚实性(ultrafeedback-honesty),数据类型:64位浮点数(float64) - 字段名:UltraFeedback-有用性(ultrafeedback-helpfulness),数据类型:64位浮点数(float64) - 字段名:Argilla-整体质量(argilla-overall_quality),数据类型:64位浮点数(float64) - 字段名:代码复杂度(code-complexity),数据类型:64位浮点数(float64) - 字段名:代码风格(code-style),数据类型:64位浮点数(float64) - 字段名:代码解释性(code-explanation),数据类型:64位浮点数(float64) - 字段名:代码指令遵循度(code-instruction-following),数据类型:64位浮点数(float64) - 字段名:代码可读性(code-readability),数据类型:64位浮点数(float64) - 字段名:LlamaGuard 2-安全性检测结果(llama_guard2-is_safe),数据类型:64位浮点数(float64) 数据集划分: - 划分名称:训练集(train),字节数:863839310,样本数量:366190 下载大小:277791978 数据集存储大小:863839310 数据集配置: - 配置名称:默认(default),数据文件: - 划分:训练集(train),路径:data/train-*
提供机构:
cornfieldrm
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作