OpenDataArena/OpenDataArena-scored-data-2603
Source: Hugging Face. Updated: 2026-04-27. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/OpenDataArena/OpenDataArena-scored-data-2603
Resource description:
This dataset is a multi-configuration collection of sub-datasets, primarily intended for instruction-following tasks in the code and mathematics domains. Its configurations include AM-Thinking-v1-Distilled-code (approximately 323,965 examples), AM-Thinking-v1-Distilled-math (approximately 558,129 examples), Code-Feedback (approximately 66,383 examples), CodeFeedback-Filtered-Instruction (approximately 156,526 examples), DeepMath-103K (approximately 309,066 examples), EpiCoder-func-380k (approximately 380,000 examples), and Evol-Instruct-Code-80k-v1 (only partially listed). Each example contains the fields id, source, instruction (the instruction text), and output (the output text), plus two score structures, processed_scores and raw_scores. Both structures cover the following evaluation metrics, among others: AtheneRM, Cleanliness, LLM_as_Judge_Complexity, Compress_Ratio, Deita_Complexity, Deita_Quality, EmbedSVD_Entropy, Logical_Word_Count, HES, IFD, Instag, MTLD, Normalized_Loss, PPL, Professionalism, Writing_Style, Required_Expertise, Facts_Trivia, Educational_Value, Readability, Reasoning, SkyworkRM_Qwen, SkyworkRM_Llama, Token_Entropy, Token_Length, TreeInstruct_Node, TreeInstruct_Depth, Unique_Token_Ratio, UPD, and VOCD-D. These metrics quantify text quality, complexity, professionalism, and readability. The dataset is suited to training and evaluating language models on tasks such as code generation, mathematical reasoning, and feedback filtering.
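To illustrate how the per-example scores described above might be used for data selection, here is a minimal Python sketch. It assumes that processed_scores and raw_scores are flat mappings from metric name to a numeric value (possibly missing or null); the listing does not specify the exact layout, so treat that as an assumption. The helper names get_score and filter_by_score and the sample records are hypothetical and invented for illustration only.

```python
from typing import Any, Dict, Iterable, List, Optional

Record = Dict[str, Any]


def get_score(record: Record, metric: str, processed: bool = True) -> Optional[float]:
    """Return one metric value from a record, or None if it is absent.

    Assumption: the score structures are flat dicts keyed by metric name.
    """
    key = "processed_scores" if processed else "raw_scores"
    scores = record.get(key) or {}
    value = scores.get(metric)
    # Reject booleans explicitly, since bool is a subclass of int in Python.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)
    return None


def filter_by_score(
    records: Iterable[Record],
    metric: str,
    minimum: float,
    processed: bool = True,
) -> List[Record]:
    """Keep records whose chosen metric is present and at least `minimum`."""
    kept = []
    for record in records:
        score = get_score(record, metric, processed)
        if score is not None and score >= minimum:
            kept.append(record)
    return kept


# Hypothetical records that mimic the described fields; the values are made up.
sample = [
    {
        "id": "a1",
        "source": "DeepMath-103K",
        "instruction": "Compute 2 + 2.",
        "output": "4",
        "processed_scores": {"Deita_Quality": 0.9, "Token_Length": 0.1},
        "raw_scores": {"Deita_Quality": 4.5, "Token_Length": 12},
    },
    {
        "id": "a2",
        "source": "Code-Feedback",
        "instruction": "Reverse a string.",
        "output": "s[::-1]",
        "processed_scores": {"Deita_Quality": 0.2, "Token_Length": 0.05},
        "raw_scores": {"Deita_Quality": 1.5, "Token_Length": 6},
    },
    {
        "id": "a3",
        "source": "Code-Feedback",
        "instruction": "Sort a list.",
        "output": "sorted(xs)",
        "processed_scores": {"Deita_Quality": None},
        "raw_scores": {},
    },
]

high_quality = filter_by_score(sample, "Deita_Quality", 0.5)
high_quality_ids = [record["id"] for record in high_quality]
```

With the Hugging Face datasets library, a configuration could presumably be loaded with load_dataset("OpenDataArena/OpenDataArena-scored-data-2603", "DeepMath-103K") and its rows passed to filter_by_score. The configuration name here is taken from the description above, and the available split names are not stated in this listing, so both should be checked against the dataset card first.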
Provider:
OpenDataArena



