PulkundwarP/ARMed
Hugging Face | Updated 2025-04-11 | Indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/PulkundwarP/ARMed
Overview:
ARMed is a domain-specific evaluation benchmark for embedded systems programming. It consists of 100 multiple-choice questions, each with a verified answer and a difficulty level. Curated by Parth Pulkundwar of New Leap Labs, it is designed to test how well language models know embedded systems programming. The dataset is licensed under Apache 2.0 and is intended for evaluating language models and for model comparison studies. Each row contains a question, four choices, the correct answer, and a difficulty level. The questions were generated with DeepSeek-R1 and verified by embedded systems experts. The dataset contains no personal or sensitive information, but it has limitations, including potential biases and incomplete coverage of non-ARM embedded architectures.
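The row layout described above (a question, four choices, the correct answer, a difficulty level) lends itself to per-difficulty scoring. The following is a minimal sketch of that idea, not the dataset's official evaluation code. The field names, the integer answer encoding, the difficulty labels, and the sample questions are all assumptions made for illustration; the actual column names in ARMed may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical field names; check the real dataset columns before use.
    question: str
    choices: list[str]  # four answer options
    answer: int         # index of the correct option (assumed encoding)
    difficulty: str     # assumed labels such as "easy" or "medium"


def accuracy_by_difficulty(items, predictions):
    """Return {difficulty: fraction correct} for predicted choice indices."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, predicted in zip(items, predictions):
        total[item.difficulty] += 1
        if predicted == item.answer:
            correct[item.difficulty] += 1
    return {level: correct[level] / total[level] for level in total}


# Made-up sample items for illustration only; they are not taken from ARMed.
sample = [
    Item("Which register holds the return address after a BL instruction?",
         ["R0", "SP", "LR", "PC"], 2, "easy"),
    Item("Which C qualifier marks a memory-mapped register that may change "
         "outside program control?",
         ["const", "volatile", "static", "register"], 1, "easy"),
    Item("Which Cortex-M instruction sleeps until an interrupt arrives?",
         ["NOP", "WFI", "SVC", "BKPT"], 1, "medium"),
]

# Pretend model output: one chosen option index per question.
print(accuracy_by_difficulty(sample, [2, 0, 1]))
# {'easy': 0.5, 'medium': 1.0}
```

Reporting accuracy per difficulty level, rather than a single overall number, makes model comparisons more informative on a 100-question benchmark, since two models with the same total score can differ sharply on the harder items.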
Provider:
PulkundwarP



