stefanocarrera/autophagycode_D_he_train-mercury_Qwen3-0.6B_strategy_trust_t1_g7_metrics
Source: Hugging Face | Updated: 2026-04-29 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/stefanocarrera/autophagycode_D_he_train-mercury_Qwen3-0.6B_strategy_trust_t1_g7_metrics
Description:
This is a code-related dataset containing 164 samples, provided for training only. Each sample records a task identifier (task_id), entry point (entry_point), executable status (is_executable), correctness (is_correct), the number of tests passed and failed (tests_passed, tests_failed), test run time in milliseconds (test_run_time_ms), and error type (error_type). It also includes Halstead complexity metrics (such as vocabulary, length, volume, difficulty, effort, and time), cyclomatic complexity (cyclomatic_complexity), maintainability index (maintainability_index), lines of code (loc and sloc), comment percentage (comment_percentage), type-token ratio (TTR), a token dictionary (token_dict), Shannon entropy (shannon_entropy), mean and maximum predictive entropy (mean_predictive_entropy, max_predictive_entropy), the number of functions defined (n_func_defined), and whether the entry point is repeated (entry_point_repeated). These features describe code quality, complexity, and maintainability, and may be useful in programming education, code analysis, or automated testing.
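To make the lexical features more concrete, the following is a minimal sketch of how a token dictionary, a type-token ratio, and a Shannon entropy could be derived from a code snippet. It is not the dataset author's implementation: the whitespace tokenization, the base-2 logarithm, and the helper name `token_stats` are all assumptions chosen for illustration, and the values in the dataset may be computed differently.

```python
import math
from collections import Counter


def token_stats(tokens):
    """Return (token_dict, ttr, shannon_entropy) for a list of tokens.

    Hypothetical helper: the dataset's real tokenizer and log base are unknown.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        # Empty input: no tokens, so report zero for both ratios.
        return {}, 0.0, 0.0
    # Type-token ratio: distinct tokens divided by total tokens.
    ttr = len(counts) / total
    # Shannon entropy (in bits) of the empirical token distribution.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return dict(counts), ttr, entropy


# Assumption: tokens are obtained by splitting on whitespace.
tokens = "def add ( a , b ) : return a + b".split()
token_dict, ttr, entropy = token_stats(tokens)
print(token_dict)
print(f"TTR = {ttr:.3f}, Shannon entropy = {entropy:.3f} bits")
```

A higher TTR indicates less token repetition, while a higher entropy indicates that token usage is spread more evenly across the vocabulary.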
Provided by:
stefanocarrera



