bigcode/bigcodebench-complete-perf
Hugging Face | Updated: 2024-07-02 | Indexed: 2024-07-06
Download link:
https://hf-mirror.com/datasets/bigcode/bigcodebench-complete-perf
Resource description:
---
dataset_info:
  features:
  - name: task_id
    dtype: string
  - name: status
    dtype: int64
  splits:
  - name: Magicoder_S_DS_6.7B
    num_bytes: 31950
    num_examples: 1140
  - name: StarCoder2_15B_Instruct_v0.1
    num_bytes: 31950
    num_examples: 1140
  - name: StarCoder2_3B
    num_bytes: 31950
    num_examples: 1140
  - name: StarCoder2_7B
    num_bytes: 31950
    num_examples: 1140
  - name: StarCoder2_15B
    num_bytes: 31950
    num_examples: 1140
  - name: CodeQwen1.5_7B
    num_bytes: 31950
    num_examples: 1140
  - name: CodeGemma_2B
    num_bytes: 31950
    num_examples: 1140
  - name: CodeGemma_7B
    num_bytes: 31950
    num_examples: 1140
  - name: CodeGemma_7B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: GPT_3.5_Turbo_0125
    num_bytes: 31950
    num_examples: 1140
  - name: GPT_4o_2024_05_13
    num_bytes: 31950
    num_examples: 1140
  - name: GPT_4_Turbo_2024_04_09
    num_bytes: 31950
    num_examples: 1140
  - name: GPT_4_0613
    num_bytes: 31950
    num_examples: 1140
  - name: CodeLlama_7B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: CodeLlama_13B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: CodeLlama_7B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: CodeLlama_13B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Mistral_Large_2402
    num_bytes: 31950
    num_examples: 1140
  - name: Mistral_Small_2402
    num_bytes: 31950
    num_examples: 1140
  - name: Mixtral_8x22B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: Mixtral_8x22B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: CodeLlama_34B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: CodeLlama_34B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: CodeLlama_70B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: CodeLlama_70B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: CodeQwen1.5_7B_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: Qwen1.5_110B_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: Qwen1.5_72B_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: Qwen1.5_32B_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_V2_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_1.3B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_1.3B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_33B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_33B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_6.7B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_6.7B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Llama_3_70B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: Llama_3_70B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Llama_3_8B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: Llama_3_8B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Granite_Code_3B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Granite_Code_8B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Granite_Code_20B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Granite_Code_34B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Granite_Code_3B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: Granite_Code_8B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: Granite_Code_20B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: Granite_Code_34B_Base
    num_bytes: 31950
    num_examples: 1140
  - name: Claude_3_Haiku_20240307
    num_bytes: 31950
    num_examples: 1140
  - name: Claude_3_Sonnet_20240229
    num_bytes: 31950
    num_examples: 1140
  - name: Claude_3_Opus_20240229
    num_bytes: 31950
    num_examples: 1140
  - name: Yi_1.5_34B_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: Yi_1.5_34B
    num_bytes: 31950
    num_examples: 1140
  - name: Yi_1.5_9B_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: Yi_1.5_9B
    num_bytes: 31950
    num_examples: 1140
  - name: Yi_1.5_6B_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: Yi_1.5_6B
    num_bytes: 31950
    num_examples: 1140
  - name: Qwen2_57B_A14B
    num_bytes: 31950
    num_examples: 1140
  - name: Qwen2_7B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Qwen2_72B_Chat
    num_bytes: 31950
    num_examples: 1140
  - name: Gemini_1.5_Pro_API_0514
    num_bytes: 31950
    num_examples: 1140
  - name: Gemini_1.5_Flash_API_0514
    num_bytes: 31950
    num_examples: 1140
  - name: OpenCodeInterpreter_DS_33B
    num_bytes: 31950
    num_examples: 1140
  - name: OpenCodeInterpreter_DS_6.7B
    num_bytes: 31950
    num_examples: 1140
  - name: OpenCodeInterpreter_DS_1.3B
    num_bytes: 31950
    num_examples: 1140
  - name: Phi_3_medium_128k_instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Phi_3_small_128k_instruct
    num_bytes: 31950
    num_examples: 1140
  - name: Codestral_22B_v0.1
    num_bytes: 31950
    num_examples: 1140
  - name: Mistral_7B_Instruct_v0.3
    num_bytes: 31950
    num_examples: 1140
  - name: Mistral_7B_v0.3
    num_bytes: 31950
    num_examples: 1140
  - name: Command_R_plus
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_V2_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_V2_Lite_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: DeepSeek_Coder_V2_Lite_Base
    num_bytes: 31950
    num_examples: 1140
  - name: Claude_3.5_Sonnet_20240620
    num_bytes: 31950
    num_examples: 1140
  - name: Hermes_2_Theta_Llama_3_70B
    num_bytes: 31950
    num_examples: 1140
  - name: WaveCoder_Ultra_6.7B
    num_bytes: 31950
    num_examples: 1140
  - name: Gemma_2_9B_Instruct
    num_bytes: 31950
    num_examples: 1140
  - name: AutoCoder
    num_bytes: 31950
    num_examples: 1140
  - name: AutoCoder_S_6.7B
    num_bytes: 31950
    num_examples: 1140
  - name: AutoCoder_QW_7B
    num_bytes: 31950
    num_examples: 1140
  - name: ReflectionCoder_DS_33B
    num_bytes: 31950
    num_examples: 1140
  - name: ReflectionCoder_DS_6.7B
    num_bytes: 31950
    num_examples: 1140
  - name: ReflectionCoder_CL_34B
    num_bytes: 31950
    num_examples: 1140
  - name: ReflectionCoder_CL_7B
    num_bytes: 31950
    num_examples: 1140
  download_size: 788180
  dataset_size: 2715750
configs:
- config_name: default
  data_files:
  - split: Magicoder_S_DS_6.7B
    path: data/Magicoder_S_DS_6.7B-*
  - split: StarCoder2_15B_Instruct_v0.1
    path: data/StarCoder2_15B_Instruct_v0.1-*
  - split: StarCoder2_3B
    path: data/StarCoder2_3B-*
  - split: StarCoder2_7B
    path: data/StarCoder2_7B-*
  - split: StarCoder2_15B
    path: data/StarCoder2_15B-*
  - split: CodeQwen1.5_7B
    path: data/CodeQwen1.5_7B-*
  - split: CodeGemma_2B
    path: data/CodeGemma_2B-*
  - split: CodeGemma_7B
    path: data/CodeGemma_7B-*
  - split: CodeGemma_7B_Instruct
    path: data/CodeGemma_7B_Instruct-*
  - split: GPT_3.5_Turbo_0125
    path: data/GPT_3.5_Turbo_0125-*
  - split: GPT_4o_2024_05_13
    path: data/GPT_4o_2024_05_13-*
  - split: GPT_4_Turbo_2024_04_09
    path: data/GPT_4_Turbo_2024_04_09-*
  - split: GPT_4_0613
    path: data/GPT_4_0613-*
  - split: CodeLlama_7B_Base
    path: data/CodeLlama_7B_Base-*
  - split: CodeLlama_13B_Base
    path: data/CodeLlama_13B_Base-*
  - split: CodeLlama_7B_Instruct
    path: data/CodeLlama_7B_Instruct-*
  - split: CodeLlama_13B_Instruct
    path: data/CodeLlama_13B_Instruct-*
  - split: Mistral_Large_2402
    path: data/Mistral_Large_2402-*
  - split: Mistral_Small_2402
    path: data/Mistral_Small_2402-*
  - split: Mixtral_8x22B_Base
    path: data/Mixtral_8x22B_Base-*
  - split: Mixtral_8x22B_Instruct
    path: data/Mixtral_8x22B_Instruct-*
  - split: CodeLlama_34B_Base
    path: data/CodeLlama_34B_Base-*
  - split: CodeLlama_34B_Instruct
    path: data/CodeLlama_34B_Instruct-*
  - split: CodeLlama_70B_Base
    path: data/CodeLlama_70B_Base-*
  - split: CodeLlama_70B_Instruct
    path: data/CodeLlama_70B_Instruct-*
  - split: CodeQwen1.5_7B_Chat
    path: data/CodeQwen1.5_7B_Chat-*
  - split: Qwen1.5_110B_Chat
    path: data/Qwen1.5_110B_Chat-*
  - split: Qwen1.5_72B_Chat
    path: data/Qwen1.5_72B_Chat-*
  - split: Qwen1.5_32B_Chat
    path: data/Qwen1.5_32B_Chat-*
  - split: DeepSeek_V2_Chat
    path: data/DeepSeek_V2_Chat-*
  - split: DeepSeek_Coder_1.3B_Base
    path: data/DeepSeek_Coder_1.3B_Base-*
  - split: DeepSeek_Coder_1.3B_Instruct
    path: data/DeepSeek_Coder_1.3B_Instruct-*
  - split: DeepSeek_Coder_33B_Base
    path: data/DeepSeek_Coder_33B_Base-*
  - split: DeepSeek_Coder_33B_Instruct
    path: data/DeepSeek_Coder_33B_Instruct-*
  - split: DeepSeek_Coder_6.7B_Base
    path: data/DeepSeek_Coder_6.7B_Base-*
  - split: DeepSeek_Coder_6.7B_Instruct
    path: data/DeepSeek_Coder_6.7B_Instruct-*
  - split: Llama_3_70B_Base
    path: data/Llama_3_70B_Base-*
  - split: Llama_3_70B_Instruct
    path: data/Llama_3_70B_Instruct-*
  - split: Llama_3_8B_Base
    path: data/Llama_3_8B_Base-*
  - split: Llama_3_8B_Instruct
    path: data/Llama_3_8B_Instruct-*
  - split: Granite_Code_3B_Instruct
    path: data/Granite_Code_3B_Instruct-*
  - split: Granite_Code_8B_Instruct
    path: data/Granite_Code_8B_Instruct-*
  - split: Granite_Code_20B_Instruct
    path: data/Granite_Code_20B_Instruct-*
  - split: Granite_Code_34B_Instruct
    path: data/Granite_Code_34B_Instruct-*
  - split: Granite_Code_3B_Base
    path: data/Granite_Code_3B_Base-*
  - split: Granite_Code_8B_Base
    path: data/Granite_Code_8B_Base-*
  - split: Granite_Code_20B_Base
    path: data/Granite_Code_20B_Base-*
  - split: Granite_Code_34B_Base
    path: data/Granite_Code_34B_Base-*
  - split: Claude_3_Haiku_20240307
    path: data/Claude_3_Haiku_20240307-*
  - split: Claude_3_Sonnet_20240229
    path: data/Claude_3_Sonnet_20240229-*
  - split: Claude_3_Opus_20240229
    path: data/Claude_3_Opus_20240229-*
  - split: Yi_1.5_34B_Chat
    path: data/Yi_1.5_34B_Chat-*
  - split: Yi_1.5_34B
    path: data/Yi_1.5_34B-*
  - split: Yi_1.5_9B_Chat
    path: data/Yi_1.5_9B_Chat-*
  - split: Yi_1.5_9B
    path: data/Yi_1.5_9B-*
  - split: Yi_1.5_6B_Chat
    path: data/Yi_1.5_6B_Chat-*
  - split: Yi_1.5_6B
    path: data/Yi_1.5_6B-*
  - split: Qwen2_57B_A14B
    path: data/Qwen2_57B_A14B-*
  - split: Qwen2_7B_Instruct
    path: data/Qwen2_7B_Instruct-*
  - split: Qwen2_72B_Chat
    path: data/Qwen2_72B_Chat-*
  - split: Gemini_1.5_Pro_API_0514
    path: data/Gemini_1.5_Pro_API_0514-*
  - split: Gemini_1.5_Flash_API_0514
    path: data/Gemini_1.5_Flash_API_0514-*
  - split: OpenCodeInterpreter_DS_33B
    path: data/OpenCodeInterpreter_DS_33B-*
  - split: OpenCodeInterpreter_DS_6.7B
    path: data/OpenCodeInterpreter_DS_6.7B-*
  - split: OpenCodeInterpreter_DS_1.3B
    path: data/OpenCodeInterpreter_DS_1.3B-*
  - split: Phi_3_medium_128k_instruct
    path: data/Phi_3_medium_128k_instruct-*
  - split: Phi_3_small_128k_instruct
    path: data/Phi_3_small_128k_instruct-*
  - split: Codestral_22B_v0.1
    path: data/Codestral_22B_v0.1-*
  - split: Mistral_7B_Instruct_v0.3
    path: data/Mistral_7B_Instruct_v0.3-*
  - split: Mistral_7B_v0.3
    path: data/Mistral_7B_v0.3-*
  - split: Command_R_plus
    path: data/Command_R_plus-*
  - split: DeepSeek_Coder_V2_Instruct
    path: data/DeepSeek_Coder_V2_Instruct-*
  - split: DeepSeek_Coder_V2_Lite_Instruct
    path: data/DeepSeek_Coder_V2_Lite_Instruct-*
  - split: DeepSeek_Coder_V2_Lite_Base
    path: data/DeepSeek_Coder_V2_Lite_Base-*
  - split: Claude_3.5_Sonnet_20240620
    path: data/Claude_3.5_Sonnet_20240620-*
  - split: Hermes_2_Theta_Llama_3_70B
    path: data/Hermes_2_Theta_Llama_3_70B-*
  - split: WaveCoder_Ultra_6.7B
    path: data/WaveCoder_Ultra_6.7B-*
  - split: Gemma_2_9B_Instruct
    path: data/Gemma_2_9B_Instruct-*
  - split: AutoCoder
    path: data/AutoCoder-*
  - split: AutoCoder_S_6.7B
    path: data/AutoCoder_S_6.7B-*
  - split: AutoCoder_QW_7B
    path: data/AutoCoder_QW_7B-*
  - split: ReflectionCoder_DS_33B
    path: data/ReflectionCoder_DS_33B-*
  - split: ReflectionCoder_DS_6.7B
    path: data/ReflectionCoder_DS_6.7B-*
  - split: ReflectionCoder_CL_34B
    path: data/ReflectionCoder_CL_34B-*
  - split: ReflectionCoder_CL_7B
    path: data/ReflectionCoder_CL_7B-*
---
Provider:
bigcode
Original metadata summary
Dataset overview
Dataset information
- Features: task_id (string), status (64-bit integer)
- Splits: 85, one per evaluated model. Every split is 31,950 bytes and holds 1,140 examples. The split names are the ones listed in the metadata block above.
- Download size: 788,180 bytes
- Dataset size: 2,715,750 bytes
Configuration
- Config name: default
- Data files: each split maps to the path data/<split name>-* (for example, Magicoder_S_DS_6.7B maps to data/Magicoder_S_DS_6.7B-*)
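The size figures in the metadata agree with one another, which a few lines of arithmetic can confirm. The sketch below uses only numbers copied from the metadata block; the constant names are illustrative and not part of the dataset.

```python
# Figures copied from the dataset metadata; the names are illustrative.
NUM_SPLITS = 85            # one split per evaluated model
BYTES_PER_SPLIT = 31950    # num_bytes reported for every split
EXAMPLES_PER_SPLIT = 1140  # num_examples reported for every split
DATASET_SIZE = 2715750     # dataset_size reported in the metadata

total_bytes = NUM_SPLITS * BYTES_PER_SPLIT
total_rows = NUM_SPLITS * EXAMPLES_PER_SPLIT

print(total_bytes == DATASET_SIZE)  # True
print(total_rows)                   # 96900
```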
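Collected summary
Dataset introduction

Construction
In code generation and intelligent programming, evaluating model performance is essential. BigCodeBench-Complete-Perf was built to meet the need for systematic evaluation of diverse code generation models. The dataset covers the 1,140 programming tasks of the BigCodeBench benchmark and, for each task, collects the execution results of more than 80 code generation models. These models range from open-source base models to commercial APIs, across parameter scales and architecture variants. Construction involved unified task definitions, a standardized execution environment, and automated result collection and verification, which keeps the evaluation consistent and reproducible.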
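Characteristics
The dataset's defining characteristic is the breadth of its model coverage. It includes mainstream open-source code models such as CodeLlama, StarCoder2, and DeepSeek-Coder along with their instruction-tuned versions, as well as several generations of commercial models such as GPT, Claude, and Gemini. For every model it records a uniform per-task status on all 1,140 tasks, forming a cross-model comparison matrix. This structure lets researchers analyze how model scale, training strategy, and architecture relate to code generation ability, and gives a multi-dimensional reference for model selection and capability assessment.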
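The cross-model comparison matrix can be assembled from per-model rows with a small pivot. The following is a minimal sketch, not the dataset authors' tooling: it assumes each row is a dict with the `task_id` and `status` fields from the metadata, and it assumes that `status == 1` means the task passed, which the source does not state. The model names, task IDs, and values in `toy` are made up for illustration.

```python
from collections import defaultdict


def build_matrix(results):
    """Pivot {model: [row, ...]} into {task_id: {model: status}}."""
    matrix = defaultdict(dict)
    for model, rows in results.items():
        for row in rows:
            matrix[row["task_id"]][model] = row["status"]
    return dict(matrix)


def task_solve_rate(matrix, passed=1):
    """Per task, the fraction of models whose status equals `passed`.

    ASSUMPTION: `passed=1` encodes success; check the real encoding first.
    """
    return {
        task: sum(status == passed for status in by_model.values()) / len(by_model)
        for task, by_model in matrix.items()
    }


# Made-up rows for illustration only; these are not real results.
toy = {
    "model_a": [{"task_id": "t0", "status": 1}, {"task_id": "t1", "status": 0}],
    "model_b": [{"task_id": "t0", "status": 1}, {"task_id": "t1", "status": 1}],
}

matrix = build_matrix(toy)
rates = task_solve_rate(matrix)
print(rates)  # {'t0': 1.0, 't1': 0.5}
```

A low solve rate marks a task that few models pass, which is one simple proxy for task difficulty.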
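Usage
For researchers and developers working on code intelligence, the dataset offers a convenient entry point for model performance analysis. It can be loaded directly from the Hugging Face platform. Each split is named after a model, and each split contains every task ID with its execution status. Typical uses include: querying one model's results on all tasks to assess its overall ability; comparing model families, or different sizes within a family; and running deeper statistical analyses on the execution results, such as studying the distribution of task difficulty and the failure patterns of models. The standard format lets the data fit into existing data analysis and visualization workflows.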
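Loading one model's split and summarizing it might look like the sketch below. `load_dataset` is the standard entry point of the Hugging Face `datasets` library; the helper names `load_split` and `pass_rate` are hypothetical, and treating `status == 1` as a pass is an assumption that the source does not confirm. The demonstration runs offline on made-up rows.

```python
def pass_rate(rows, passed=1):
    """Fraction of rows whose status equals `passed`.

    ASSUMPTION: `passed=1` encodes success; the source only says that
    `status` is a 64-bit integer.
    """
    rows = list(rows)
    if not rows:
        return 0.0
    return sum(1 for row in rows if row["status"] == passed) / len(rows)


def load_split(split_name):
    """Fetch one model's split from the Hub (needs network and `datasets`)."""
    from datasets import load_dataset  # lazy import keeps pass_rate usable offline

    return load_dataset("bigcode/bigcodebench-complete-perf", split=split_name)


# Offline demonstration on made-up rows; these are not real results.
demo = [
    {"task_id": "t0", "status": 1},
    {"task_id": "t1", "status": 0},
    {"task_id": "t2", "status": 1},
    {"task_id": "t3", "status": 1},
]
rate = pass_rate(demo)
print(rate)  # 0.75
```

With network access, `pass_rate(load_split("GPT_4o_2024_05_13"))` would apply the same summary to a real split, since each row of a loaded split is a dict keyed by feature name.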
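Background and challenges
Background overview
With the rapid development of large language models, evaluating how models perform on complex programming tasks has become central to progress in code generation. BigCodeBench-Complete-Perf was created by the BigCode community in 2024 to evaluate a wide range of code generation models systematically on complete programming tasks. Its core research question is how to quantify a model's accuracy and efficiency when solving practical programming problems, across tasks that range from basic code completion to complex algorithm implementation. By bringing together evaluation results for more than 80 models, including the GPT series, Claude, Llama, CodeLlama, and DeepSeek-Coder, the dataset gives researchers a unified benchmark platform, supports model comparison and optimization research in code intelligence, and contributes to the development of automated programming tools.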
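Current challenges
The dataset addresses the challenge of standardizing performance evaluation for code generation models, specifically the difficulty of quantifying the functional correctness, logical rigor, and execution efficiency of generated code across diverse programming tasks. Its construction faced several challenges. First, the task set had to span different difficulty levels and programming paradigms so that the evaluation is comprehensive and fair. Second, integrating the outputs of many heterogeneous models required substantial data cleaning and format unification to keep the data consistent and comparable. Third, the automated evaluation framework had to handle secure isolation of the code execution environment, accurate result verification, and resource management for large-scale parallel testing. Together these challenges account for the complexity of building and applying the dataset.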
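Common scenarios
Classic use cases
In code generation and intelligent programming, BigCodeBench-Complete-Perf serves as a benchmark resource whose classic use is evaluating the performance of large code language models. By collecting the execution results of many mainstream models on the same tasks, it lets researchers compare models of different architectures, parameter scales, and training strategies on the accuracy and efficiency of code generation. This systematic evaluation framework allows the research community to probe the limits of model ability in complex programming logic, multilingual support, and code optimization, which in turn drives iteration and improvement of code intelligence technology.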
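One way to compare training strategies is to pair each base model with its instruction-tuned counterpart by split name. The sketch below relies on the `_Base` and `_Instruct` suffixes that appear in the split list; the helper names are hypothetical, and the pass rates in `toy_rates` are made up for illustration, not taken from the dataset.

```python
def pair_variants(split_names, base_suffix="_Base", tuned_suffix="_Instruct"):
    """Return {stem: (base_split, tuned_split)} for stems that have both."""
    names = set(split_names)
    pairs = {}
    for name in sorted(names):
        if name.endswith(base_suffix):
            stem = name[: -len(base_suffix)]
            tuned = stem + tuned_suffix
            if tuned in names:
                pairs[stem] = (name, tuned)
    return pairs


def tuning_gain(pairs, rates):
    """Tuned-minus-base pass rate for every paired stem."""
    return {stem: rates[tuned] - rates[base] for stem, (base, tuned) in pairs.items()}


# Split names taken from the metadata; GPT_4_0613 has no paired variant.
splits = [
    "CodeLlama_7B_Base",
    "CodeLlama_7B_Instruct",
    "Llama_3_8B_Base",
    "Llama_3_8B_Instruct",
    "GPT_4_0613",
]
pairs = pair_variants(splits)

# Made-up pass rates for illustration only; these are not real results.
toy_rates = {
    "CodeLlama_7B_Base": 0.25,
    "CodeLlama_7B_Instruct": 0.5,
    "Llama_3_8B_Base": 0.5,
    "Llama_3_8B_Instruct": 0.75,
}
gains = tuning_gain(pairs, toy_rates)
print(gains)  # {'CodeLlama_7B': 0.25, 'Llama_3_8B': 0.25}
```

Splits that use other suffixes, such as `_Chat`, would need their own suffix arguments.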
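Practical applications
In practice, the dataset gives industry a key reference for selecting and deploying code generation models. Companies can use its performance figures to pick the model that performs best for a given programming language, code complexity, or resource constraint, and integrate it into development toolchains, automated programming assistants, or educational platforms. For example, within the software development lifecycle the data can help optimize code review, intelligent completion, and defect detection, improving development efficiency and code quality and helping intelligent programming tools reach wider industrial adoption.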
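Derived related work
Several research efforts have grown out of this dataset, mainly on model performance analysis and optimization strategies. For example, based on comparative analysis of the model variants in the dataset, researchers have proposed fine-tuning methods for code generation, multi-task learning frameworks, and efficiency improvement techniques. This work deepens understanding of the internal mechanisms of code generation models and has led to directions such as reflective coding and adaptive inference, providing a theoretical basis and practical examples for building more efficient and robust code intelligence systems.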
The content above was collected and summarized by 遇见数据集.



