ByteDance/FullStackBench
Source: Hugging Face | Updated: 2024-12-04 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/ByteDance/FullStackBench
Resource overview:
FullStack Bench is a multilingual benchmark dataset for evaluating large language models (LLMs) as full-stack programmers. It covers a wide range of application domains and 16 programming languages, with 3,000 (3K) test samples, and aims to push the limits of code LLMs in real-world code development scenarios. The dataset has two configurations, English and Chinese, each with features such as canonical_solution, content, id, and labels, plus a test set with various file types. It also includes SandboxFusion, a code sandbox execution tool for evaluating different programming tasks.
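The configurations and features described above could be inspected along these lines. This is a minimal sketch, not an official loader: the config names "en" and "zh" and the split name "test" are assumptions inferred from the description, and the helper names are hypothetical. The only library call used is `load_dataset` from the Hugging Face `datasets` package.

```python
from __future__ import annotations

from typing import Any, Iterable, Mapping

# Feature names taken from the dataset description above.
EXPECTED_FIELDS = ("canonical_solution", "content", "id", "labels")


def load_fullstack_bench(config: str = "en", split: str = "test"):
    """Load one configuration from the Hugging Face Hub.

    Assumption: the config names ("en"/"zh") and the split name ("test")
    are guesses based on the description, not confirmed by it.
    """
    # Imported lazily so the helpers below work without the library installed.
    from datasets import load_dataset

    return load_dataset("ByteDance/FullStackBench", config, split=split)


def missing_fields(record: Mapping[str, Any]) -> list[str]:
    """Return the expected feature names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def check_records(records: Iterable[Mapping[str, Any]]) -> dict[Any, list[str]]:
    """Map each incomplete record's id (or position) to its missing fields."""
    problems: dict[Any, list[str]] = {}
    for index, record in enumerate(records):
        missing = missing_fields(record)
        if missing:
            problems[record.get("id", index)] = missing
    return problems
```

With network access and the `datasets` package installed, `check_records(load_fullstack_bench("en"))` would report any samples lacking one of the listed features. An empty result would suggest that the schema matches the description.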
Provided by:
ByteDance
Dataset introduction

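Background and challenges
Background overview
FullStack Bench is a multilingual full-stack programming benchmark dataset covering 16 programming languages and 3K test samples, designed to evaluate code large language models in real-world development scenarios. The dataset also includes the SandboxFusion tool for evaluating code execution across different programming tasks.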