OSS-forge/HumanVsAICode

Name: OSS-forge/HumanVsAICode
Creator: OSS-forge
Published: 2025-12-17 12:43:31
License: No description available

Source: Hugging Face (updated 2025-12-17, indexed 2025-12-20)

Download link:

https://hf-mirror.com/datasets/OSS-forge/HumanVsAICode

Description:

This dataset is a large-scale collection of human-written and LLM-generated code designed to study differences in defect distribution, code quality, and security characteristics between human developers and modern AI code assistants. It contains paired implementations of the same function across multiple authorship sources, spanning Python and Java, two widely adopted programming languages with distinct typing systems, paradigms, and software engineering practices. The dataset supports research in code quality analysis, secure code generation, vulnerability detection, software engineering, program analysis, and evaluation of large language models for code.

Provider:

OSS-forge
