joshuasundance/mypo-4k-rfc
Hugging Face · Updated 2024-07-14 · Indexed 2024-07-13
Download link:
https://hf-mirror.com/datasets/joshuasundance/mypo-4k-rfc
Description:
The `mypo` dataset is a preview release of a DPO (Direct Preference Optimization) dataset focused on Python code quality. It is derived from `iamtarun/python_code_instructions_18k_alpaca` and has three main columns: `prompt` (the prompt from the original dataset), `rejected` (code from the original dataset that has linting errors), and `chosen` (the same code rewritten by `codellama/CodeLlama-7b-Python-hf` to address those errors). The goal is to use static code analysis to teach large language models (LLMs) to prefer the `chosen` output over the `rejected` output for a given `prompt`, and so improve the quality of generated code. The dataset is built in three steps: select the code samples from the original dataset that fail static analysis with tools such as `black`, `ruff`, and `mypy`; have the LLM rewrite those samples to correct the errors; and assemble the resulting pairs into the DPO dataset.
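The three-step construction described above can be sketched roughly as follows. This is a minimal illustration, not the dataset author's actual pipeline: the source column names `instruction` and `output` are assumed, `build_dpo_pairs` is a hypothetical helper, and the lint check and rewriter are toy stand-ins for the real tools (`black`, `ruff`, `mypy`) and the real model (`codellama/CodeLlama-7b-Python-hf`). Dropping rewrites that still fail the check is also an assumption; the description does not say how such cases are handled.

```python
from typing import Callable, Iterable


def build_dpo_pairs(
    rows: Iterable[dict],
    has_lint_errors: Callable[[str], bool],
    rewrite: Callable[[str], str],
) -> list[dict]:
    """Keep rows whose code fails linting and pair each with a rewritten version."""
    pairs = []
    for row in rows:
        code = row["output"]  # assumed name of the source code column
        if not has_lint_errors(code):
            continue  # clean code produces no preference pair
        fixed = rewrite(code)
        if has_lint_errors(fixed):
            continue  # assumption: discard rewrites that still fail the check
        pairs.append(
            {
                "prompt": row["instruction"],  # assumed name of the source prompt column
                "rejected": code,
                "chosen": fixed,
            }
        )
    return pairs


# Toy stand-in for the static analysis step: flag trailing whitespace only.
def toy_has_lint_errors(code: str) -> bool:
    return any(line != line.rstrip() for line in code.splitlines())


# Toy stand-in for the LLM rewrite step: strip trailing whitespace.
def toy_rewrite(code: str) -> str:
    return "\n".join(line.rstrip() for line in code.splitlines())


rows = [
    {"instruction": "Add two numbers.", "output": "def add(a, b):   \n    return a + b"},
    {"instruction": "Return one.", "output": "def one():\n    return 1"},
]
pairs = build_dpo_pairs(rows, toy_has_lint_errors, toy_rewrite)
```

Passing the checker and rewriter in as callables keeps the pairing logic independent of any particular linter or model, so the stand-ins can be swapped for real tool invocations without changing how the `prompt`, `rejected`, and `chosen` columns are assembled.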
Provider:
joshuasundance



