KrisPi/PythonTutor-Evol-1k-DPO-GPT4_vs_35

Name: KrisPi/PythonTutor-Evol-1k-DPO-GPT4_vs_35
Creator: KrisPi
Published: 2023-11-18 19:32:35
License: 暂无描述

Hugging Face2023-11-18 更新2024-03-04 收录

下载链接：

https://hf-mirror.com/datasets/KrisPi/PythonTutor-Evol-1k-DPO-GPT4_vs_35

下载链接

链接失效反馈

官方服务：

资源简介：

--- license: cc-by-nc-sa-4.0 language: - en size_categories: - n<1K --- Started with: https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1 (GPT-3.5 Turbo) Randomly selected 1000 where output contained "```python" in output Generated GPT-4 answers to those for the sake of LIMA-like "Python Tutor" Instruct fine-tuning as well as validate DPO Fine-Tuning (where GPT-4 answers will be preferred to GPT-3.5 Turbo) Then filtered refusals (looking for "impossible" or "sorry") GPT-4 System Prompt: You are an intelligent assistant that generates Python code. Start generation with ```python and end with ``` and nothing else. Just content between ```python and ```. The generated code should be wrapped in triple backticks and language identifier. Each line of code should be accompanied by a comment explaining it, and every function definition should be followed by a docstring describing the function, solution approach, and any edge cases considered. Try to wrap code in a function.

提供机构：

KrisPi

原始信息汇总

数据集概述

许可证

该数据集遵循 CC BY-NC-SA 4.0 许可证。

语言

数据集主要使用英语。

数据规模

数据集规模为 n<1K。

数据来源与处理

数据集起始于 nickrosh/Evol-Instruct-Code-80k-v1。
从原始数据集中随机选择了1000条输出包含 python 的样本。
使用 GPT-4 生成这些样本的答案，用于类似 LIMA 的 "Python Tutor" 指令微调以及验证 DPO 微调（其中 GPT-4 的答案将优于 GPT-3.5 Turbo）。
过滤了包含 "impossible" 或 "sorry" 的拒绝响应。

GPT-4 系统提示

作为智能助手生成 Python 代码，代码应包含在 python 和 `` 之间，不包含其他内容。
生成的代码应使用三重反引号和语言标识符包裹。
每行代码应附带注释解释，每个函数定义后应跟随描述函数、解决方案和考虑的边缘情况的文档字符串。
尽量将代码封装在函数中。

5,000+

优质数据集

54 个

任务类型

进入经典数据集