CodeExercise-Python-27k
ModelScope community, updated 2026-05-16, indexed 2024-05-15
Download link:
https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k
Resource overview:
# Dataset Card for CodeFuse-CodeExercise-Python-27k
<div align='center'>
<p align="center">
<img src="https://modelscope.cn/api/v1/models/codefuse-ai/CodeFuse-QWen-14B/repo?Revision=master&FilePath=LOGO.jpg&View=true" width="800"/>
</p>
[[中文]](#chinese) [[English]](#english)
</div>
<a id="english"></a>
## Dataset Description
This dataset consists of 27K Python programming exercises (in English), covering hundreds of Python-related topics including basic syntax and data structures, algorithm applications, database queries, machine learning, and more.
*Please note that this dataset was generated with the help of a teacher model and [Camel](https://arxiv.org/abs/2303.17760), and has not undergone strict validation. There may be errors or semantic duplicates in the questions or answers, so please use it with caution.*
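Because the questions may contain semantic duplicates, one cheap first pass is to flag pairs whose wording overlaps heavily. The sketch below is an illustration only: the `tokens`, `jaccard`, and `near_duplicates` helpers and the 0.8 threshold are assumptions, not part of the dataset tooling, and token overlap is a lexical proxy that will miss true paraphrases.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercase a question and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def near_duplicates(questions, threshold=0.8):
    """Return index pairs whose token overlap meets the threshold.

    Quadratic in the number of questions, so only suitable for small samples.
    """
    token_sets = [tokens(q) for q in questions]
    return [
        (i, j)
        for i, j in combinations(range(len(questions)), 2)
        if jaccard(token_sets[i], token_sets[j]) >= threshold
    ]


questions = [
    "Write a Python program to rotate an array by a given number of steps.",
    "Write a Python program to rotate an array by a given number of steps",
    "Write a Python function to reverse a string.",
]
print(near_duplicates(questions))
```

For the full 27K questions, a pairwise scan like this would be slow; it is meant for spot checks on a sample.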
## Field Structure
```
{
"data_name": #Dataset name,
"id": #Sample ID,
"chat_rounds": [
{
"role": "human",
"content": #instruction,
"chat_round_id": 0
},
{
"role": "bot",
"content": #output,
"chat_round_id": 1
}]
}
```
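A quick structural check can catch malformed records before training. The following is a minimal sketch, not an official validator: `validate_chat_rounds` is a hypothetical helper, and it assumes that turns alternate between `human` and `bot` and that `chat_round_id` counts up from 0, as in the structure shown above.

```python
def validate_chat_rounds(chat_rounds):
    """Return a list of problems found in a `chat_rounds` list (empty if none)."""
    problems = []
    expected_roles = ("human", "bot")  # assumed alternation, per the structure above
    for position, turn in enumerate(chat_rounds):
        missing = {"role", "content", "chat_round_id"} - set(turn)
        if missing:
            problems.append(f"turn {position}: missing keys {sorted(missing)}")
            continue
        if turn["role"] != expected_roles[position % 2]:
            problems.append(f"turn {position}: unexpected role {turn['role']!r}")
        if turn["chat_round_id"] != position:
            problems.append(f"turn {position}: chat_round_id is {turn['chat_round_id']}")
        if not isinstance(turn["content"], str) or not turn["content"].strip():
            problems.append(f"turn {position}: empty or non-string content")
    return problems


good = [
    {"role": "human", "content": "Write a Python program to rotate an array.", "chat_round_id": 0},
    {"role": "bot", "content": "Sure! Here's a Python program ...", "chat_round_id": 1},
]
bad = [
    {"role": "bot", "content": "", "chat_round_id": 0},
]
print(validate_chat_rounds(good))
print(validate_chat_rounds(bad))
```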
### Use
```python
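import ast

from modelscope import MsDataset

# Download the dataset from ModelScope and convert it to a Hugging Face Dataset.
dataset = MsDataset.load('codefuse-ai/CodeExercise-Python-27k').to_hf_dataset()
print(dataset)
# Parse the first record's `chat_rounds` with ast.literal_eval to get the list of turns.
print(ast.literal_eval(dataset[0]['chat_rounds']))
"""Out[0]
Dataset({
    features: ['data_name', 'id', 'chat_rounds'],
    num_rows: 27224
})
...
"""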
```
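For fine-tuning, each record usually needs to be reduced to an instruction and an answer. The sketch below shows one way to do that; `to_pair` is a hypothetical helper, and it assumes `chat_rounds` arrives either as a Python-literal string (which the `ast.literal_eval` call above suggests) or as an already parsed list with one `human` turn and one `bot` turn. A stand-in record is used so the example runs without downloading the dataset.

```python
import ast


def to_pair(record):
    """Return (instruction, answer) from one single-round record."""
    rounds = record["chat_rounds"]
    if isinstance(rounds, str):
        # Stored as text: recover the list of turn dictionaries.
        rounds = ast.literal_eval(rounds)
    by_role = {turn["role"]: turn["content"] for turn in rounds}
    return by_role["human"], by_role["bot"]


# Stand-in record; the values are made up for illustration.
record = {
    "data_name": "CodeExercise-Py",
    "id": 0,
    "chat_rounds": str([
        {"role": "human", "content": "Write a function that adds two numbers.", "chat_round_id": 0},
        {"role": "bot", "content": "def add(a, b):\n    return a + b", "chat_round_id": 1},
    ]),
}
instruction, answer = to_pair(record)
print(instruction)
print(answer)
```

With the real dataset, the same function could be applied to each row, for example `pairs = [to_pair(row) for row in dataset]`.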
### Examples
```
{
"data_name": "CodeExercise-Py",
"id": 27219,
"chat_rounds": [
{
"role": "human",
"content": "Write a Python program to rotate an array by a given number of steps.",
"chat_round_id": 0
},
{
"role": "bot",
"content": "Sure! Here's a Python program that rotates an array by a given number of steps.\n\n```python\ndef rotate_array(arr, steps):\n \"\"\"\n Rotate an array by the given number of steps.\n\n Parameters:\n arr (list): The input array.\n steps (int): The number of steps to rotate the array. Positive steps rotate it to the right,\n while negative steps rotate it to the left.\n\n Returns:\n list: The rotated array.\n\n Examples:\n >>> rotate_array([1, 2, 3, 4, 5], 2)\n [4, 5, 1, 2, 3]\n\n >>> rotate_array([1, 2, 3, 4, 5], -1)\n [2, 3, 4, 5, 1]\n \"\"\"\n arr_length = len(arr)\n steps = steps % arr_length # Handle steps larger than array length\n\n if steps == 0: # No rotation needed\n return arr\n\n rotated_arr = arr[-steps:] + arr[:-steps] # Split the list and concatenate in the desired order\n\n return rotated_arr\n\n\n# Example usage\narr = [1, 2, 3, 4, 5]\nsteps = 2\nrotated_arr = rotate_array(arr, steps)\nprint(rotated_arr)\n```\n\nThe `rotate_array()` function takes an input array `arr` and the number of steps `steps` to rotate the array. It first calculates the actual number of rotations by performing a modulo operation on `steps` with the length of the input array. This ensures that even if `steps` is larger than the array length, we always perform the appropriate number of rotations.\n\nIf `steps` is 0, the function returns the input array as no rotation is needed.\n\nOtherwise, the function splits the input array into two parts based on the number of steps: the last `steps` elements (using negative indexing) and the remaining elements. These two parts are then concatenated in the desired order to obtain the rotated array.\n\nFinally, the program demonstrates an example usage by rotating an array `[1, 2, 3, 4, 5]` by 2 steps (to the right) and printing the resulting rotated array.",
"chat_round_id": 1
}]
}
```
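Answers like the one above mix prose with a fenced Python block, and the dataset has not been strictly validated, so it can help to pull the code out and at least confirm that it parses. The following is a sketch under stated assumptions: `extract_code_blocks` and `is_valid_python` are hypothetical helpers, the answer is assumed to mark code with a `python` fence as in the sample, and a successful parse says nothing about whether the code is correct.

```python
import ast
import re

# The fence marker is built from pieces so this example can sit inside a Markdown fence.
FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(answer):
    """Return the bodies of all python-fenced blocks in an answer string."""
    return CODE_BLOCK.findall(answer)


def is_valid_python(source):
    """True if the source parses; this checks syntax only, not behavior."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Shortened stand-in for a bot answer, made up for illustration.
answer = (
    "Sure! Here's a Python program.\n\n"
    + FENCE + "python\n"
    + "def rotate_array(arr, steps):\n"
    + "    steps = steps % len(arr)\n"
    + "    return arr[-steps:] + arr[:-steps] if steps else arr\n"
    + FENCE + "\n\nThe function rotates the list."
)
blocks = extract_code_blocks(answer)
print(len(blocks), is_valid_python(blocks[0]))
```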
## Creation

* Step 1: Curate a seed set of Python knowledge points.
* Step 2: Embed each seed into a fixed task template to obtain a "Task Prompt" that asks a teacher model to generate exercise questions for the given knowledge point.
* Step 3: Use Camel to refine the "Task Prompt" from Step 2 so that its descriptions are more accurate and more diverse.
* Step 4: Feed the refined Task Prompt to the teacher model to generate exercise questions (instructions) for the knowledge point.
* Step 5: For each exercise question (instruction), leverage the teacher model to generate the corresponding answer.
* Step 6: Assemble each question with its answer and remove duplicates.
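
The steps above can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the template text, `build_task_prompt`, `assemble_and_dedupe`, and the `fake_teacher` stand-in are all hypothetical, the Camel refinement in Step 3 is skipped, and the card does not say how duplicates were detected, so exact matching on normalized question text is an assumption.

```python
# Hypothetical template; the real one is not given in the card.
TASK_TEMPLATE = (
    "Write a Python exercise question that tests the following knowledge point: "
    "{knowledge_point}"
)


def build_task_prompt(knowledge_point):
    """Step 2: embed one seed knowledge point into the fixed template."""
    return TASK_TEMPLATE.format(knowledge_point=knowledge_point)


def fake_teacher(prompt):
    """Stand-in for the teacher model (Steps 4 and 5); returns canned text."""
    return f"[generated for] {prompt}"


def assemble_and_dedupe(pairs):
    """Step 6: pair each question with its answer and drop repeated questions.

    Questions are compared after lowercasing and whitespace normalization
    (an assumed criterion).
    """
    seen = set()
    records = []
    for question, answer in pairs:
        key = " ".join(question.lower().split())
        if key in seen:
            continue
        seen.add(key)
        records.append({
            "id": len(records),
            "chat_rounds": [
                {"role": "human", "content": question, "chat_round_id": 0},
                {"role": "bot", "content": answer, "chat_round_id": 1},
            ],
        })
    return records


seeds = ["list slicing", "list slicing", "dictionary comprehension"]
questions = [fake_teacher(build_task_prompt(seed)) for seed in seeds]
pairs = [(question, fake_teacher(question)) for question in questions]
records = assemble_and_dedupe(pairs)
print(len(records))
```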
<a id="chinese"></a>
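## Introduction
This dataset consists of 27K Python programming exercises (in English), covering hundreds of Python-related knowledge points, including basic syntax and data structures, algorithm applications, database queries, and machine learning.
Note: This dataset was generated with the help of a teacher model and [Camel](https://arxiv.org/abs/2303.17760) and has not been strictly validated. The questions or answers may contain errors or semantic duplicates, so please use it with care.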
## Data Fields
```
{
"data_name": #Dataset name,
"id": #Sample ID,
"chat_rounds": [
{
"role": "human",
"content": #instruction,
"chat_round_id": 0
},
{
"role": "bot",
"content": #answer,
"chat_round_id": 1
}]
}
```
### Examples
```
{
"data_name": "CodeExercise-Py",
"id": 27219,
"chat_rounds": [
{
"role": "human",
"content": "Write a Python program to rotate an array by a given number of steps.",
"chat_round_id": 0
},
{
"role": "bot",
"content": "Sure! Here's a Python program that rotates an array by a given number of steps.\n\n```python\ndef rotate_array(arr, steps):\n \"\"\"\n Rotate an array by the given number of steps.\n\n Parameters:\n arr (list): The input array.\n steps (int): The number of steps to rotate the array. Positive steps rotate it to the right,\n while negative steps rotate it to the left.\n\n Returns:\n list: The rotated array.\n\n Examples:\n >>> rotate_array([1, 2, 3, 4, 5], 2)\n [4, 5, 1, 2, 3]\n\n >>> rotate_array([1, 2, 3, 4, 5], -1)\n [2, 3, 4, 5, 1]\n \"\"\"\n arr_length = len(arr)\n steps = steps % arr_length # Handle steps larger than array length\n\n if steps == 0: # No rotation needed\n return arr\n\n rotated_arr = arr[-steps:] + arr[:-steps] # Split the list and concatenate in the desired order\n\n return rotated_arr\n\n\n# Example usage\narr = [1, 2, 3, 4, 5]\nsteps = 2\nrotated_arr = rotate_array(arr, steps)\nprint(rotated_arr)\n```\n\nThe `rotate_array()` function takes an input array `arr` and the number of steps `steps` to rotate the array. It first calculates the actual number of rotations by performing a modulo operation on `steps` with the length of the input array. This ensures that even if `steps` is larger than the array length, we always perform the appropriate number of rotations.\n\nIf `steps` is 0, the function returns the input array as no rotation is needed.\n\nOtherwise, the function splits the input array into two parts based on the number of steps: the last `steps` elements (using negative indexing) and the remaining elements. These two parts are then concatenated in the desired order to obtain the rotated array.\n\nFinally, the program demonstrates an example usage by rotating an array `[1, 2, 3, 4, 5]` by 2 steps (to the right) and printing the resulting rotated array.",
"chat_round_id": 1
}]
}
```
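## Data Generation Process

* Step 1: Collect Python knowledge points as the initial seed set.
* Step 2: Embed each seed into a fixed task template to obtain a "Task Prompt"; the template prompts the teacher model to generate exercise questions for the given knowledge point.
* Step 3: Use Camel to polish the "Task Prompt" from Step 2 so that it is more accurate and more diverse.
* Step 4: Feed the resulting Task Prompt to the teacher model to generate exercise questions (instructions) for the knowledge point.
* Step 5: For each exercise question (instruction), use the teacher model to generate the corresponding answer.
* Step 6: Assemble each question with its answer and remove duplicates.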
Provider:
maas
Created:
2023-09-11



