Arjun-G-Ravi/Python-codes
Source: Hugging Face. Updated 2023-08-12; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Arjun-G-Ravi/Python-codes
Resource description:
---
license: mit
task_categories:
- text-generation
- text2text-generation
language:
- en
tags:
- code
pretty_name: Python codes dataset
size_categories:
- 10K<n<100K
---
# Dataset Card for Python codes dataset
Please note that this dataset may not be perfect: it may contain some non-Python code, although the quantity appears to be very small.
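One way a user might screen out such records is to check whether each `code` value parses as Python. The sketch below is an assumption about how this could be done, not part of the dataset's own tooling; the helper name `is_valid_python` and the sample records are hypothetical.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python syntax (hypothetical helper)."""
    try:
        ast.parse(source)
        return True
    except (SyntaxError, ValueError):
        # SyntaxError: not Python syntax; ValueError: e.g. null bytes on some versions.
        return False


# Illustrative records using the dataset's two features; not actual dataset rows.
records = [
    {"question": "Add two numbers", "code": "def add(a, b):\n    return a + b"},
    {"question": "Declare an integer", "code": "int x = 5;"},
    {"question": "Add two numbers", "code": "function add(a, b) { return a + b; }"},
]

# Keep only the records whose code parses as Python.
python_only = [r for r in records if is_valid_python(r["code"])]
```

This is only a coarse filter: some short snippets in other languages, such as `console.log("hi");`, also happen to be valid Python syntax and would pass the check.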
### Dataset Summary
The dataset contains a collection of Python questions and their corresponding code. It is meant to be used for training models to perform well on Python-specific coding tasks.
The dataset has two features - 'question' and 'code'.
An example is:
```
{'question': 'Create a function that takes in a string and counts the number of vowels in it',
 'code': 'def count_vowels(string):\n    vowels = ["a", "e", "i", "o", "u"]\n    count = 0\n    for char in string:\n        if char in vowels:\n            count += 1\n    return count'}
```
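As a rough illustration of how the two features might be combined for text-generation training, the sketch below joins one record into a single training string and checks that the stored code runs. The prompt template and the `format_example` name are assumptions made for this example; the dataset does not prescribe any particular format. The sample record mirrors the example above, with four-space indentation assumed.

```python
def format_example(record: dict) -> str:
    """Join one record into a single training string (hypothetical template)."""
    return f"### Question:\n{record['question']}\n\n### Code:\n{record['code']}\n"


# Sample record with the dataset's two features: 'question' and 'code'.
record = {
    "question": "Create a function that takes in a string and counts the number of vowels in it",
    "code": (
        "def count_vowels(string):\n"
        '    vowels = ["a", "e", "i", "o", "u"]\n'
        "    count = 0\n"
        "    for char in string:\n"
        "        if char in vowels:\n"
        "            count += 1\n"
        "    return count"
    ),
}

text = format_example(record)

# Sanity check: execute the stored code and call the function it defines.
namespace = {}
exec(record["code"], namespace)
vowel_count = namespace["count_vowels"]("dataset")  # 'a', 'a', 'e' are vowels
```

With the Hugging Face `datasets` library, records of this shape would typically be obtained through `load_dataset("Arjun-G-Ravi/Python-codes")`; the available split names are not stated in this card, so check them before indexing into the result.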
### Languages
English, Python
### Source Data
The dataset is derived from two other coding-related datasets:
1) sahil2801/CodeAlpaca-20k
2) neulab/conala
The CoNaLa dataset (neulab/conala) is described in:
@inproceedings{yin2018learning,
title={Learning to mine aligned code and natural language pairs from stack overflow},
author={Yin, Pengcheng and Deng, Bowen and Chen, Edgar and Vasilescu, Bogdan and Neubig, Graham},
booktitle={2018 IEEE/ACM 15th international conference on mining software repositories (MSR)},
pages={476--486},
year={2018},
organization={IEEE}
}
### Licensing Information
This dataset is released under the MIT license.
### Citation Information
Citation information will be added soon.
Provider:
Arjun-G-Ravi



