
Arjun-G-Ravi/Python-codes

Hugging Face | Updated: 2023-08-12 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/Arjun-G-Ravi/Python-codes
Resource description:
---
license: mit
task_categories:
- text-generation
- text2text-generation
language:
- en
tags:
- code
pretty_name: Python codes dataset
size_categories:
- 10K<n<100K
---

# Dataset Card for Dataset Name

Please note that this dataset may not be perfect and may contain a very small quantity of non-Python code.

### Dataset Summary

The dataset contains a collection of Python questions and their code. It is meant to be used for training models to be proficient in Python-specific coding. The dataset has two features: 'question' and 'code'.

An example is:

```
{'question': 'Create a function that takes in a string and counts the number of vowels in it', 'code': 'def count_vowels(string):\n    vowels = ["a", "e", "i", "o", "u"]\n    count = 0\n    for char in string:\n        if char in vowels:\n            count += 1\n    return count'}
```

### Languages

English, Python

### Source Data

The dataset is derived from two other coding-based datasets:

1) sahil2801/CodeAlpaca-20k
2) neulab/conala

    @inproceedings{yin2018learning,
      title={Learning to mine aligned code and natural language pairs from stack overflow},
      author={Yin, Pengcheng and Deng, Bowen and Chen, Edgar and Vasilescu, Bogdan and Neubig, Graham},
      booktitle={2018 IEEE/ACM 15th international conference on mining software repositories (MSR)},
      pages={476--486},
      year={2018},
      organization={IEEE}
    }

### Licensing Information

This dataset uses the MIT license.

### Citation Information

Will be added soon.
Provided by:
Arjun-G-Ravi
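Summary of original information

Dataset card

Dataset overview

This dataset contains a collection of Python questions and their corresponding code. It is intended for training models to improve their Python programming ability. The dataset has two features: question and code.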
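As an illustration of how the two features might be used for text-generation training, the sketch below joins a record's question and code into a single training string. The function name `to_training_text` and the prompt template are assumptions made for this example; the dataset card does not prescribe any format.

```python
# Hypothetical helper: the "### Question / ### Code" template is an
# illustrative assumption, not a format defined by the dataset.
def to_training_text(record: dict) -> str:
    """Join a record's 'question' and 'code' fields into one training string."""
    question = record["question"].strip()
    code = record["code"].rstrip()
    return f"### Question:\n{question}\n\n### Code:\n{code}\n"


# Sample record taken from the dataset card's own example.
sample = {
    "question": "Create a function that takes in a string and counts the number of vowels in it",
    "code": (
        "def count_vowels(string):\n"
        '    vowels = ["a", "e", "i", "o", "u"]\n'
        "    count = 0\n"
        "    for char in string:\n"
        "        if char in vowels:\n"
        "            count += 1\n"
        "    return count"
    ),
}

print(to_training_text(sample))
```

Keeping the template in one small function makes it easy to swap in whatever prompt format a given model expects.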

Example:

```json
{
  "question": "Create a function that takes in a string and counts the number of vowels in it",
  "code": "def count_vowels(string):\n    vowels = [\"a\", \"e\", \"i\", \"o\", \"u\"]\n    count = 0\n    for char in string:\n        if char in vowels:\n            count += 1\n    return count"
}
```
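The dataset card notes that a small number of records may contain non-Python code. One possible way to screen for these, sketched below, is to keep only records whose code field parses with Python's standard `ast` module. The helper name `is_valid_python` and the second, non-Python record are made up for illustration and do not come from the dataset.

```python
import ast


def is_valid_python(code: str) -> bool:
    """Return True if `code` parses as Python under the running interpreter."""
    try:
        ast.parse(code)
    except (SyntaxError, ValueError):
        return False
    return True


records = [
    # Record from the dataset card's example.
    {
        "question": "Create a function that takes in a string and counts the number of vowels in it",
        "code": (
            "def count_vowels(string):\n"
            '    vowels = ["a", "e", "i", "o", "u"]\n'
            "    count = 0\n"
            "    for char in string:\n"
            "        if char in vowels:\n"
            "            count += 1\n"
            "    return count"
        ),
    },
    # Made-up non-Python record, included only to show the filter rejecting it.
    {
        "question": "Print hello",
        "code": '#include <stdio.h>\nint main() { printf("hello"); }',
    },
]

# Keep only the records whose code field is syntactically valid Python.
python_only = [r for r in records if is_valid_python(r["code"])]
print(len(python_only))
```

This check is purely syntactic and depends on the interpreter version: a snippet written for Python 2 (for example, one using the `print` statement) would also be rejected under Python 3, so the filter may discard more than just non-Python code.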

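Languages

English, Python

Source data

This dataset is derived from two other coding-related datasets:

  1. sahil2801/CodeAlpaca-20k
  2. neulab/conala

Licensing information

This dataset uses the MIT license.

Citation information

Citation information will be added soon.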
