NuclearAi/Nuke-X-Glaive-Python-Dataset
Source: Hugging Face. Updated 2024-06-04. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/NuclearAi/Nuke-X-Glaive-Python-Dataset
Resource overview:
---
size_categories:
- 100K<n<1M
dataset_info:
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 421203804
    num_examples: 240888
  download_size: 203231858
  dataset_size: 421203804
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
task_categories:
- question-answering
- text-generation
- text2text-generation
tags:
- python
- synthetic
- code
license: apache-2.0
language:
- en
---
We're excited to announce the release of the [NuclearAi/Nuke-X-Glaive-Python-Dataset](https://huggingface.co/datasets/NuclearAi/Nuke-X-Glaive-Python-Dataset/), a **collection of 240,888 Python code examples** sourced from public datasets. Each example has `instruction`, `input`, and `output` string fields. The dataset is designed for fine-tuning and training LLMs for Python language understanding and generation.
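As a rough illustration of how such a dataset might be used for fine-tuning, the sketch below turns one record into a prompt and completion pair. The prompt template and the helper names `build_prompt` and `iter_training_pairs` are assumptions made for this example, not something the dataset prescribes. The streaming loader assumes the `datasets` library is installed and that the repository is reachable.

```python
from typing import Dict, Iterator, Tuple

DATASET_ID = "NuclearAi/Nuke-X-Glaive-Python-Dataset"


def build_prompt(record: Dict[str, str]) -> str:
    """Join instruction and optional input into one prompt (hypothetical template)."""
    instruction = record["instruction"].strip()
    # The input field may be empty or missing; treat both the same way.
    extra = (record.get("input") or "").strip()
    if extra:
        return f"### Instruction:\n{instruction}\n\n### Input:\n{extra}\n\n### Response:\n"
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def iter_training_pairs(limit: int = 3) -> Iterator[Tuple[str, str]]:
    """Stream a few (prompt, completion) pairs without downloading the full dataset."""
    # Imported lazily so build_prompt stays usable without the datasets library.
    from datasets import load_dataset

    stream = load_dataset(DATASET_ID, split="train", streaming=True)
    for i, record in enumerate(stream):
        if i >= limit:
            break
        yield build_prompt(record), record["output"]
```

Calling `iter_training_pairs()` needs network access, while `build_prompt` can be tried on any dictionary with the three fields.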
Provider:
NuclearAi
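Summary of original information
Dataset overview
Basic information
- Size category: 100K<n<1M
- Language: English (en)
- License: Apache-2.0
Dataset features
- Feature names: instruction, input, output
- Data types: all strings (string)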
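Since every record is expected to carry the same three string fields, a small check can catch malformed rows before training. The following is a minimal sketch. The function name `schema_problems` and the message wording are hypothetical, chosen for this example only.

```python
from typing import Any, List, Mapping

EXPECTED_FIELDS = ("instruction", "input", "output")


def schema_problems(record: Mapping[str, Any]) -> List[str]:
    """Return a list of human-readable schema problems; empty means the record is fine."""
    problems = []
    for name in EXPECTED_FIELDS:
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], str):
            problems.append(f"{name} is {type(record[name]).__name__}, expected str")
    # Report any field that the listed schema does not mention.
    for name in sorted(set(record) - set(EXPECTED_FIELDS)):
        problems.append(f"unexpected field: {name}")
    return problems
```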
Dataset splits
- Training set:
  - Number of examples: 240,888
  - Dataset size: 421,203,804 bytes
  - Download size: 203,231,858 bytes
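The split figures above give a quick sense of scale. As a small worked calculation using only those three numbers, the snippet below derives the average size of an example and the ratio of download size to dataset size. The variable names are illustrative.

```python
NUM_EXAMPLES = 240_888
DATASET_BYTES = 421_203_804
DOWNLOAD_BYTES = 203_231_858

# Average size of one example once the data is loaded.
avg_bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

# Fraction of the loaded size that actually has to be downloaded.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

print(f"average example size: {avg_bytes_per_example:.0f} bytes")
print(f"download / dataset size: {download_ratio:.1%}")
print(f"dataset size: {DATASET_BYTES / 2**20:.1f} MiB")
```

This works out to roughly 1,750 bytes per example, with the download being a little under half the size of the loaded dataset.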
Task categories
- Question answering (question-answering)
- Text generation (text-generation)
- Text-to-text generation (text2text-generation)
Tags
- Python
- Synthetic data (synthetic)
- Code (code)



