matlok/python-text-copilot-training-instruct-ai-research-2024-02-03
Dataset overview
Dataset name
2024-02-03 - python copilot instructions on how to code using alpaca and yaml
License
Other
Dataset configurations
- andromeda
  - Splits:
    - train
    - test
  - Data files:
    - train: train/train-0001-andromeda-andromeda_torch.parquet
    - test: test/train-0002-andromeda-tests.parquet
- swarms
  - Splits:
    - train
    - test
  - Data files:
    - train: train/train-0004-swarms-swarms.parquet
    - test: test/train-0005-swarms-tests.parquet
- swarms_pytorch
  - Splits:
    - train
    - test
  - Data files:
    - train: train/train-0006-swarms-pytorch-swarms_torch.parquet
    - test: test/train-0007-swarms-pytorch-tests.parquet
- longnet
  - Splits:
    - train
    - test
  - Data files:
    - train: train/train-0009-longnet-long_net.parquet
    - test: test/train-0010-longnet-tests.parquet
- zeta
  - Splits:
    - train
    - test
  - Data files:
    - train: train/train-0011-zeta-zeta.parquet
    - test: test/train-0012-zeta-tests.parquet
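The configuration list above maps each config and split to a single parquet file. As a minimal sketch, that mapping can be written as a dictionary and used to fetch one file directly. The helper names `data_file` and `read_split` are hypothetical, and the sketch assumes the listed paths are relative to the root of the dataset repository and that `huggingface_hub`, `pandas`, and a parquet engine such as `pyarrow` are installed.

```python
REPO_ID = "matlok/python-text-copilot-training-instruct-ai-research-2024-02-03"

# Config -> split -> parquet path, copied from the configuration list above.
DATA_FILES = {
    "andromeda": {
        "train": "train/train-0001-andromeda-andromeda_torch.parquet",
        "test": "test/train-0002-andromeda-tests.parquet",
    },
    "swarms": {
        "train": "train/train-0004-swarms-swarms.parquet",
        "test": "test/train-0005-swarms-tests.parquet",
    },
    "swarms_pytorch": {
        "train": "train/train-0006-swarms-pytorch-swarms_torch.parquet",
        "test": "test/train-0007-swarms-pytorch-tests.parquet",
    },
    "longnet": {
        "train": "train/train-0009-longnet-long_net.parquet",
        "test": "test/train-0010-longnet-tests.parquet",
    },
    "zeta": {
        "train": "train/train-0011-zeta-zeta.parquet",
        "test": "test/train-0012-zeta-tests.parquet",
    },
}


def data_file(config, split):
    """Look up the parquet path for a config and split, failing loudly on typos."""
    try:
        return DATA_FILES[config][split]
    except KeyError:
        raise KeyError(f"unknown config/split: {config!r}/{split!r}") from None


def read_split(config, split):
    """Download one parquet file and load it as a DataFrame (needs network access)."""
    # Imported lazily so the lookup helper works without these packages installed.
    import pandas as pd
    from huggingface_hub import hf_hub_download

    # Assumption: the path is relative to the root of the dataset repository.
    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=data_file(config, split),
        repo_type="dataset",
    )
    return pd.read_parquet(local_path)
```

Calling `read_split("zeta", "test")` would download only that one file, which may be convenient when a single split is needed rather than the full 2.1 GB dataset.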
Dataset size
1M < n < 10M
Tags
- python-copilot
- python-coding
- python-architecture
- knowledge-graphs
- multimodal
- text-image-audio
- fine-tuning
- training
- question-answering
- image-knowledge-graph
- alpaca
- mp3
- png
- text
- instruct
- coding
- task
- prompt
- response
- yaml
Supported task categories
- text-generation
- question-answering
Supported task IDs
- parsing
Dataset details
- Rows: 1182526
- Size: 2.1 GB
- Data type: instruct
- Format: introduction to code usage, using alpaca instructions and yaml responses
- Python repositories: 1258
Dataset loading examples
- Load the Andromeda train/test sets:

```python
from datasets import load_dataset

ds = load_dataset("matlok/python-text-copilot-training-instruct-ai-research-2024-02-03", "andromeda", verification_mode="no_checks")
```

- Load the Swarms train/test sets:

```python
from datasets import load_dataset

ds = load_dataset("matlok/python-text-copilot-training-instruct-ai-research-2024-02-03", "swarms", verification_mode="no_checks")
```

- Load the Swarms Pytorch train/test sets:

```python
from datasets import load_dataset

ds = load_dataset("matlok/python-text-copilot-training-instruct-ai-research-2024-02-03", "swarms_pytorch", verification_mode="no_checks")
```

- Load the LongNet train/test sets:

```python
from datasets import load_dataset

ds = load_dataset("matlok/python-text-copilot-training-instruct-ai-research-2024-02-03", "longnet", verification_mode="no_checks")
```

- Load the Zeta train/test sets:

```python
from datasets import load_dataset

ds = load_dataset("matlok/python-text-copilot-training-instruct-ai-research-2024-02-03", "zeta", verification_mode="no_checks")
```
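The five loading calls above differ only in the config name, so they can be driven from one helper. The sketch below is one possible way to do that and is not part of the original card: `load_args` and `peek` are hypothetical names, and `streaming=True` is an added assumption so that a few rows can be inspected without downloading the whole dataset.

```python
from itertools import islice

REPO_ID = "matlok/python-text-copilot-training-instruct-ai-research-2024-02-03"
CONFIGS = ["andromeda", "swarms", "swarms_pytorch", "longnet", "zeta"]


def load_args(config, split="train"):
    """Build load_dataset keyword arguments for one config and split."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    return {
        "path": REPO_ID,
        "name": config,
        "split": split,
        "verification_mode": "no_checks",  # same setting as the examples above
        "streaming": True,  # assumption: stream rows instead of downloading everything
    }


def peek(config, split="train", n=3):
    """Return the first n rows of a split as plain dicts (needs network access)."""
    # Imported lazily so load_args can be used without the datasets package.
    from datasets import load_dataset

    stream = load_dataset(**load_args(config, split))
    return list(islice(stream, n))


if __name__ == "__main__":
    for config in CONFIGS:
        first_rows = peek(config, n=1)
        print(config, sorted(first_rows[0]) if first_rows else "no rows")
```

Dropping the `streaming` entry from `load_args` should reproduce the behaviour of the original examples for a single split, at the cost of downloading that config's files up front.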
Dataset schema
- The desc column contains the alpaca instruction text and the yaml response.

```json
{
  "active": "bool",
  "args": "string",
  "args_len": "float64",
  "audio_file": "string",
  "audio_path": "string",
  "class_bases": "string",
  "class_name": "string",
  "code": "string",
  "code_len": "float64",
  "desc": "string",
  "desc_docstr": "string",
  "desc_docstr_len": "float64",
  "desc_len": "int64",
  "docstr": "string",
  "docstr_len": "int64",
  "file_path": "string",
  "file_type": "string",
  "function_names": "string",
  "gen_bytes": "int64",
  "gen_data_type": "string",
  "gen_mode": "string",
  "gen_size": "int64",
  "gen_valid": "bool",
  "height": "int64",
  "image_file": "string",
  "image_path": "string",
  "method_names": "string",
  "name": "string",
  "num_all_bases": "int64",
  "num_bases": "int64",
  "num_classes": "int64",
  "num_functions": "float64",
  "num_imports": "int64",
  "num_methods": "float64",
  "prompts": "string",
  "raises": "string",
  "raises_len": "float64",
  "recsize": "int64",
  "repo": "string",
  "returns": "string",
  "returns_len": "float64",
  "size": "int64",
  "src_object": "string",
  "total_objects": "int64",
  "usage": "string",
  "usages": "string",
  "width": "int64"
}
```
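Since the schema mixes text columns with numeric bookkeeping columns, it can help to select columns by declared type before fine-tuning. The following is a small sketch over an excerpt of the schema above. The helpers `columns_of_type` and `project` are hypothetical, and the sample row is made up purely for illustration.

```python
# Excerpt of the column -> dtype mapping listed in the schema above.
SCHEMA_EXCERPT = {
    "active": "bool",
    "code": "string",
    "code_len": "float64",
    "desc": "string",
    "desc_len": "int64",
    "file_path": "string",
    "name": "string",
    "repo": "string",
}


def columns_of_type(schema, dtype):
    """Return the column names whose declared dtype matches, in schema order."""
    return [name for name, declared in schema.items() if declared == dtype]


def project(row, columns):
    """Keep only the requested columns of a row, using None for missing ones."""
    return {name: row.get(name) for name in columns}


if __name__ == "__main__":
    # Made-up row for illustration only; real rows come from load_dataset.
    fake_row = {"name": "example", "repo": "example/repo", "desc": "placeholder"}
    text_columns = columns_of_type(SCHEMA_EXCERPT, "string")
    print(project(fake_row, text_columns))
```

The same two helpers should work unchanged with the full schema dictionary, for example to keep only `desc` and the other string columns from each row.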



