ChrisHayduk/Llama-2-SQL-and-Code-Dataset
Source: Hugging Face. Updated 2023-09-29; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/ChrisHayduk/Llama-2-SQL-and-Code-Dataset
Resource description:
---
dataset_info:
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  - name: table
    dtype: string
  splits:
  - name: train
    num_bytes: 46640417
    num_examples: 128351
  - name: eval
    num_bytes: 1756894
    num_examples: 1302
  download_size: 18298063
  dataset_size: 48397311
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: eval
    path: data/eval-*
---
# Dataset Card for "Llama-2-SQL-and-Code-Dataset"
This dataset is intended to improve LLaMA 2's coding and instruction-following capabilities, with a specific focus on SQL generation.
The dataset is in Alpaca Instruct format. Be sure to include both the instruction and the input in the prompt you give the model, along with any prompt text you would like to place around them.
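As a rough illustration of the prompt assembly described above, the sketch below formats one row into a prompt. The card does not prescribe any wrapper text, so the template here is an assumption: it follows the wording popularized by the Stanford Alpaca project. The function name `build_prompt` and the sample row are hypothetical.

```python
def build_prompt(example: dict) -> str:
    """Format one dataset row as an Alpaca-style prompt.

    Assumption: the wrapper text is the Stanford Alpaca template; the
    dataset card itself leaves the surrounding prompt text up to the user.
    """
    instruction = example["instruction"]
    context = example.get("input") or ""

    if context.strip():
        # Variant used when the row carries extra context in `input`.
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )

    # Variant used when `input` is empty.
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


# Hypothetical row, for illustration only (not taken from the dataset).
sample = {
    "instruction": "Write a SQL query that answers the question.",
    "input": "CREATE TABLE employees (name TEXT, salary INT)",
    "output": "SELECT name FROM employees",
}
print(build_prompt(sample))
```

During fine-tuning, the `output` field would typically be appended after the `### Response:` marker; at inference time the prompt is left open at that point.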
In the train split, ignore the table column. The eval split provides example tables so that generated SQL can actually be executed and its results compared across a number of SQL generation tasks.
To use a table, load it as a JSON object and pass it to a SQL execution tool such as sqlglot.
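A minimal sketch of that execution-based check follows. Two things here are assumptions rather than facts from the card. First, the layout of the `table` JSON is not documented, so the code assumes it decodes to a mapping from table name to a list of row dictionaries; inspect a real eval row before relying on this. Second, the standard-library `sqlite3` module is swapped in for sqlglot so the example runs without extra dependencies. The helper names `run_sql_on_table` and `same_result` are hypothetical.

```python
import json
import sqlite3


def run_sql_on_table(table_json: str, sql: str) -> list:
    """Load JSON-encoded tables into in-memory SQLite and run one query.

    Assumed layout: {"table_name": [{"col": value, ...}, ...], ...}
    """
    tables = json.loads(table_json)
    conn = sqlite3.connect(":memory:")
    try:
        for name, rows in tables.items():
            if not rows:
                continue  # no rows means no column names to create
            columns = list(rows[0].keys())
            col_list = ", ".join(f'"{c}"' for c in columns)
            placeholders = ", ".join("?" for _ in columns)
            conn.execute(f'CREATE TABLE "{name}" ({col_list})')
            conn.executemany(
                f'INSERT INTO "{name}" VALUES ({placeholders})',
                [tuple(row[c] for c in columns) for row in rows],
            )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def same_result(table_json: str, predicted_sql: str, reference_sql: str) -> bool:
    """Compare two queries by their result rows, ignoring row order."""
    predicted = run_sql_on_table(table_json, predicted_sql)
    reference = run_sql_on_table(table_json, reference_sql)
    return sorted(predicted, key=repr) == sorted(reference, key=repr)


# Hypothetical table and queries, for illustration only.
table_json = json.dumps(
    {"employees": [{"name": "Ada", "salary": 120}, {"name": "Bob", "salary": 90}]}
)
print(run_sql_on_table(table_json, "SELECT name FROM employees WHERE salary > 100"))
```

Comparing result sets rather than query strings lets two differently written but equivalent queries count as a match, which is the point of supplying executable tables in the eval split.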
Provider:
ChrisHayduk
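Summary of original information
Dataset overview
Dataset name
- Llama-2-SQL-and-Code-Dataset
Dataset purpose
- Improve the coding and instruction-following capabilities of LLaMA 2 models, with a particular focus on SQL generation.
Dataset format
- Alpaca Instruct format
Dataset features
- instruction (string)
- input (string)
- output (string)
- table (string)
Dataset splits
- train
  - Number of examples: 128351
  - Bytes: 46640417
- eval
  - Number of examples: 1302
  - Bytes: 1756894
Dataset size
- Download size: 18298063 bytes
- Dataset size: 48397311 bytes
Data file configuration
- default config
  - train data file path: data/train-*
  - eval data file path: data/eval-*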



