
ChrisHayduk/Llama-2-SQL-and-Code-Dataset

Source: Hugging Face. Updated: 2023-09-29. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/ChrisHayduk/Llama-2-SQL-and-Code-Dataset
Resource description:
---
dataset_info:
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  - name: table
    dtype: string
  splits:
  - name: train
    num_bytes: 46640417
    num_examples: 128351
  - name: eval
    num_bytes: 1756894
    num_examples: 1302
  download_size: 18298063
  dataset_size: 48397311
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: eval
    path: data/eval-*
---

# Dataset Card for "Llama-2-SQL-and-Code-Dataset"

This dataset is intended to give LLaMA 2 improved coding and instruction-following capabilities, with a specific focus on SQL generation. The dataset is in Alpaca Instruct format. Be sure to include the instruction and input in the prompt to the model, along with any prompt text you would like to place around them.

In the train split, ignore the table column. The eval split provides example tables so that actual executable SQL performance can be compared across a number of SQL generation tasks. To use the tables, load them as JSON objects and pass them to a SQL execution tool such as sqlglot.
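The card asks that both the instruction and the input reach the model, but leaves the surrounding prompt text to the user. A minimal sketch of one way to do that, assuming the widely used Alpaca-style template wording (the card does not prescribe it) and a hypothetical helper named `build_prompt`:

```python
# Alpaca-style templates. ASSUMPTION: this wording is a common convention,
# not something the dataset card mandates; swap in your own prompt text.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Format one row's `instruction` and `input` fields as a prompt string."""
    instruction = record["instruction"].strip()
    context = (record.get("input") or "").strip()
    if context:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=context)
    # Rows with an empty input fall back to the shorter template.
    return PROMPT_NO_INPUT.format(instruction=instruction)


# Hypothetical row, invented for illustration (not taken from the dataset).
example = {
    "instruction": "Write a SQL query that counts the rows in the table.",
    "input": "CREATE TABLE users (id INT, name TEXT)",
    "output": "SELECT COUNT(*) FROM users",
}
prompt = build_prompt(example)
print(prompt)
```

For fine-tuning, the row's `output` field would typically be appended after the `### Response:` marker; for inference, the prompt is passed as is.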
Provider:
ChrisHayduk
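Summary of original information

Dataset overview

Dataset name

  • Llama-2-SQL-and-Code-Dataset

Dataset purpose

  • Give the LLaMA 2 model improved coding and instruction-following capabilities, with a particular focus on SQL generation.

Dataset format

  • Alpaca Instruct format

Dataset features

  • instruction (string)
  • input (string)
  • output (string)
  • table (string)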

Dataset splits

  • train
    • Number of examples: 128351
    • Size in bytes: 46640417
  • eval
    • Number of examples: 1302
    • Size in bytes: 1756894
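The eval split exists so that generated SQL can be judged by executing it against the example table rather than by string matching. The sketch below illustrates that idea under two loud assumptions: it uses Python's built-in `sqlite3` module instead of sqlglot, and it assumes a hypothetical JSON layout for the `table` column (`name`, `columns`, `rows`); the real layout should be inspected before adapting this.

```python
import json
import sqlite3


def run_query(table_json: str, sql: str) -> list:
    """Load a table from JSON into an in-memory SQLite database and run `sql`.

    ASSUMPTION: the JSON has the hypothetical shape
    {"name": str, "columns": [str, ...], "rows": [[...], ...]}.
    """
    spec = json.loads(table_json)
    columns = ", ".join(f'"{c}"' for c in spec["columns"])
    placeholders = ", ".join("?" for _ in spec["columns"])
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f'CREATE TABLE "{spec["name"]}" ({columns})')
        conn.executemany(
            f'INSERT INTO "{spec["name"]}" VALUES ({placeholders})', spec["rows"]
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def same_result(table_json: str, predicted_sql: str, reference_sql: str) -> bool:
    """Return True when both queries yield the same rows, ignoring row order."""
    try:
        predicted = run_query(table_json, predicted_sql)
    except sqlite3.Error:
        return False  # SQL that fails to execute counts as a miss
    reference = run_query(table_json, reference_sql)
    return sorted(predicted, key=repr) == sorted(reference, key=repr)


# Hypothetical table and queries, invented for illustration.
table = json.dumps({
    "name": "players",
    "columns": ["name", "team", "points"],
    "rows": [["Ann", "Red", 12], ["Bob", "Blue", 7], ["Cy", "Red", 9]],
})
reference = "SELECT name FROM players WHERE team = 'Red'"
print(same_result(table, "SELECT name FROM players WHERE team = 'Red' ORDER BY points", reference))
print(same_result(table, "SELECT name FROM players WHERE team = 'Blue'", reference))
```

Comparing sorted result sets treats two queries as equivalent even when they return rows in a different order; drop the sorting if the task requires a specific `ORDER BY`.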

Dataset size

  • Download size: 18298063 bytes
  • Dataset size: 48397311 bytes
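The figures listed above can be cross-checked with a little arithmetic. The sketch below uses only numbers reported on this page; the interpretation in the comments is a guess, not something the card states.

```python
# All values are copied from the dataset metadata on this page.
train_bytes, train_examples = 46_640_417, 128_351
eval_bytes, eval_examples = 1_756_894, 1_302
download_size = 18_298_063
dataset_size = 48_397_311

# The two split sizes add up to the reported dataset size.
assert train_bytes + eval_bytes == dataset_size

# Ratio of the download size to the in-memory dataset size.
download_ratio = download_size / dataset_size

# Average bytes per example in each split. Eval rows are noticeably larger,
# which would be consistent with them carrying table data (an inference).
train_avg = train_bytes / train_examples
eval_avg = eval_bytes / eval_examples

print(f"download / dataset size: {download_ratio:.1%}")
print(f"average bytes per train example: {train_avg:.0f}")
print(f"average bytes per eval example: {eval_avg:.0f}")
```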

Data file configuration

  • default configuration
    • train data file path: data/train-*
    • eval data file path: data/eval-*
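With a single default configuration and two splits, the data can most likely be loaded through the Hugging Face `datasets` library by split name. This is a minimal sketch, assuming the `datasets` package is installed and the Hub is reachable; `load_split` and `load_train_without_table` are hypothetical helper names.

```python
REPO_ID = "ChrisHayduk/Llama-2-SQL-and-Code-Dataset"

# Split names and file patterns as listed in the default configuration.
SPLIT_FILES = {"train": "data/train-*", "eval": "data/eval-*"}


def load_split(split: str):
    """Load one split from the Hub (needs `pip install datasets` and network)."""
    if split not in SPLIT_FILES:
        raise ValueError(f"unknown split {split!r}; expected one of {sorted(SPLIT_FILES)}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


def load_train_without_table():
    """Load the train split and drop the `table` column, which the card says to ignore."""
    return load_split("train").remove_columns("table")
```

Calling `load_split("eval")` should return a dataset whose columns are `instruction`, `input`, `output`, and `table`, matching the feature list above.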