lleticiasilvaa/RAG_examples
Hugging Face | Updated 2024-05-17 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/lleticiasilvaa/RAG_examples
Resource description:
---
dataset_info:
  features:
  - name: db_id
    dtype: string
  - name: schema
    dtype: string
  - name: schemaComEx
    dtype: string
  - name: question
    dtype: string
  - name: query
    dtype: string
  - name: answer
    dtype: string
  - name: complexity
    dtype: string
  - name: distinct
    dtype: bool
  - name: like
    dtype: bool
  - name: between
    dtype: bool
  - name: order_by
    dtype: bool
  - name: limit
    dtype: bool
  - name: grouby_by
    dtype: bool
  - name: having
    dtype: bool
  - name: single_join
    dtype: bool
  - name: multiple_join
    dtype: bool
  - name: multiple_select
    dtype: bool
  - name: intersect
    dtype: bool
  - name: except
    dtype: bool
  - name: union
    dtype: bool
  splits:
  - name: train
    num_bytes: 8438563.958533753
    num_examples: 1746
  - name: test
    num_bytes: 43655293
    num_examples: 8828
  download_size: 1966408
  dataset_size: 52093856.95853375
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---
This dataset supports natural language processing tasks that involve SQL queries. Each example contains a database ID (`db_id`), the database schema (`schema`), an additional schema string (`schemaComEx`), a natural language question (`question`), the corresponding SQL query (`query`), the query result (`answer`), and a query complexity label (`complexity`). Thirteen boolean columns record which SQL constructs the query uses: DISTINCT, LIKE, BETWEEN, ORDER BY, LIMIT, GROUP BY (stored under the column name `grouby_by`), HAVING, a single join, multiple joins, multiple SELECTs, INTERSECT, EXCEPT, and UNION. The data is split into a training set of 1,746 examples and a test set of 8,828 examples, used for model training and performance evaluation, respectively.
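Because the SQL constructs are exposed as boolean columns, examples can be selected by the operations their queries use. The following is a minimal sketch, not the dataset author's tooling: `select_examples`, `load_split`, and the two sample rows are hypothetical names and data introduced for illustration, and `load_split` assumes the `datasets` library is installed and the Hub is reachable.

```python
from typing import Any, Dict, Iterable, List, Mapping


def select_examples(
    rows: Iterable[Mapping[str, Any]], flags: Mapping[str, bool]
) -> List[Mapping[str, Any]]:
    """Return the rows whose boolean columns match every requested flag."""
    return [
        row for row in rows
        if all(row.get(name) == wanted for name, wanted in flags.items())
    ]


def load_split(split: str = "train"):
    """Load one split from the Hugging Face Hub (requires `datasets` and network)."""
    from datasets import load_dataset  # imported lazily so the sketch runs offline
    return load_dataset("lleticiasilvaa/RAG_examples", split=split)


# Hypothetical rows for illustration only; they are not taken from the dataset.
sample_rows: List[Dict[str, Any]] = [
    {"question": "How many singers are there?",
     "query": "SELECT count(*) FROM singer",
     "order_by": False, "limit": False, "single_join": False},
    {"question": "Who is the oldest singer?",
     "query": "SELECT name FROM singer ORDER BY age DESC LIMIT 1",
     "order_by": True, "limit": True, "single_join": False},
]

# Keep only queries that use both ORDER BY and LIMIT.
top_k_style = select_examples(sample_rows, {"order_by": True, "limit": True})
print([row["question"] for row in top_k_style])
```

The flags are passed as a mapping rather than keyword arguments because `except` is a reserved word in Python and cannot be written as `except=True`. Rows returned by `load_split` are dict-like, so they can be passed to `select_examples` unchanged.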
Provided by:
lleticiasilvaa
Summary of original information
Dataset overview
Dataset features
- db_id: string
- schema: string
- schemaComEx: string
- question: string
- query: string
- answer: string
- complexity: string
- distinct: boolean
- like: boolean
- between: boolean
- order_by: boolean
- limit: boolean
- grouby_by: boolean
- having: boolean
- single_join: boolean
- multiple_join: boolean
- multiple_select: boolean
- intersect: boolean
- except: boolean
- union: boolean
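The feature list can be turned into a simple row check, which helps catch column-name slips such as writing `group_by` where the dataset uses `grouby_by`. This is a sketch under assumptions: `EXPECTED_TYPES` and `schema_problems` are hypothetical names, the placeholder row is invented, and the check assumes no column holds null values.

```python
from typing import Any, List, Mapping

STRING_FIELDS = [
    "db_id", "schema", "schemaComEx", "question", "query", "answer", "complexity",
]
BOOL_FIELDS = [
    "distinct", "like", "between", "order_by", "limit", "grouby_by", "having",
    "single_join", "multiple_join", "multiple_select", "intersect", "except", "union",
]

# Column name -> Python type expected for that column.
EXPECTED_TYPES = {
    **{name: str for name in STRING_FIELDS},
    **{name: bool for name in BOOL_FIELDS},
}


def schema_problems(row: Mapping[str, Any]) -> List[str]:
    """List missing or mistyped columns; extra columns are not reported."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in row:
            problems.append(f"missing field: {name}")
        elif not isinstance(row[name], expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(row[name]).__name__}"
            )
    return problems


# Hypothetical placeholder row with every column present and correctly typed.
valid_row = {**{n: "" for n in STRING_FIELDS}, **{n: False for n in BOOL_FIELDS}}

# Same row, but with the column renamed to the "corrected" spelling.
renamed_row = dict(valid_row)
renamed_row["group_by"] = renamed_row.pop("grouby_by")

print(schema_problems(valid_row))
print(schema_problems(renamed_row))
```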
Dataset splits
- Training set:
  - Size: 8438563.958533753 bytes
  - Number of examples: 1746
- Test set:
  - Size: 43655293 bytes
  - Number of examples: 8828
Dataset size
- Download size: 1966408 bytes
- Total dataset size: 52093856.95853375 bytes
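The reported sizes can be cross-checked with a little arithmetic: the two split sizes should add up to the total dataset size, and dividing the total by the download size gives a rough idea of how much smaller the downloaded files are. The variable names below are illustrative; the numbers are copied from the metadata.

```python
import math

# Values copied from the dataset metadata (bytes).
train_bytes = 8438563.958533753
test_bytes = 43655293
download_size = 1966408
dataset_size = 52093856.95853375

# The split sizes should sum to the reported total (compared with a tolerance
# because the values are floating-point).
splits_match_total = math.isclose(train_bytes + test_bytes, dataset_size)
print(splits_match_total)

# Ratio of the total dataset size to the download size.
size_ratio = dataset_size / download_size
print(f"{size_ratio:.1f}")
```

With these figures the split sizes do sum to the reported total, and the total is roughly 26 times the download size.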
Configuration
- Default configuration:
  - Training data path: data/train-*
  - Test data path: data/test-*
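The split paths are glob patterns, so every file under `data/` whose name starts with `train-` or `test-` belongs to the corresponding split. The sketch below illustrates that mapping with Python's standard `fnmatch` module. The file names are hypothetical, `split_for` is an illustrative helper, and the Hub's own pattern resolution may differ in details from `fnmatch`.

```python
from fnmatch import fnmatch
from typing import Optional

# Patterns copied from the default configuration.
SPLIT_PATTERNS = {"train": "data/train-*", "test": "data/test-*"}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches the path, or None if none does."""
    for split, pattern in SPLIT_PATTERNS.items():
        if fnmatch(path, pattern):
            return split
    return None


# Hypothetical file names for illustration; the real repository may differ.
files = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]
assignments = {name: split_for(name) for name in files}
print(assignments)
```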



