
TeeA/ViText2SQL_CoT_ChartGPT

Source: Hugging Face; updated 2024-05-15; indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/TeeA/ViText2SQL_CoT_ChartGPT
Resource description:
---
dataset_info:
- config_name: default
  features:
  - name: db_id
    dtype: string
  - name: query
    dtype: string
  - name: query_toks
    sequence: string
  - name: query_toks_no_value
    sequence: string
  - name: question
    dtype: string
  - name: question_toks
    sequence: string
  - name: sql
    dtype: string
  - name: schema
    dtype: string
  - name: gemini_response
    dtype: string
  - name: chatgpt_response
    dtype: string
  - name: chatgpt_cot
    dtype: string
  splits:
  - name: train
    num_bytes: 28103144
    num_examples: 6831
  - name: validation
    num_bytes: 3420032
    num_examples: 954
  - name: test
    num_bytes: 4680728
    num_examples: 1908
  download_size: 5441382
  dataset_size: 36203904
- config_name: word-level
  features:
  - name: db_id
    dtype: string
  - name: query
    dtype: string
  - name: query_toks
    sequence: string
  - name: query_toks_no_value
    sequence: string
  - name: question
    dtype: string
  - name: question_toks
    sequence: string
  - name: sql
    dtype: string
  - name: schema
    dtype: string
  - name: chatgpt_response
    dtype: string
  - name: chatgpt_cot
    dtype: string
  splits:
  - name: train
    num_bytes: 27997364
    num_examples: 6831
  - name: validation
    num_bytes: 3405221
    num_examples: 954
  - name: test
    num_bytes: 4651874
    num_examples: 1908
  download_size: 5456923
  dataset_size: 36054459
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
- config_name: word-level
  data_files:
  - split: train
    path: word-level/train-*
  - split: validation
    path: word-level/validation-*
  - split: test
    path: word-level/test-*
---

The dataset includes two configurations: default and word-level. Both contain the features db_id (database ID), query, query_toks (query tokens), query_toks_no_value (query tokens without values), question, question_toks (question tokens), sql, schema, chatgpt_response, and chatgpt_cot (ChatGPT's chain of thought, CoT). The default configuration additionally contains gemini_response, which the word-level configuration lacks. Each configuration is divided into training, validation, and test splits with 6831, 954, and 1908 examples respectively. The two configurations have identical example counts but differ slightly in total size (36203904 bytes for default versus 36054459 bytes for word-level) and in file paths: default is stored under data/ and word-level under word-level/.
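As an illustration of how the two configurations and three splits described here might be selected, the following is a minimal sketch using the Hugging Face `datasets` library. The helper name `load_split` is hypothetical, and the hub ID is assumed to match the path in the download link; the listing itself does not document a loading procedure.

```python
from typing import Any

# Assumption: the hub ID is the same as the path component of the download link.
DATASET_ID = "TeeA/ViText2SQL_CoT_ChartGPT"
CONFIGS = ("default", "word-level")
SPLITS = ("train", "validation", "test")


def load_split(config: str = "default", split: str = "train") -> Any:
    """Load one split of one configuration (hypothetical helper).

    Requires network access and the `datasets` package.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so argument validation works without the library installed.
    from datasets import load_dataset

    # The second positional argument of load_dataset is the configuration name.
    return load_dataset(DATASET_ID, config, split=split)
```

If the card metadata is current, `load_split("word-level", "validation")` should return a split with 954 rows, and its rows should have no `gemini_response` field.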
Provided by:
TeeA
Summary of original information

Dataset overview

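Configuration name: default

  • Features:

    • db_id: string
    • query: string
    • query_toks: sequence of strings
    • query_toks_no_value: sequence of strings
    • question: string
    • question_toks: sequence of strings
    • sql: string
    • schema: string
    • gemini_response: string
    • chatgpt_response: string
    • chatgpt_cot: string
  • Splits:

    • Training set: 6831 examples, 28103144 bytes
    • Validation set: 954 examples, 3420032 bytes
    • Test set: 1908 examples, 4680728 bytes
  • Download size: 5441382 bytes

  • Dataset size: 36203904 bytes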

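Configuration name: word-level

  • Features:

    • db_id: string
    • query: string
    • query_toks: sequence of strings
    • query_toks_no_value: sequence of strings
    • question: string
    • question_toks: sequence of strings
    • sql: string
    • schema: string
    • chatgpt_response: string
    • chatgpt_cot: string
  • Splits:

    • Training set: 6831 examples, 27997364 bytes
    • Validation set: 954 examples, 3405221 bytes
    • Test set: 1908 examples, 4651874 bytes
  • Download size: 5456923 bytes

  • Dataset size: 36054459 bytes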
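The figures listed for the two configurations can be cross-checked with a little arithmetic. The sketch below copies the numbers from this page into plain Python structures (the names `Split`, `METADATA`, `FEATURES`, and the helper functions are illustrative, not part of the dataset) and checks that the per-split byte counts add up to the stated dataset size and that the configurations differ by exactly one feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Split:
    num_examples: int
    num_bytes: int


# Values transcribed from the listing above.
METADATA = {
    "default": {
        "splits": {
            "train": Split(6831, 28103144),
            "validation": Split(954, 3420032),
            "test": Split(1908, 4680728),
        },
        "download_size": 5441382,
        "dataset_size": 36203904,
    },
    "word-level": {
        "splits": {
            "train": Split(6831, 27997364),
            "validation": Split(954, 3405221),
            "test": Split(1908, 4651874),
        },
        "download_size": 5456923,
        "dataset_size": 36054459,
    },
}

_COMMON = [
    "db_id", "query", "query_toks", "query_toks_no_value",
    "question", "question_toks", "sql", "schema",
]
FEATURES = {
    "default": _COMMON + ["gemini_response", "chatgpt_response", "chatgpt_cot"],
    "word-level": _COMMON + ["chatgpt_response", "chatgpt_cot"],
}


def total_bytes(config: str) -> int:
    """Sum of num_bytes over all splits of a configuration."""
    return sum(s.num_bytes for s in METADATA[config]["splits"].values())


def total_examples(config: str) -> int:
    """Sum of num_examples over all splits of a configuration."""
    return sum(s.num_examples for s in METADATA[config]["splits"].values())


def split_fractions(config: str) -> dict:
    """Fraction of examples that falls into each split."""
    total = total_examples(config)
    return {
        name: s.num_examples / total
        for name, s in METADATA[config]["splits"].items()
    }


for name, meta in METADATA.items():
    # The stated dataset size should equal the sum of the split sizes.
    assert total_bytes(name) == meta["dataset_size"], name
```

With these numbers each configuration holds 9693 examples in total, split roughly 70/10/20 across training, validation, and test, and `gemini_response` is the only feature present in default but absent from word-level.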
