agentlans/sql-text-collection
Source: Hugging Face | Updated: 2025-03-29 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/agentlans/sql-text-collection
Description:
This is a collection of publicly available text-to-SQL datasets merged into a single dataset. Each row contains a database schema (e.g., CREATE TABLE statements), a natural-language query or action to perform (in English), the corresponding SQL query, and the name of the original source dataset. The source datasets were merged, their SQL statements formatted, duplicate and blank rows removed, and the result split into training and test sets by stratified sampling. The collection is suited to training text-to-SQL models, benchmarking model performance across diverse SQL queries and domains, and research on semantic parsing and cross-domain generalization.
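The cleanup and splitting steps described above can be sketched roughly as follows. This is a minimal illustration, not the dataset author's actual pipeline: the column names (`schema`, `question`, `sql`, `source`), the choice of the source dataset as the stratification key, and the 20% test fraction are all assumptions, since the description does not state them.

```python
import random
from collections import defaultdict


def dedupe_rows(rows):
    """Drop rows with a blank field and exact duplicates, keeping first occurrences.

    Assumes each row is a dict with hypothetical keys
    'schema', 'question', 'sql', and 'source'.
    """
    seen = set()
    kept = []
    for row in rows:
        key = (row["schema"].strip(), row["question"].strip(), row["sql"].strip())
        # Skip rows where any field is empty, and rows already seen.
        if not all(key) or key in seen:
            continue
        seen.add(key)
        kept.append(row)
    return kept


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split rows into (train, test) so each source keeps roughly the same share.

    Stratifying by 'source' is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    by_source = defaultdict(list)
    for row in rows:
        by_source[row["source"]].append(row)

    train, test = [], []
    for source in sorted(by_source):
        group = list(by_source[source])
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

Splitting within each source group, rather than over the whole pool, keeps small source datasets represented in both the training and test sets.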
Provider:
agentlans



