AmanPriyanshu/reasoning-sft-synthetic_text_to_sql-128K
Hugging Face. Updated 2026-03-03; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/AmanPriyanshu/reasoning-sft-synthetic_text_to_sql-128K
Resource description:
---
license: apache-2.0
task_categories:
- question-answering
- table-question-answering
- text-generation
language:
- en
tags:
- synthetic
- SQL
- text-to-SQL
- code
- datadesigner
- reasoning
- sft
- chain-of-thought
size_categories:
- 100K<n<1M
---
# synthetic_text_to_sql (converted)
Converted version of [gretelai/synthetic_text_to_sql](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql), reformatted into 100,000 rows for reasoning SFT (supervised fine-tuning).
## Format
Each row has three columns:
- **`input`**: a list of two message dicts, `[{"role": "system", "content": "..."}, {"role": "user", "content": "..."}]`. The system prompt contains the database schema; the user prompt contains the natural language question.
- **`response`**: a string holding a `<think>` reasoning block (the SQL explanation) followed by the SQL query.
- **`domain`**: the SQL task type, one of the four values listed under Domains (for example, analytics and reporting or data manipulation).
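As a rough illustration of how a row in this format might be consumed, the sketch below splits a `response` string into its reasoning and SQL parts, using the layout given under Conversion (`<think>\n{explanation}\n</think>\n{sql}`). The helper name `split_response` and the example row are hypothetical, invented here for illustration; they are not taken from the dataset.

```python
import re

# Matches the documented layout: <think>\n{explanation}\n</think>\n{sql}
THINK_RE = re.compile(r"\A<think>\n(.*?)\n</think>\n(.*)\Z", re.DOTALL)


def split_response(response: str) -> tuple[str, str]:
    """Return (reasoning, sql) extracted from a response string."""
    match = THINK_RE.match(response)
    if match is None:
        raise ValueError("response does not follow the <think> layout")
    return match.group(1), match.group(2)


# Hypothetical row, made up for this example (not real dataset content).
example_row = {
    "input": [
        {"role": "system", "content": "CREATE TABLE sales (id INT, amount INT);"},
        {"role": "user", "content": "What is the total sales amount?"},
    ],
    "response": (
        "<think>\nSum the amount column.\n</think>\n"
        "SELECT SUM(amount) FROM sales;"
    ),
    "domain": "analytics and reporting",
}

reasoning, sql = split_response(example_row["response"])
print(reasoning)
print(sql)
```

Anchoring the pattern at both ends means a malformed response raises an error instead of being silently truncated, which may be preferable when filtering training data.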
## Domains
| Domain | Rows |
|--------|------|
| analytics and reporting | 88,186 |
| data manipulation | 9,665 |
| data retrieval | 1,309 |
| data definition | 840 |
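The counts in the table can be checked against the stated 100,000 rows, and converted to shares, with a few lines of arithmetic. This is only a sanity check on the numbers printed above; the variable names are illustrative.

```python
# Row counts copied from the Domains table.
domain_rows = {
    "analytics and reporting": 88_186,
    "data manipulation": 9_665,
    "data retrieval": 1_309,
    "data definition": 840,
}

total = sum(domain_rows.values())

# Percentage share of each domain, rounded to one decimal place.
shares = {name: round(100 * rows / total, 1) for name, rows in domain_rows.items()}

print(total)
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The split is heavily skewed toward analytics and reporting, which may matter if balanced coverage of task types is needed.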
## Conversion
- System prompt wraps `sql_context` (CREATE TABLE + INSERT statements) in a SQL expert template
- User prompt is the natural language `sql_prompt`
- Reasoning is `sql_explanation`, answer is the `sql` query
- Response format: `<think>\n{explanation}\n</think>\n{sql}`
- 100% conversion rate (no rows dropped)
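The mapping described in the list above could be sketched roughly as follows. This is not the author's conversion script: the wording of `SYSTEM_TEMPLATE` is a placeholder (the card does not give the actual template), `convert_row` is a hypothetical helper, and reading the task type from a source column named `sql_task_type` is an assumption about the original dataset's schema.

```python
# Placeholder wording: the actual SQL expert template is not given in the card.
SYSTEM_TEMPLATE = "You are a SQL expert. Use the following schema:\n\n{sql_context}"


def convert_row(source: dict) -> dict:
    """Map one source row to the input/response/domain layout."""
    system_prompt = SYSTEM_TEMPLATE.format(sql_context=source["sql_context"])
    return {
        "input": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": source["sql_prompt"]},
        ],
        # Documented response format: <think>\n{explanation}\n</think>\n{sql}
        "response": f"<think>\n{source['sql_explanation']}\n</think>\n{source['sql']}",
        # Assumption: the task type comes from a column named sql_task_type.
        "domain": source["sql_task_type"],
    }


# Hypothetical source row, made up for this example.
source_row = {
    "sql_context": "CREATE TABLE sales (id INT, amount INT);",
    "sql_prompt": "What is the total sales amount?",
    "sql_explanation": "Sum the amount column.",
    "sql": "SELECT SUM(amount) FROM sales;",
    "sql_task_type": "analytics and reporting",
}

converted = convert_row(source_row)
print(converted["response"])
```

Because every source row already carries all four fields, a mapping of this shape never needs to drop a row, which is consistent with the 100% conversion rate reported above.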
## License
Apache 2.0
## Credits
Original dataset: [gretelai/synthetic_text_to_sql](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql) by Gretel AI
Provided by:
AmanPriyanshu



