TRIAGESQL

Source: arXiv. Updated 2020-10-24. Indexed 2024-06-21.
Download link:
https://github.com/chatc/TriageSQL
Resource overview:
TRIAGESQL is the first cross-domain benchmark dataset for question intent classification in text-to-SQL. Created by Emory University and other institutions, it contains 390,000 question-database pairs. It is constructed from 20 existing datasets, which were revised and annotated to form a high-quality test set. The dataset is meant to help text-to-SQL systems handle different types of input, in particular to distinguish answerable questions from unanswerable ones. Its applications include improving the accuracy and robustness of text-to-SQL systems and handling real-world user inputs that cannot be answered with a SQL query.
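
The answerable/unanswerable distinction can be pictured as a gate placed in front of a text-to-SQL model. The following is a minimal sketch under stated assumptions, not the dataset's actual schema or the authors' method: the label names, `TriageResult`, `triage`, and the two toy stand-in functions are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical label names; the real dataset may use different ones.
ANSWERABLE = "answerable"
UNANSWERABLE = "unanswerable"


@dataclass
class TriageResult:
    label: str
    sql: Optional[str]  # None when the question is not passed on


def triage(
    question: str,
    schema: List[str],
    classify: Callable[[str, List[str]], str],
    to_sql: Callable[[str, List[str]], str],
) -> TriageResult:
    """Classify the question first; only answerable ones reach the SQL generator."""
    label = classify(question, schema)
    if label == ANSWERABLE:
        return TriageResult(label, to_sql(question, schema))
    return TriageResult(label, None)


def toy_classify(question: str, schema: List[str]) -> str:
    """Toy stand-in for a trained intent classifier (assumption):
    a question counts as answerable if it mentions any column name."""
    words = set(question.lower().replace("?", " ").split())
    columns = {column.lower() for column in schema}
    return ANSWERABLE if words & columns else UNANSWERABLE


def toy_to_sql(question: str, schema: List[str]) -> str:
    """Toy stand-in for a text-to-SQL model (assumption): returns a placeholder."""
    return "-- SQL placeholder for: " + question


if __name__ == "__main__":
    columns = ["name", "salary"]
    print(triage("What is the salary of each employee?", columns, toy_classify, toy_to_sql))
    print(triage("How are you today?", columns, toy_classify, toy_to_sql))
```

In practice `toy_classify` would be replaced by a classifier trained on question-database pairs such as those in this dataset; the gate itself stays the same.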
Provided by:
Emory University
Created:
2020-10-24