TrustSQL
Source: arXiv | Updated: 2024-04-16 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2403.15879v2
Resource description:
TrustSQL is a benchmark dataset for evaluating the reliability of text-to-SQL models, developed by the AI institute of the Korea Advanced Institute of Science and Technology (KAIST). It contains 20,083 examples covering both answerable and unanswerable questions, and is designed to evaluate models in both single-database and cross-database settings. For each question, a model must either output a SQL prediction or abstain, whether to avoid a likely error or because the question cannot be answered. The benchmark introduces a reliability score that rewards correct SQL predictions and correct identification of unanswerable questions, and penalizes incorrect SQL predictions and any attempt to generate SQL for an unanswerable question.
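The reward-and-penalty scheme described above can be illustrated with a minimal sketch. This is not the official TrustSQL scorer: the names `Outcome`, `score_one`, and `reliability_score` are hypothetical, the `penalty` value is a free parameter chosen here for illustration, and treating abstention on an answerable question as a neutral 0 is an assumption that the description above does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Outcome:
    """One evaluated question (hypothetical structure, for illustration only)."""
    answerable: bool                    # ground truth: can the question be answered?
    abstained: bool                     # True if the model declined to output SQL
    sql_correct: Optional[bool] = None  # only meaningful when SQL was produced


def score_one(outcome: Outcome, penalty: float) -> float:
    """Score a single outcome under the assumed reward/penalty scheme."""
    if outcome.abstained:
        # Abstaining on an unanswerable question is rewarded.
        # Abstaining on an answerable one is scored 0 (assumption).
        return 0.0 if outcome.answerable else 1.0
    if outcome.answerable and outcome.sql_correct:
        # Correct SQL for an answerable question is rewarded.
        return 1.0
    # Wrong SQL, or any SQL for an unanswerable question, is penalized.
    return -penalty


def reliability_score(outcomes: List[Outcome], penalty: float = 1.0) -> float:
    """Average the per-question scores over the evaluation set."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    return sum(score_one(o, penalty) for o in outcomes) / len(outcomes)


examples = [
    Outcome(answerable=True, abstained=False, sql_correct=True),    # rewarded
    Outcome(answerable=True, abstained=False, sql_correct=False),   # penalized
    Outcome(answerable=False, abstained=True),                      # rewarded
    Outcome(answerable=False, abstained=False),                     # penalized
    Outcome(answerable=True, abstained=True),                       # neutral (assumed)
]
print(reliability_score(examples, penalty=1.0))
```

Raising `penalty` makes wrong or unwarranted SQL more costly relative to abstaining, so a cautious model scores comparatively better. With `penalty=0` the score simply counts correct decisions.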
Provided by:
AI Institute, Korea Advanced Institute of Science and Technology (KAIST)
Created:
2024-03-24



