SQL-API-Bench

Name: SQL-API-Bench
Creator: maas
Published: 2025-12-05 11:38:58
License: No description available

ModelScope community; updated 2025-12-05; indexed 2025-11-03

Download link:

https://modelscope.cn/datasets/ibm-research/SQL-API-Bench

Resource description:

# Dataset Card for Dataset Name

This dataset contains QA that requires DB and API access at the same time. It is composed of two new benchmarks consisting of questions whose answers require a combination of database and API calls, both of which are augmentations of the popular Spider dataset and benchmark.

Benchmark I replaces a fraction of the real Spider database tables with equivalents that are executed via APIs. This allows us to directly test the mechanism by which database and API calls are combined without having to change the questions or their ground-truth answers from the original Spider benchmark.

Benchmark II introduces a new set of scalar APIs that perform simple lexical, numeric, or geo-spatial operations. From a subset of two dozen Spider databases, we transform questions from the original Spider database into new questions that require interleaving database operations with compositions of 1-3 scalar APIs. We establish a set of corresponding ground-truth answers through a semi-automated process that generates over 2300 human-vetted question/answer pairs.

## Dataset Card Contact

ekhabiri@us.ibm.com
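To make the two benchmark designs in the card concrete, here is a minimal sketch of what answering such questions might involve. Everything in it is an assumption made for illustration: the `singer` table, its rows, and the functions `get_concerts_api`, `to_upper_api`, and `char_count_api` are hypothetical and are not taken from the dataset or from Spider. The first helper mimics the Benchmark I idea (a table that is reachable only through an API is combined with a regular SQL query), and the second mimics the Benchmark II idea (a SQL result is passed through a composition of two scalar APIs).

```python
import sqlite3

# Hypothetical stand-in for a Spider-style database (rows invented for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    INSERT INTO singer VALUES
        (1, 'Ana Lima', 'Brazil'),
        (2, 'Bo Chen', 'China'),
        (3, 'Cy Dale', 'Brazil');
""")


def get_concerts_api(singer_id):
    """Hypothetical API standing in for a 'concert' table removed from the database."""
    rows = {
        1: [{"concert": "Rio Live", "year": 2014}],
        2: [{"concert": "Expo Night", "year": 2015},
            {"concert": "Winter Gala", "year": 2014}],
    }
    return rows.get(singer_id, [])


def to_upper_api(text):
    """Hypothetical lexical scalar API: upper-case a string."""
    return text.upper()


def char_count_api(text):
    """Hypothetical numeric scalar API: count the characters in a string."""
    return len(text)


def concerts_by_country(country):
    """Benchmark I style: filter singers in SQL, then fetch their concerts via the API."""
    singers = conn.execute(
        "SELECT singer_id, name FROM singer WHERE country = ? ORDER BY singer_id",
        (country,),
    ).fetchall()
    return [
        (name, row["concert"])
        for singer_id, name in singers
        for row in get_concerts_api(singer_id)
    ]


def longest_upper_name(country):
    """Benchmark II style: run SQL, then compose two scalar APIs over the result."""
    names = [
        row[0]
        for row in conn.execute("SELECT name FROM singer WHERE country = ?", (country,))
    ]
    return max((to_upper_api(name) for name in names), key=char_count_api)
```

A call such as `concerts_by_country("Brazil")` joins database rows with API rows in application code, while `longest_upper_name("Brazil")` applies the scalar APIs only after the SQL step has narrowed the candidates. A real system evaluated on the benchmark would have to plan that interleaving itself.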

Provider:

maas

Created:

2025-10-04
