birdsql/livesqlbench-base-full-v1
Source: Hugging Face | Updated: 2026-03-11 | Indexed: 2025-09-13
Download link:
https://hf-mirror.com/datasets/birdsql/livesqlbench-base-full-v1
Description:
LiveSQLBench is a dynamic, contamination-free benchmark for evaluating large language models on complex, real-world text-to-SQL tasks. It includes diverse real-world user queries, including Business Intelligence (BI) queries and CRUD operations. Each release features around 20 new, fully open-source databases curated by the BIRD team through expert collaboration and continuous improvement. The benchmark covers database sizes ranging from end-user level to industrial level, supports the full spectrum of SQL statements (SELECT, CREATE, UPDATE, and others), and provides automated evaluation as well as live and hidden tests.
Provided by:
birdsql



