hoadm/vispider
Source: Hugging Face. Updated: 2026-04-22. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/hoadm/vispider
Description:
ViSpider is a Vietnamese translation of the Spider Text-to-SQL benchmark (Yu et al., EMNLP 2018). The dataset targets table-question-answering and text-generation tasks, specifically text-to-SQL conversion from Vietnamese questions. It contains 8,659 training samples, 1,034 development samples, and 2,147 test samples, 11,840 in total. Each sample includes a unique identifier, a database identifier, the original English question, the gold SQL query, an SQL complexity class, the Vietnamese translation of the question, and the translation method. Three translation methods are listed: human translation (1,299 samples), GPT few-shot translation (2,165 samples), and translation by a fine-tuned Qwen2.5-7B model (5,195 samples). These three counts sum to 8,659, the size of the training split.
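The figures above can be cross-checked with a few lines of Python. This is a minimal sketch rather than part of the dataset's tooling: the dictionary keys (`train`, `dev`, `test`, `human`, `gpt_few_shot`, `qwen2.5_7b_finetuned`) are illustrative labels chosen here, not confirmed split or column values, and the commented-out loader only assumes that the repository can be read with the Hugging Face `datasets` library.

```python
# Counts copied from the description above; the keys are illustrative labels
# (assumptions), not confirmed split names or field values in the dataset.
SPLIT_SIZES = {"train": 8659, "dev": 1034, "test": 2147}
METHOD_COUNTS = {
    "human": 1299,
    "gpt_few_shot": 2165,
    "qwen2.5_7b_finetuned": 5195,
}

# Total number of samples across the three splits.
total_samples = sum(SPLIT_SIZES.values())

# Total number of samples that carry one of the listed translation methods.
translated_total = sum(METHOD_COUNTS.values())


def method_shares(counts):
    """Return each translation method's share of the listed samples."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


shares = method_shares(METHOD_COUNTS)

print(f"total samples: {total_samples}")
print(f"samples with a listed translation method: {translated_total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")

# Optional, needs network access and `pip install datasets` (assumption: the
# repository loads with its default configuration):
# from datasets import load_dataset
# vispider = load_dataset("hoadm/vispider")
```

Under these numbers the split sizes add up to the stated 11,840, and the method counts add up to 8,659, which suggests the per-method breakdown describes the training split only, in roughly a 15% / 25% / 60% ratio. The description does not say how the development and test questions were translated.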
Provider:
hoadm



