YF0808/tot-cwq-plan-sft-outputs34-rule-full-pw4-expand-labels-v2
Source: Hugging Face. Updated 2026-04-24; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/YF0808/tot-cwq-plan-sft-outputs34-rule-full-pw4-expand-labels-v2
Description:
This is a supervised fine-tuning (SFT) dataset for knowledge graph question answering (KGQA), targeting Freebase and Complex Web Questions (CWQ). It was generated by the local run `outputs34_rule_full_pw4_expand_labels_v2`, which used rule-based relation grouping, expanded sequence labels, and 4 parallel workers, with strict expand parity and nested expand labels enabled. The main improvement in this version is that nested Expand categories are rendered with labels and counts rather than category IDs alone. The dataset contains 521,450 rows, and 15.8% of the questions are flagged with warnings. The data comes from a local Freebase/Virtuoso graph backend.
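As a rough illustration of what rendering categories "with labels and counts" rather than bare category IDs could look like, here is a minimal Python sketch. The function name `render_expand_categories`, the ID strings, and the `label (count)` output format are all hypothetical; the listing does not document the dataset's actual schema or rendering format.

```python
from collections import Counter


def render_expand_categories(category_ids, labels):
    """Render category IDs as 'label (count)' strings, in first-seen order.

    Hypothetical helper: the real dataset's format is not documented here.
    IDs with no known label fall back to the raw ID.
    """
    counts = Counter(category_ids)  # preserves first-seen order of the IDs
    return [f"{labels.get(cid, cid)} ({n})" for cid, n in counts.items()]


# Made-up IDs and labels, for illustration only.
ids = ["c1", "c2", "c1", "c3", "c1"]
labels = {"c1": "actor", "c2": "person"}
print(render_expand_categories(ids, labels))
```

The point of the sketch is only that a readable label plus a count carries more information in the training text than an opaque ID would.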
Provider:
YF0808



