dki-lab/grail_qa

Name: dki-lab/grail_qa
Creator: dki-lab
Published: 2024-01-18 11:04:25
License: No description available

Source: Hugging Face; updated 2024-01-18; indexed 2024-06-15

Download link:

https://hf-mirror.com/datasets/dki-lab/grail_qa

Resource overview:

---
annotations_creators:
- crowdsourced
language_creators:
- found
language:
- en
license:
- unknown
multilinguality:
- monolingual
size_categories:
- 10K<n<100K
source_datasets:
- original
task_categories:
- question-answering
task_ids: []
paperswithcode_id: null
pretty_name: Grail QA
tags:
- knowledge-base-qa
dataset_info:
  features:
  - name: qid
    dtype: string
  - name: question
    dtype: string
  - name: answer
    sequence:
    - name: answer_type
      dtype: string
    - name: answer_argument
      dtype: string
    - name: entity_name
      dtype: string
  - name: function
    dtype: string
  - name: num_node
    dtype: int32
  - name: num_edge
    dtype: int32
  - name: graph_query
    struct:
    - name: nodes
      sequence:
      - name: nid
        dtype: int32
      - name: node_type
        dtype: string
      - name: id
        dtype: string
      - name: class
        dtype: string
      - name: friendly_name
        dtype: string
      - name: question_node
        dtype: int32
      - name: function
        dtype: string
    - name: edges
      sequence:
      - name: start
        dtype: int32
      - name: end
        dtype: int32
      - name: relation
        dtype: string
      - name: friendly_name
        dtype: string
  - name: sparql_query
    dtype: string
  - name: domains
    sequence: string
  - name: level
    dtype: string
  - name: s_expression
    dtype: string
  splits:
  - name: train
    num_bytes: 69433121
    num_examples: 44337
  - name: validation
    num_bytes: 9800544
    num_examples: 6763
  - name: test
    num_bytes: 2167256
    num_examples: 13231
  download_size: 17636773
  dataset_size: 81400921
---

# Dataset Card for Grail QA

## Table of Contents

- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** [Grail QA](https://dki-lab.github.io/GrailQA/)
- **Repository:**
- **Paper:** [GrailQA paper (Gu et al. '20)](https://arxiv.org/abs/2011.07743)
- **Leaderboard:**
- **Point of Contact:**

### Dataset Summary

#### What is GrailQA?

Strongly Generalizable Question Answering (GrailQA) is a new large-scale, high-quality dataset for question answering on knowledge bases (KBQA) on Freebase, with 64,331 questions annotated with both answers and corresponding logical forms in different syntaxes (i.e., SPARQL, S-expression, etc.). It can be used to test three levels of generalization in KBQA: i.i.d., compositional, and zero-shot.

#### Why GrailQA?

GrailQA is by far the largest crowdsourced KBQA dataset, with questions of high diversity (i.e., questions in GrailQA can have up to 4 relations and optionally have a function from counting, superlatives, and comparatives). It also has the highest coverage over Freebase; it widely covers 3,720 relations and 86 domains from Freebase. Last but not least, our meticulous data split allows GrailQA to test not only i.i.d. generalization, but also compositional generalization and zero-shot generalization, which are critical for practical KBQA systems.

### Supported Tasks and Leaderboards

[More Information Needed]

### Languages

English and graph query

## Dataset Structure

### Data Instances

[More Information Needed]

### Data Fields

- `qid` (`str`)
- `question` (`str`)
- `answer` (`List`): Defaults to `[]` in test split.
  - `answer_type` (`str`)
  - `answer_argument` (`str`)
  - `entity_name` (`str`): Defaults to `""` if `answer_type` is not `Entity`.
- `function` (`str`): Defaults to `""` in test split.
- `num_node` (`int`): Defaults to `-1` in test split.
- `num_edge` (`int`): Defaults to `-1` in test split.
- `graph_query` (`Dict`)
  - `nodes` (`List`): Defaults to `[]` in test split.
    - `nid` (`int`)
    - `node_type` (`str`)
    - `id` (`str`)
    - `class` (`str`)
    - `friendly_name` (`str`)
    - `question_node` (`int`)
    - `function` (`str`)
  - `edges` (`List`): Defaults to `[]` in test split.
    - `start` (`int`)
    - `end` (`int`)
    - `relation` (`str`)
    - `friendly_name` (`str`)
- `sparql_query` (`str`): Defaults to `""` in test split.
- `domains` (`List[str]`): Defaults to `[]` in test split.
- `level` (`str`): Only available in validation split. Defaults to `""` in others.
- `s_expression` (`str`): Defaults to `""` in test split.

**Notes:** Only `qid` and `question` are available in the test split.

### Data Splits

| Dataset Split | Number of Instances in Split |
|---------------|------------------------------|
| Train         | 44,337                       |
| Validation    | 6,763                        |
| Test          | 13,231                       |

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

[More Information Needed]

### Citation Information

[More Information Needed]

### Contributions

Thanks to [@mattbui](https://github.com/mattbui) for adding this dataset.

The listing adds a short description to each field of the dataset card:

- `qid` (`str`): unique identifier of the question.
- `question` (`str`): natural-language question text.
- `answer` (`List`): list of answers. Defaults to `[]` in the test split.
  - `answer_type` (`str`): type of the answer.
  - `answer_argument` (`str`): argument of the answer.
  - `entity_name` (`str`): name of the entity. Defaults to `""` if `answer_type` is not `Entity`.
- `function` (`str`): function used by the query. Defaults to `""` in the test split.
- `num_node` (`int`): number of nodes in the query graph. Defaults to `-1` in the test split.
- `num_edge` (`int`): number of edges in the query graph. Defaults to `-1` in the test split.
- `graph_query` (`Dict`): structure holding the graph query.
  - `nodes` (`List`): list of nodes. Defaults to `[]` in the test split.
    - `nid` (`int`): node ID.
    - `node_type` (`str`): type of the node.
    - `id` (`str`): unique identifier of the node.
    - `class` (`str`): class the node belongs to.
    - `friendly_name` (`str`): friendly name of the node.
    - `question_node` (`int`): flag marking the question node.
    - `function` (`str`): function attached to the node.
  - `edges` (`List`): list of edges. Defaults to `[]` in the test split.
    - `start` (`int`): ID of the node where the edge starts.
    - `end` (`int`): ID of the node where the edge ends.
    - `relation` (`str`): name of the relation.
    - `friendly_name` (`str`): friendly name of the relation.
- `sparql_query` (`str`): SPARQL query, under the field name used in the card's feature schema. Defaults to `""` in the test split.
- `domains` (`List[str]`): domains the question belongs to. Defaults to `[]` in the test split.
- `level` (`str`): generalization level of the question. Only available in the validation split; defaults to `""` in the others.
- `s_expression` (`str`): S-expression query. Defaults to `""` in the test split.

Provider:

dki-lab

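Summary of Original Information

Dataset Overview

Basic Dataset Information

Dataset name: Grail QA
Language: English
License: Unknown
Multilinguality: Monolingual
Dataset size: 10K<n<100K
Source data: Original
Task category: Question answering
Tags: Knowledge-base question answering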

Dataset Structure

Features

qid: string
question: string
answer: sequence
- answer_type: string
- answer_argument: string
- entity_name: string
function: string
num_node: 32-bit integer
num_edge: 32-bit integer
graph_query: struct
- nodes: sequence
  - nid: 32-bit integer
  - node_type: string
  - id: string
  - class: string
  - friendly_name: string
  - question_node: 32-bit integer
  - function: string
- edges: sequence
  - start: 32-bit integer
  - end: 32-bit integer
  - relation: string
  - friendly_name: string
sparql_query: string
domains: sequence of strings
level: string
s_expression: string
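The nested `graph_query` feature can be easier to work with once its nodes and edges are turned into per-item records. The sketch below is illustrative only. It assumes the `nodes` and `edges` sequences arrive as a dictionary of parallel lists, which is how the Hugging Face `datasets` library usually materializes a sequence of named sub-features; the helper names `rows_from_columns` and `adjacency` and the sample values are hypothetical and not taken from the dataset.

```python
from typing import Any, Dict, List, Tuple


def rows_from_columns(columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Turn {"a": [1, 2], "b": [3, 4]} into [{"a": 1, "b": 3}, {"a": 2, "b": 4}]."""
    if not columns:
        # Covers the empty placeholder described for the test split.
        return []
    keys = list(columns)
    length = len(columns[keys[0]])
    return [{key: columns[key][i] for key in keys} for i in range(length)]


def adjacency(graph_query: Dict[str, Any]) -> Dict[int, List[Tuple[str, int]]]:
    """Map each node id (nid) to its outgoing (relation, end nid) pairs."""
    nodes = rows_from_columns(graph_query["nodes"])
    edges = rows_from_columns(graph_query["edges"])
    out: Dict[int, List[Tuple[str, int]]] = {node["nid"]: [] for node in nodes}
    for edge in edges:
        out.setdefault(edge["start"], []).append((edge["relation"], edge["end"]))
    return out


# Hypothetical record fragment, invented for illustration (not real dataset content).
example_graph_query = {
    "nodes": {
        "nid": [0, 1],
        "node_type": ["class", "entity"],
        "id": ["example.type", "m.example"],
        "class": ["example.type", "example.other_type"],
        "friendly_name": ["Example type", "Example entity"],
        "question_node": [1, 0],
        "function": ["none", "none"],
    },
    "edges": {
        "start": [0],
        "end": [1],
        "relation": ["example.type.relation"],
        "friendly_name": ["Example relation"],
    },
}

graph = adjacency(example_graph_query)
print(graph)
```

If the records are instead stored as a list of per-node dictionaries, `rows_from_columns` is unnecessary and the lists can be passed to the loop directly.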

Data Splits

Training set: 44,337 instances
Validation set: 6,763 instances
Test set: 13,231 instances
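As a quick consistency check, the three split sizes can be summed and compared with the 64,331 questions quoted for the dataset. The snippet is a small arithmetic sketch; the variable names are illustrative.

```python
# Split sizes as listed for the dataset.
split_sizes = {"train": 44_337, "validation": 6_763, "test": 13_231}

total = sum(split_sizes.values())

# Share of each split, as a percentage rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in split_sizes.items()}

print(total)
print(shares)
```

The total matches the 64,331 questions stated in the dataset summary, so the three splits account for every question.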

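Dataset Creation

Dataset Summary

Grail QA is a new large-scale, high-quality dataset for knowledge base question answering (KBQA). It contains 64,331 questions over Freebase, each annotated with its answers and the corresponding logical forms (SPARQL, S-expression, and others). It can be used to test three levels of generalization in KBQA: i.i.d., compositional, and zero-shot.

Dataset Highlights

Grail QA is currently the largest crowdsourced KBQA dataset, and its questions are highly diverse (for example, a question can involve up to 4 relations and can optionally include a counting, superlative, or comparative function).
It has the highest coverage of Freebase, spanning 3,720 relations and 86 domains.
Its carefully designed data split lets Grail QA test not only i.i.d. generalization but also compositional and zero-shot generalization, which are critical for practical KBQA systems.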

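Collected Summary

Dataset Introduction

Construction

Grail QA was built through crowdsourcing and contains 64,331 high-quality questions, all over the Freebase knowledge base. Each question is annotated with its answers and with the corresponding logical forms, including SPARQL and S-expression. The dataset is designed to test how KBQA systems perform at different levels of generalization: i.i.d., compositional, and zero-shot.

Characteristics

Grail QA stands out for the high diversity of its questions and for its broad coverage of Freebase: 3,720 relations and 86 domains. Its carefully designed split also makes it possible to test i.i.d., compositional, and zero-shot generalization separately, which is critical for practical KBQA systems.

Usage

Grail QA can be used to train and evaluate KBQA systems, using the provided training, validation, and test sets. Each record includes the question ID, the question text, the answers, and the logical query forms, which supports detailed analysis and model development.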
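According to the dataset card, test-split records carry only `qid` and `question`, and the annotation fields hold placeholder defaults (`""`, `-1`, `[]`). A training or analysis pipeline may want to skip such records. The following is a minimal sketch under that assumption; the helper name `is_unlabeled` and both sample records are hypothetical.

```python
from typing import Any, Dict


def is_unlabeled(record: Dict[str, Any]) -> bool:
    """True when the record carries only the placeholder annotation values."""
    return (
        record.get("s_expression", "") == ""
        and record.get("sparql_query", "") == ""
        and record.get("num_node", -1) == -1
        and record.get("num_edge", -1) == -1
    )


# Hypothetical records, invented for illustration (not real dataset content).
labeled = {
    "qid": "example-1",
    "question": "example question with annotations",
    "s_expression": "(AND example.type (JOIN example.relation m.example))",
    "sparql_query": "SELECT ?x WHERE { ?x ?p ?o }",
    "num_node": 2,
    "num_edge": 1,
}
unlabeled = {
    "qid": "example-2",
    "question": "example question without annotations",
    "s_expression": "",
    "sparql_query": "",
    "num_node": -1,
    "num_edge": -1,
}

usable = [r["qid"] for r in (labeled, unlabeled) if not is_unlabeled(r)]
print(usable)
```

Checking several fields together, rather than one, keeps the filter from misfiring if a single annotation happens to be empty in an otherwise labeled record.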

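Background and Challenges

Background Overview

Grail QA was created by the DKI-Lab team to advance research on knowledge base question answering (KBQA). Released in 2020, it contains 64,331 high-quality question-answer pairs covering 3,720 relations and 86 domains of the Freebase knowledge base. Beyond the question-answer data, it provides queries in several logical forms, such as SPARQL and S-expression, so that KBQA systems can be tested at three levels: i.i.d., compositional generalization, and zero-shot generalization. Its release matters for KBQA research, particularly for testing how well systems generalize.

Current Challenges

Building Grail QA involved several challenges. First, the question-answer pairs had to be both diverse and high in quality, since this directly determines the dataset's practical and research value. Second, annotation involved complex logical forms such as SPARQL and S-expression, which demands considerable expertise from annotators. Third, the split had to be designed carefully so that it can effectively test different kinds of generalization, such as compositional and zero-shot generalization. Finally, handling and avoiding potential biases in the data, and keeping the dataset broadly applicable, were also important problems to address during construction.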

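Common Scenarios

Classic Use Cases

Grail QA is widely used in knowledge base question answering (KBQA), especially for testing how QA systems perform at different levels of generalization. By providing a large number of diverse question-answer pairs with their logical forms (SPARQL, S-expression, and others), it supports evaluating models under i.i.d., compositional generalization, and zero-shot generalization settings.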
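Evaluating at the three generalization levels amounts to grouping validation examples by their `level` field and scoring each group separately. The sketch below illustrates that grouping only. It assumes the level labels are the strings `"i.i.d."`, `"compositional"`, and `"zero-shot"`, and it scores by exact match of answer sets, which is a simplification and not necessarily the official metric. The function name `accuracy_by_level` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def accuracy_by_level(
    examples: Iterable[Tuple[str, Set[str], Set[str]]]
) -> Dict[str, float]:
    """Each example is (level, gold answer set, predicted answer set)."""
    correct: Dict[str, int] = defaultdict(int)
    seen: Dict[str, int] = defaultdict(int)
    for level, gold, predicted in examples:
        seen[level] += 1
        if gold == predicted:  # exact match on the whole answer set
            correct[level] += 1
    return {level: correct[level] / seen[level] for level in seen}


# Toy predictions, invented for illustration (not real dataset content).
toy = [
    ("i.i.d.", {"m.a"}, {"m.a"}),
    ("i.i.d.", {"m.b"}, {"m.c"}),
    ("compositional", {"m.d", "m.e"}, {"m.e", "m.d"}),
    ("zero-shot", {"3"}, set()),
]

scores = accuracy_by_level(toy)
print(scores)
```

Reporting one score per level, instead of a single aggregate, is what makes a gap between i.i.d. and zero-shot performance visible.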

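Practical Applications

In practice, Grail QA can be used to develop and tune knowledge-base QA systems for search engines, customer-service assistants, knowledge graph querying, and similar applications. By training and testing models on the dataset, organizations can build more capable and flexible QA systems that improve the user experience and speed up access to knowledge.

Derived and Related Work

The release of Grail QA has prompted a substantial body of follow-up research, particularly on evaluating and improving the generalization of KBQA systems. Many researchers have proposed new models and methods based on the dataset to handle challenges such as complex queries and cross-domain transfer. The dataset has also encouraged discussion of data-splitting strategies and generalization test standards in the KBQA field.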

The content above was collected and summarized by 遇见数据集.
