jiangjiechen/ekar_english

Hugging Face. Updated: 2023-01-11. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/jiangjiechen/ekar_english
Resource description:
---
language:
- en
license:
- afl-3.0
size_categories:
- 1K<n<2K
source_datasets:
- original
task_categories:
- question-answering
- text-generation
task_ids:
- analogical-qa
- explanation-generation
---

# Dataset Card for ekar_english

## Table of Contents
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)

## Dataset Description

- **Homepage:** https://ekar-leaderboard.github.io
- **Paper:** [E-KAR: A Benchmark for Rationalizing Natural Language Analogical Reasoning](https://aclanthology.org/2022.findings-acl.311)
- **Leaderboard:** https://eval.ai/web/challenges/challenge-page/1671/overview
- **Point of Contact:** jjchen19@fudan.edu.cn

### Dataset Summary

***New!*** (9/18/2022) E-KAR `v1.1` is officially released (at the `main` branch), **with a higher-quality English dataset!** In `v1.1`, we further improve the Chinese-to-English translation quality of the English E-KAR, with over 600 problems and over 1,000 explanations manually adjusted. You can still find the previous version (as in the paper) in the `v1.0` branch of the repo.
For more information, please refer to https://ekar-leaderboard.github.io.

The ability to recognize analogies is fundamental to human cognition. Existing benchmarks for testing word analogy do not reveal the underlying process of analogical reasoning in neural models. Holding the belief that models capable of reasoning should be right for the right reasons, we propose a first-of-its-kind Explainable Knowledge-intensive Analogical Reasoning benchmark (E-KAR). Our benchmark consists of 1,655 (in Chinese) and 1,251 (in English) problems sourced from the Civil Service Exams, which require intensive background knowledge to solve. More importantly, we design a free-text explanation scheme to explain whether an analogy should be drawn, and manually annotate explanations for each and every question and candidate answer. Empirical results suggest that this benchmark is very challenging for some state-of-the-art models on both the explanation generation and analogical question answering tasks, which invites further research in this area.

### Supported Tasks and Leaderboards

- `analogical-qa`: The dataset can be used to train a model for analogical reasoning in the form of multiple-choice QA.
- `explanation-generation`: The dataset can be used to generate free-text explanations to rationalize analogical reasoning.

This dataset supports two task modes, EASY mode and HARD mode:

- `EASY mode`: the query explanation can be used as part of the input.
- `HARD mode`: no explanation is allowed as part of the input.
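The difference between the two task modes can be sketched in a few lines. This is a minimal illustration, not the benchmark's official input format: the prompt layout, the `build_input` helper, and the `EXAMPLE` constant are assumptions made here for demonstration. The record is a trimmed copy of the sample in this card, and the query explanation is taken to be the first entry of `explanation`, as the field description states.

```python
from typing import Any, Dict

# Trimmed copy of the sample record shown in this card's Data Instances section.
EXAMPLE: Dict[str, Any] = {
    "question": "plant:coal",
    "choices": {
        "label": ["A", "B", "C", "D"],
        "text": [
            "white wine:aged vinegar",
            "starch:corn",
            "milk:yogurt",
            "pickled cabbage:cabbage",
        ],
    },
    "answerKey": "C",
    "explanation": ['"plant" is the raw material of "coal".'],
}


def build_input(example: Dict[str, Any], mode: str = "HARD") -> str:
    """Render one record as a multiple-choice prompt (hypothetical layout)."""
    if mode not in ("EASY", "HARD"):
        raise ValueError(f"unknown mode: {mode!r}")
    lines = [f"Query: {example['question']}"]
    if mode == "EASY":
        # EASY mode may include the query explanation (first `explanation` entry).
        lines.append(f"Query explanation: {example['explanation'][0]}")
    choices = example["choices"]
    for label, text in zip(choices["label"], choices["text"]):
        lines.append(f"{label}. {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_input(EXAMPLE, mode="EASY"))
    print()
    print(build_input(EXAMPLE, mode="HARD"))
```

In HARD mode the prompt carries only the query and the four candidates, so a model has to recover the query relation on its own.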
### Languages

This dataset is in English, translated from [its Chinese version](https://huggingface.co/datasets/Jiangjie/ekar_chinese/).

## Dataset Structure

### Data Instances

```json
{
  "id": "982f17-en",
  "question": "plant:coal",
  "choices": {
    "label": [
      "A",
      "B",
      "C",
      "D"
    ],
    "text": [
      "white wine:aged vinegar",
      "starch:corn",
      "milk:yogurt",
      "pickled cabbage:cabbage"
    ]
  },
  "answerKey": "C",
  "explanation": [
    "\"plant\" is the raw material of \"coal\".",
    "both \"white wine\" and \"aged vinegar\" are brewed.",
    "\"starch\" is made of \"corn\", and the order of words is inconsistent with the query.",
    "\"yogurt\" is made from \"milk\".",
    "\"pickled cabbage\" is made of \"cabbage\", and the word order is inconsistent with the query."
  ],
  "relation": [
    [["plant", "coal", "R3.7"]],
    [["white wine", "aged vinegar", "R2.4"]],
    [["corn", "starch", "R3.7"]],
    [["milk", "yogurt", "R3.7"]],
    [["cabbage", "pickled cabbage", "R3.7"]]
  ]
}
```

### Data Fields

- `id`: a string identifier for each example.
- `question`: query terms.
- `choices`: candidate answer terms.
- `answerKey`: correct answer.
- `explanation`: explanations for query (1st) and candidate answers (2nd-5th).
- `relation`: annotated relations for terms in the query (1st) and candidate answers (2nd-5th).

### Data Splits

| name |train|validation|test|
|:-----:|:---:|:--------:|:--:|
|default| 870| 119| 262|
|description| | | blinded |

## Dataset Creation

### Curation Rationale

[Needs More Information]

### Source Data

#### Initial Data Collection and Normalization

[Needs More Information]

#### Who are the source language producers?

[Needs More Information]

### Annotations

#### Annotation process

[Needs More Information]

#### Who are the annotators?

[Needs More Information]

### Personal and Sensitive Information

[Needs More Information]

## Considerations for Using the Data

### Social Impact of Dataset

The purpose of this dataset is to help develop analogical reasoning systems that are right for the right reasons.
### Discussion of Biases

This dataset is sourced and translated from the Civil Service Examinations of China. Therefore, despite the authors' efforts to remove or rewrite culture-specific problems, it may still contain information biased toward Chinese culture.

### Other Known Limitations

1. The explanation annotation process in E-KAR (not the EG task) is mostly post-hoc and reflects only the result of reasoning. Humans solve the analogy problems in a trial-and-error manner, i.e., adjusting the abduced source structure and trying to find the one best suited to all candidate answers. Therefore, such explanations cannot offer supervision for intermediate reasoning.
2. E-KAR only presents one feasible explanation for each problem, whereas there may be several.
3. The English version of E-KAR is machine-translated and post-edited by humans. Although the authors have tried their best to maintain the translation quality, there could be some unsatisfactory samples in the English dataset, e.g., culture-specific ones, ones that became ambiguous after translation, etc.

## Additional Information

### Dataset Curators

The dataset was initially created and curated by Jiangjie Chen (Fudan University, ByteDance), Rui Xu (Fudan University), Ziquan Fu (Brain Technologies, Inc.), Wei Shi (South China University of Technology), Xinbo Zhang (ByteDance), Changzhi Sun (ByteDance) and other colleagues at ByteDance and Fudan University.
### Licensing Information

[Needs More Information]

### Citation Information

```latex
@inproceedings{chen-etal-2022-e,
    title = "{E}-{KAR}: A Benchmark for Rationalizing Natural Language Analogical Reasoning",
    author = "Chen, Jiangjie and
      Xu, Rui and
      Fu, Ziquan and
      Shi, Wei and
      Li, Zhongqiao and
      Zhang, Xinbo and
      Sun, Changzhi and
      Li, Lei and
      Xiao, Yanghua and
      Zhou, Hao",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-acl.311",
    pages = "3941--3955",
}
```
Provider:
jiangjiechen
Summary of Original Information

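Dataset Overview

Dataset Name

  • Name: ekar_english

Dataset Summary

  • Version: v1.1
  • Description: The dataset contains 1,251 problems sourced from China's Civil Service Exams that require intensive background knowledge to solve. It supports the explanation generation and analogical question answering tasks, and is intended to test and improve models' natural language analogical reasoning.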

Supported Tasks and Leaderboards

  • Task 1: analogical question answering (analogical-qa)
  • Task 2: explanation generation (explanation-generation)
  • Task modes: EASY mode and HARD mode

Languages

  • Language: English
  • Source: translated from the Chinese version

Dataset Structure

Data Instances
  • Example structure (JSON): { "id": "string", "question": "string", "choices": { "label": ["A", "B", "C", "D"], "text": ["string"] }, "answerKey": "string", "explanation": ["string"], "relation": [["string"]] }
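Data Fields
  • id: unique identifier of the example
  • question: query terms
  • choices: candidate answer terms
  • answerKey: correct answer
  • explanation: explanations for the query and the candidate answers
  • relation: annotated relations for the query and the candidate answers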
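The positional convention of the fields listed above (the first entry of `explanation` and `relation` belongs to the query, the next four to candidates A-D) can be made explicit with a small helper. The sketch below uses the sample record from the dataset card. The names `align_record` and `follows_query_relation` are hypothetical, and reading each relation triple as `[first term, second term, relation id]` is an assumption based on that single sample, not a documented schema.

```python
from typing import Any, Dict, List

# Sample record copied from the dataset card (the "id" field is omitted).
EXAMPLE: Dict[str, Any] = {
    "question": "plant:coal",
    "choices": {
        "label": ["A", "B", "C", "D"],
        "text": ["white wine:aged vinegar", "starch:corn",
                 "milk:yogurt", "pickled cabbage:cabbage"],
    },
    "answerKey": "C",
    "explanation": [
        '"plant" is the raw material of "coal".',
        'both "white wine" and "aged vinegar" are brewed.',
        '"starch" is made of "corn", and the order of words is inconsistent with the query.',
        '"yogurt" is made from "milk".',
        '"pickled cabbage" is made of "cabbage", and the word order is inconsistent with the query.',
    ],
    "relation": [
        [["plant", "coal", "R3.7"]],
        [["white wine", "aged vinegar", "R2.4"]],
        [["corn", "starch", "R3.7"]],
        [["milk", "yogurt", "R3.7"]],
        [["cabbage", "pickled cabbage", "R3.7"]],
    ],
}


def align_record(example: Dict[str, Any]) -> Dict[str, Any]:
    """Attach each explanation and relation list to the query or its candidate."""
    query = {
        "text": example["question"],
        "explanation": example["explanation"][0],  # index 0 is the query
        "relation": example["relation"][0],
    }
    candidates: List[Dict[str, Any]] = []
    choices = example["choices"]
    # Indices 1-4 correspond to candidates A-D, in order.
    for i, (label, text) in enumerate(zip(choices["label"], choices["text"]), start=1):
        candidates.append({
            "label": label,
            "text": text,
            "explanation": example["explanation"][i],
            "relation": example["relation"][i],
        })
    return {"query": query, "candidates": candidates}


def follows_query_relation(query: Dict[str, Any], candidate: Dict[str, Any]) -> bool:
    """Heuristic: same relation id as the query and same term order as written."""
    query_ids = {triple[2] for triple in query["relation"]}
    terms = candidate["text"].split(":")
    return any(
        rel_id in query_ids and terms == [first, second]
        for first, second, rel_id in candidate["relation"]
    )


if __name__ == "__main__":
    aligned = align_record(EXAMPLE)
    for cand in aligned["candidates"]:
        print(cand["label"], follows_query_relation(aligned["query"], cand))
```

On this one sample, the heuristic singles out the candidate marked by `answerKey`. That only illustrates how the annotations line up. It is not a solver, and nothing here shows that it generalizes to other records.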
Data Splits
  • Splits: 870 training instances, 119 validation instances, 262 test instances
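As a quick sanity check, the three split sizes should add up to the 1,251 English problems stated in the summary. The short sketch below verifies this and reports each split's share. The `SPLIT_SIZES` constant and `split_shares` helper are names introduced here for illustration only.

```python
from typing import Dict

# Split sizes as listed for this dataset.
SPLIT_SIZES: Dict[str, int] = {"train": 870, "validation": 119, "test": 262}


def split_shares(sizes: Dict[str, int]) -> Dict[str, float]:
    """Return each split's fraction of the total number of instances."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total = sum(SPLIT_SIZES.values())
shares = split_shares(SPLIT_SIZES)

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name:<10} {SPLIT_SIZES[name]:>5} {share:6.1%}")
    print(f"{'total':<10} {total:>5}")
```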

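Dataset Creation

Source Data
  • Source: China's Civil Service Exams
  • Translation: machine-translated, then post-edited by humans
Annotations
  • Annotation process: not specified
  • Annotators: not specified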

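Considerations for Using the Data

Social Impact
  • Purpose: to help develop analogical reasoning systems that are right for the right reasons
Discussion of Biases
  • Bias: may contain information biased toward Chinese culture
Other Known Limitations
  • Limitation 1: the explanation annotation process is mostly post-hoc and cannot offer supervision for intermediate reasoning
  • Limitation 2: only one feasible explanation is given for each problem, although several may exist
  • Limitation 3: the English version may have translation quality issues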

Additional Information

Dataset Curators
  • Curators: Jiangjie Chen, Rui Xu, Ziquan Fu, Wei Shi, Xinbo Zhang, Changzhi Sun, and others
Licensing Information
  • License: not specified
Citation Information
  • Citation format (BibTeX): @inproceedings{chen-etal-2022-e, title = "{E}-{KAR}: A Benchmark for Rationalizing Natural Language Analogical Reasoning", author = "Chen, Jiangjie and Xu, Rui and Fu, Ziquan and Shi, Wei and Li, Zhongqiao and Zhang, Xinbo and Sun, Changzhi and Li, Lei and Xiao, Yanghua and Zhou, Hao", booktitle = "Findings of the Association for Computational Linguistics: ACL 2022", month = may, year = "2022", address = "Dublin, Ireland", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2022.findings-acl.311", pages = "3941--3955", }