abacusai/WikiQA-Altered_Numeric_QA

Name: abacusai/WikiQA-Altered_Numeric_QA
Creator: abacusai
Published: 2024-01-17 13:14:42
License: Apache-2.0

Source: Hugging Face. Updated 2024-01-17; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/abacusai/WikiQA-Altered_Numeric_QA

Description:

---
license: apache-2.0
configs:
- config_name: default
  data_files:
  - split: 2k
    path: data/2k-*
  - split: 4k
    path: data/4k-*
  - split: 8k
    path: data/8k-*
  - split: 16k
    path: data/16k-*
dataset_info:
  features:
  - name: conversations
    list:
    - name: from
      dtype: string
    - name: tok_len
      dtype: int64
    - name: value
      dtype: string
  splits:
  - name: 2k
    num_bytes: 2802096
    num_examples: 456
  - name: 4k
    num_bytes: 5492874
    num_examples: 456
  - name: 8k
    num_bytes: 10884816
    num_examples: 456
  - name: 16k
    num_bytes: 19884934
    num_examples: 456
  download_size: 8163043
  dataset_size: 39064720
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c14f6b02e1f8f67c73bd05/_Z4fNfPl_Ix_gGT5Yoi0J.png)

# Dataset Card for "WikiQA-Altered_Numeric_QA"

The WikiQA task is to answer a question based on the information given in a Wikipedia document. We built on the short-answer format data in Google Natural Questions to construct our QA task. Each example is formatted as a document and a question. We ensure that the answer to the question is a short answer, either a single word or a short sentence copied verbatim from the document. With the task structured this way, we can pinpoint exactly where the LLM was supposed to "look" for the answer in the context, and thus effectively evaluate every part of the expanded context length by carefully placing the answer in different locations.

We selected large Wikipedia documents and truncated them to get multiple versions of the same document, with sizes varying from 2,000 to 16,000 tokens. For each document size, we also have multiple versions that place the question and the answer text at different locations, i.e., whether the answer occurs in the first 10%, the bulk, or the last 10% of the document. Having multiple versions of the same document allows an exhaustive and fair evaluation across model sizes, and across context positions within one model, since we are intrinsically asking for the same information.
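The three answer locations mentioned above (first 10%, bulk, last 10%) can be illustrated with a small helper. This is only a sketch: the function name `position_bucket` and the choice to measure the position by the token offset at which the answer starts are assumptions made for illustration, not something the dataset card specifies.

```python
def position_bucket(answer_start: int, doc_len: int) -> str:
    """Classify where an answer starts within a document of doc_len tokens.

    Hypothetical helper: the card names the three regions but does not say
    how the position is measured, so the start offset is an assumption.
    """
    if doc_len <= 0 or not 0 <= answer_start < doc_len:
        raise ValueError("answer_start must lie inside the document")
    # Integer arithmetic avoids floating-point edge cases at the boundaries.
    if answer_start * 10 < doc_len:
        return "first 10%"
    if answer_start * 10 >= doc_len * 9:
        return "last 10%"
    return "bulk"
```

For a 2,000-token document, an answer starting at token 100 would fall in the first 10%, and one starting at token 1,900 in the last 10%.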
A potential issue with a Wikipedia-based dataset is that the model could answer correctly from its pretraining corpus rather than from the context. To address this, we created another “altered” dataset. It consists only of questions that have numerical answers. Here, we change the answer and every occurrence of the answer in the document to a different number, so that if the LLM recalls the answer from its pretraining corpus, it gives a wrong answer. The modification is made as follows:

- If the answer is a year (i.e., between 1000 and 2100), which is quite frequent, we change it to a different random value within +/- 10 of the original value. We treat years as a special case so as not to make the document absurd by scrambling chronological information.
- If the answer is any other number, we change it to a different random number that has the same number of digits.

We call our original QA task [Free Form QA (FFQA)](https://huggingface.co/datasets/abacusai/WikiQA-Free_Form_QA) and the altered task Altered Numeric QA (AltQA).
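The alteration rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual code: the function names `alter_number` and `alter_document`, the restriction to plain non-negative integer strings, and the use of digit boundaries when replacing occurrences are all assumptions.

```python
import random
import re
from typing import Tuple


def alter_number(answer: str, rng: random.Random) -> str:
    """Return a different number following the two rules in the text.

    Assumption: the answer is a plain non-negative integer string.
    """
    if not re.fullmatch(r"[0-9]+", answer):
        raise ValueError("this sketch only handles plain integer answers")
    value = int(answer)
    if 1000 <= value <= 2100:
        # Year: a different value within +/- 10 of the original.
        offset = rng.choice([d for d in range(-10, 11) if d != 0])
        return str(value + offset)
    # Any other number: a different random number with the same digit count.
    digits = len(answer)
    low = 0 if digits == 1 else 10 ** (digits - 1)
    high = 10 ** digits - 1
    while True:
        candidate = rng.randint(low, high)
        if candidate != value:
            return str(candidate)


def alter_document(document: str, answer: str,
                   rng: random.Random) -> Tuple[str, str]:
    """Replace the answer throughout the document; return (document, answer)."""
    new_answer = alter_number(answer, rng)
    # Assumption: match the answer only as a standalone number, so that
    # "1887" inside "18870" is left untouched.
    pattern = re.compile(rf"(?<!\d){re.escape(answer)}(?!\d)")
    return pattern.sub(new_answer, document), new_answer
```

A call such as `alter_document(text, "1887", random.Random(0))` would return the rewritten text together with the new answer, which then serves as the reference answer for scoring.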

Provider:

abacusai

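Summary of original information

Dataset overview

Dataset name

Name: WikiQA-Altered_Numeric_QA

Dataset content

Task: answer questions based on Wikipedia documents.
Format: a document and a question; the answer is a single word or a short sentence taken directly from the document.
Characteristics: large Wikipedia documents are truncated to versions of different lengths (2,000 to 16,000 tokens), with the question and the answer text placed at different locations.
Modification: the dataset contains only questions with numerical answers; the answer and every occurrence of it in the document are replaced with a different number, so that an answer recalled from pretraining data is wrong.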

Dataset structure

Features:
- conversations:
  - from: string
  - tok_len: integer (int64)
  - value: string
Splits:
- 2k: 456 examples, 2802096 bytes
- 4k: 456 examples, 5492874 bytes
- 8k: 456 examples, 10884816 bytes
- 16k: 456 examples, 19884934 bytes

Dataset size

Download size: 8163043 bytes
Dataset size: 39064720 bytes
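As a quick consistency check on the figures above, the per-split byte counts can be summed and compared with the reported dataset size. The snippet below is a sketch that only uses numbers taken from the dataset metadata; the variable and function names are illustrative.

```python
# Per-split figures copied from the dataset metadata.
SPLITS = {
    "2k": {"num_bytes": 2802096, "num_examples": 456},
    "4k": {"num_bytes": 5492874, "num_examples": 456},
    "8k": {"num_bytes": 10884816, "num_examples": 456},
    "16k": {"num_bytes": 19884934, "num_examples": 456},
}
REPORTED_DATASET_SIZE = 39064720
REPORTED_DOWNLOAD_SIZE = 8163043


def mean_bytes_per_example(split: dict) -> float:
    """Average uncompressed size of one example in a split."""
    return split["num_bytes"] / split["num_examples"]


total_bytes = sum(s["num_bytes"] for s in SPLITS.values())
total_examples = sum(s["num_examples"] for s in SPLITS.values())

# The split sizes add up exactly to the reported dataset size.
assert total_bytes == REPORTED_DATASET_SIZE

for name, split in SPLITS.items():
    print(f"{name}: {mean_bytes_per_example(split):.0f} bytes per example")
print(f"total examples: {total_examples}")
print(f"size / download ratio: {total_bytes / REPORTED_DOWNLOAD_SIZE:.2f}")
```

Every split holds the same 456 examples, so the growth in bytes from 2k to 16k reflects longer documents rather than more questions.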

License

License: Apache-2.0
