kilt_tasks

Source: ModelScope community. Updated 2025-12-05; indexed 2025-05-24.
Download link:
https://modelscope.cn/datasets/facebook/kilt_tasks
Resource description:
# Dataset Card for KILT

## Table of Contents

- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** https://ai.facebook.com/tools/kilt/
- **Repository:** https://github.com/facebookresearch/KILT
- **Paper:** https://arxiv.org/abs/2009.02252
- **Leaderboard:** https://eval.ai/web/challenges/challenge-page/689/leaderboard/
- **Point of Contact:** [Needs More Information]

### Dataset Summary

KILT has been built from 11 datasets representing 5 types of tasks:

- Fact-checking
- Entity linking
- Slot filling
- Open domain QA
- Dialog generation

All these datasets have been grounded in a single pre-processed Wikipedia dump, allowing for fairer and more consistent evaluation as well as enabling new task setups such as multitask and transfer learning with minimal effort. KILT also provides tools to analyze and understand the predictions made by models, as well as the evidence they provide for their predictions.
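To make the "11 datasets, 5 task types" grouping concrete, here is a small sketch that maps each configuration name to its task type. The mapping is an assumption taken from the KILT paper rather than from this card, and the names `TASK_TYPE` and `configs_by_type` are illustrative only. Configuration names follow the Data Splits table.

```python
from collections import defaultdict

# Assumed mapping from configuration name to task type, following the KILT
# paper; this card does not spell it out.
TASK_TYPE = {
    "fever": "Fact-checking",
    "aidayago2": "Entity linking",
    "wned": "Entity linking",
    "cweb": "Entity linking",
    "trex": "Slot filling",
    "structured_zeroshot": "Slot filling",
    "nq": "Open domain QA",
    "hotpotqa": "Open domain QA",
    "triviaqa": "Open domain QA",
    "eli5": "Open domain QA",
    "wow": "Dialog generation",
}

# Group the configurations by task type.
configs_by_type = defaultdict(list)
for config, task_type in TASK_TYPE.items():
    configs_by_type[task_type].append(config)

for task_type, configs in configs_by_type.items():
    print(f"{task_type}: {', '.join(configs)}")
```

A grouping like this can be handy when setting up multitask or transfer experiments across configurations of the same type.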
#### Loading the KILT knowledge source and task data

The original KILT [release](https://github.com/facebookresearch/KILT) only provides question IDs for the TriviaQA task. Using the full dataset requires mapping those back to the TriviaQA questions, which can be done as follows:

```python
from datasets import load_dataset

# Get the pre-processed Wikipedia knowledge source for KILT
kilt_wiki = load_dataset("kilt_wikipedia")

# Get the KILT task datasets
kilt_triviaqa = load_dataset("kilt_tasks", name="triviaqa_support_only")

# Most tasks in KILT already have all required data, but KILT-TriviaQA
# only provides the question IDs, not the questions themselves.
# Thankfully, we can get the original TriviaQA data with:
trivia_qa = load_dataset('trivia_qa', 'unfiltered.nocontext')

# The KILT IDs can then be mapped to the TriviaQA questions with:
triviaqa_map = {}

def add_missing_data(x, trivia_qa_subset, triviaqa_map):
    i = triviaqa_map[x['id']]
    x['input'] = trivia_qa_subset[i]['question']
    x['output']['original_answer'] = trivia_qa_subset[i]['answer']['value']
    return x

for k in ['train', 'validation', 'test']:
    triviaqa_map = dict([(q_id, i) for i, q_id in enumerate(trivia_qa[k]['question_id'])])
    kilt_triviaqa[k] = kilt_triviaqa[k].filter(lambda x: x['id'] in triviaqa_map)
    kilt_triviaqa[k] = kilt_triviaqa[k].map(add_missing_data, fn_kwargs=dict(trivia_qa_subset=trivia_qa[k], triviaqa_map=triviaqa_map))
```

### Supported Tasks and Leaderboards

The dataset supports a leaderboard that evaluates models against task-specific metrics such as F1 or EM, as well as their ability to retrieve supporting information from Wikipedia.

The current best performing models can be found [here](https://eval.ai/web/challenges/challenge-page/689/leaderboard/).

### Languages

All tasks are in English (`en`).
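As an illustration of the task-specific metrics mentioned under Supported Tasks and Leaderboards, the following is a minimal sketch of exact match (EM) and token-level F1. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) and takes the maximum over all non-empty gold answers. The official KILT evaluation scripts may differ in detail, and the helper names here are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization, not necessarily what KILT uses.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_score(metric, prediction, gold_answers):
    """Score against every non-empty gold answer and keep the maximum."""
    golds = [g for g in gold_answers if g]
    return max((metric(prediction, g) for g in golds), default=0.0)


# Gold answers as they might be collected from the `answer` fields of a record.
gold_answers = ["Coldplay", "Beyoncé", "Bruno Mars", ""]
em = best_score(exact_match, "coldplay", gold_answers)
f1 = best_score(token_f1, "the band Coldplay", gold_answers)
print(f"EM={em:.2f} F1={f1:.2f}")
```

Empty gold answers are skipped because some output entries carry only provenance and no answer string.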
## Dataset Structure

### Data Instances

An example of open-domain QA from the Natural Questions `nq` configuration looks as follows:

```
{'id': '-5004457603684974952',
 'input': 'who is playing the halftime show at super bowl 2016',
 'meta': {'left_context': '',
          'mention': '',
          'obj_surface': [],
          'partial_evidence': [],
          'right_context': '',
          'sub_surface': [],
          'subj_aliases': [],
          'template_questions': []},
 'output': [{'answer': 'Coldplay',
             'meta': {'score': 0},
             'provenance': [{'bleu_score': 1.0,
                             'end_character': 186,
                             'end_paragraph_id': 1,
                             'meta': {'annotation_id': '-1',
                                      'evidence_span': [],
                                      'fever_page_id': '',
                                      'fever_sentence_id': -1,
                                      'yes_no_answer': ''},
                             'section': 'Section::::Abstract.',
                             'start_character': 178,
                             'start_paragraph_id': 1,
                             'title': 'Super Bowl 50 halftime show',
                             'wikipedia_id': '45267196'}]},
            {'answer': 'Beyoncé',
             'meta': {'score': 0},
             'provenance': [{'bleu_score': 1.0,
                             'end_character': 224,
                             'end_paragraph_id': 1,
                             'meta': {'annotation_id': '-1',
                                      'evidence_span': [],
                                      'fever_page_id': '',
                                      'fever_sentence_id': -1,
                                      'yes_no_answer': ''},
                             'section': 'Section::::Abstract.',
                             'start_character': 217,
                             'start_paragraph_id': 1,
                             'title': 'Super Bowl 50 halftime show',
                             'wikipedia_id': '45267196'}]},
            {'answer': 'Bruno Mars',
             'meta': {'score': 0},
             'provenance': [{'bleu_score': 1.0,
                             'end_character': 239,
                             'end_paragraph_id': 1,
                             'meta': {'annotation_id': '-1',
                                      'evidence_span': [],
                                      'fever_page_id': '',
                                      'fever_sentence_id': -1,
                                      'yes_no_answer': ''},
                             'section': 'Section::::Abstract.',
                             'start_character': 229,
                             'start_paragraph_id': 1,
                             'title': 'Super Bowl 50 halftime show',
                             'wikipedia_id': '45267196'}]},
            {'answer': 'Coldplay with special guest performers Beyoncé and Bruno Mars',
             'meta': {'score': 0},
             'provenance': []},
            {'answer': 'British rock group Coldplay with special guest performers Beyoncé and Bruno Mars',
             'meta': {'score': 0},
             'provenance': []},
            {'answer': '',
             'meta': {'score': 0},
             'provenance': [{'bleu_score': 0.9657992720603943,
                             'end_character': 341,
                             'end_paragraph_id': 1,
                             'meta': {'annotation_id': '2430977867500315580',
                                      'evidence_span': [],
                                      'fever_page_id': '',
                                      'fever_sentence_id': -1,
                                      'yes_no_answer': 'NONE'},
                             'section': 'Section::::Abstract.',
                             'start_character': 0,
                             'start_paragraph_id': 1,
                             'title': 'Super Bowl 50 halftime show',
                             'wikipedia_id': '45267196'}]},
            {'answer': '',
             'meta': {'score': 0},
             'provenance': [{'bleu_score': -1.0,
                             'end_character': -1,
                             'end_paragraph_id': 1,
                             'meta': {'annotation_id': '-1',
                                      'evidence_span': ['It was headlined by the British rock group Coldplay with special guest performers Beyoncé and Bruno Mars',
                                                        'It was headlined by the British rock group Coldplay with special guest performers Beyoncé and Bruno Mars, who previously had headlined the Super Bowl XLVII and Super Bowl XLVIII halftime shows, respectively.',
                                                        "The Super Bowl 50 Halftime Show took place on February 7, 2016, at Levi's Stadium in Santa Clara, California as part of Super Bowl 50. It was headlined by the British rock group Coldplay with special guest performers Beyoncé and Bruno Mars",
                                                        "The Super Bowl 50 Halftime Show took place on February 7, 2016, at Levi's Stadium in Santa Clara, California as part of Super Bowl 50. It was headlined by the British rock group Coldplay with special guest performers Beyoncé and Bruno Mars,"],
                                      'fever_page_id': '',
                                      'fever_sentence_id': -1,
                                      'yes_no_answer': ''},
                             'section': 'Section::::Abstract.',
                             'start_character': -1,
                             'start_paragraph_id': 1,
                             'title': 'Super Bowl 50 halftime show',
                             'wikipedia_id': '45267196'}]}]}
```

### Data Fields

Examples from all configurations have the following features:

- `input`: a `string` feature representing the query.
- `output`: a `list` of features each containing information for an answer, made up of:
  - `answer`: a `string` feature representing a possible answer.
  - `provenance`: a `list` of features representing Wikipedia passages that support the `answer`, denoted by:
    - `title`: a `string` feature, the title of the Wikipedia article the passage was retrieved from.
    - `section`: a `string` feature, the title of the section in the Wikipedia article.
    - `wikipedia_id`: a `string` feature, a unique identifier for the Wikipedia article.
    - `start_character`: an `int32` feature.
    - `start_paragraph_id`: an `int32` feature.
    - `end_character`: an `int32` feature.
    - `end_paragraph_id`: an `int32` feature.

### Data Splits

The configurations have the following splits:

| Configuration | Train | Validation | Test |
| ----------- | ----------- | ----------- | ----------- |
| triviaqa | 61844 | 5359 | 6586 |
| fever | 104966 | 10444 | 10100 |
| aidayago2 | 18395 | 4784 | 4463 |
| wned | | 3396 | 3376 |
| cweb | | 5599 | 5543 |
| trex | 2284168 | 5000 | 5000 |
| structured_zeroshot | 147909 | 3724 | 4966 |
| nq | 87372 | 2837 | 1444 |
| hotpotqa | 88869 | 5600 | 5569 |
| eli5 | 272634 | 1507 | 600 |
| wow | 94577 | 3058 | 2944 |

## Dataset Creation

### Curation Rationale

[Needs More Information]

### Source Data

#### Initial Data Collection and Normalization

[Needs More Information]

#### Who are the source language producers?

[Needs More Information]

### Annotations

#### Annotation process

[Needs More Information]

#### Who are the annotators?

[Needs More Information]

### Personal and Sensitive Information

[Needs More Information]

## Considerations for Using the Data

### Social Impact of Dataset

[Needs More Information]

### Discussion of Biases

[Needs More Information]

### Other Known Limitations

[Needs More Information]

## Additional Information

### Dataset Curators

[Needs More Information]

### Licensing Information

[Needs More Information]

### Citation Information

Cite as:

```
@inproceedings{kilt_tasks,
  author    = {Fabio Petroni and
               Aleksandra Piktus and
               Angela Fan and
               Patrick S. H. Lewis and
               Majid Yazdani and
               Nicola De Cao and
               James Thorne and
               Yacine Jernite and
               Vladimir Karpukhin and
               Jean Maillard and
               Vassilis Plachouras and
               Tim Rockt{\"{a}}schel and
               Sebastian Riedel},
  editor    = {Kristina Toutanova and
               Anna Rumshisky and
               Luke Zettlemoyer and
               Dilek Hakkani{-}T{\"{u}}r and
               Iz Beltagy and
               Steven Bethard and
               Ryan Cotterell and
               Tanmoy Chakraborty and
               Yichao Zhou},
  title     = {{KILT:} a Benchmark for Knowledge Intensive Language Tasks},
  booktitle = {Proceedings of the 2021 Conference of the North American Chapter
               of the Association for Computational Linguistics: Human Language
               Technologies, {NAACL-HLT} 2021, Online, June 6-11, 2021},
  pages     = {2523--2544},
  publisher = {Association for Computational Linguistics},
  year      = {2021},
  url       = {https://www.aclweb.org/anthology/2021.naacl-main.200/}
}
```

### Contributions

Thanks to [@thomwolf](https://github.com/thomwolf), [@yjernite](https://github.com/yjernite) for adding this dataset.

Provider: maas
Created: 2025-05-20
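Background overview
KILT combines 11 datasets covering 5 types of knowledge-intensive language tasks, including fact-checking, entity linking, and open-domain QA, all grounded in a single Wikipedia snapshot so that evaluation is fair and consistent. The dataset supports multitask learning and provides tools for analyzing model predictions. All tasks are in English, and the data is divided into training, validation, and test splits.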