msakarvadia/handwritten_multihop_reasoning_data

Name: msakarvadia/handwritten_multihop_reasoning_data
Creator: msakarvadia
Published: 2024-01-09 15:30:53
License: MIT

Source: Hugging Face. Updated 2024-01-09. Indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/msakarvadia/handwritten_multihop_reasoning_data

Resource description:

---
license: mit
---

# Dataset used to better understand how to:

## Correct Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models

This is a handwritten dataset created to aid in better understanding the multi-hop reasoning capabilities of LLMs. To learn how the dataset was constructed, please check out the project page, paper, and demo linked below.

This is the link to the [Project Page](https://msakarvadia.github.io/memory_injections/).

This repo contains the code that was used to conduct the experiments in this [paper](https://arxiv.org/abs/2309.05605).

To get a quick introduction to the methods used in this work, check out this [`demo`](https://colab.research.google.com/drive/1H1jjrdMDRoGj5qRGvAuWuwq1dgIDWjQw?usp=sharing). This demo is also linked under the `demos` folder in this repo.

Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single and multi-hop prompts. We then propose a mechanism that allows users to inject pertinent prompt-specific information, which we refer to as "memories," at critical LLM locations during inference. By thus enabling the LLM to incorporate additional relevant information during inference, we enhance the quality of multi-hop prompt completions. We show empirically that a simple, efficient, and targeted memory injection into a key attention layer can often increase the probability of the desired next token in multi-hop tasks, by up to 424%.

![picture](https://drive.google.com/uc?export=view&id=11PXMPvywR_ZtQNLM615-KB7ltfc0yivM)

## Citation

If you use this dataset, please cite our work as:

```
@article{sakarvadia2023memory,
  title={Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models},
  author={Sakarvadia, Mansi and Ajith, Aswathy and Khan, Arham and Grzenda, Daniel and Hudson, Nathaniel and Bauer, Andr{\'e} and Chard, Kyle and Foster, Ian},
  journal={arXiv preprint arXiv:2309.05605},
  year={2023}
}
```

Provided by:

msakarvadia

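Summary of original information

Dataset overview

Dataset purpose

This dataset is intended to help in understanding the multi-hop reasoning capabilities of large language models (LLMs) during inference, and in particular how to correct multi-hop reasoning failures at inference time in Transformer-based language models.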

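Dataset construction

The dataset was written by hand and is used to analyze the per-layer activations of GPT-2 models in response to single-hop and multi-hop prompts. Injecting relevant, prompt-specific information (referred to as "memories") at critical locations in the LLM can improve the quality of multi-hop prompt completions.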
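The injection idea described above can be illustrated with a toy sketch. This is not the authors' implementation: the embeddings, the hidden state, the token names, the scaling factor `tau`, and the helper names `next_token_probs` and `inject_memory` are all hypothetical, and the "model" is reduced to a single hidden vector that is unembedded with tied weights. It only shows the general mechanism of adding scaled memory-token embeddings to a hidden state and observing how the next-token distribution shifts.

```python
import math

# Hypothetical 3-dimensional token embeddings; a real model learns these.
EMBED = {
    "Australia": [1.0, 0.0, 0.5],
    "Brazil": [0.0, 1.0, 0.0],
    "reef": [0.8, 0.1, 0.6],
    "the": [0.1, 0.1, 0.1],
}


def softmax(logits):
    """Numerically stable softmax over a {token: logit} mapping."""
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: val / total for tok, val in exps.items()}


def next_token_probs(hidden):
    """Unembed a hidden vector with tied weights (dot product with EMBED)."""
    logits = {
        tok: sum(h * e for h, e in zip(hidden, vec))
        for tok, vec in EMBED.items()
    }
    return softmax(logits)


def inject_memory(hidden, memory_tokens, tau):
    """Return hidden + tau * (sum of the memory tokens' embeddings)."""
    injected = list(hidden)
    for tok in memory_tokens:
        for i, component in enumerate(EMBED[tok]):
            injected[i] += tau * component
    return injected


# Hypothetical hidden state that initially favors the wrong completion.
hidden = [0.2, 0.6, 0.1]
p_before = next_token_probs(hidden)["Australia"]
p_after = next_token_probs(inject_memory(hidden, ["reef"], tau=2.0))["Australia"]
print(f"P(Australia) before injection: {p_before:.3f}")
print(f"P(Australia) after injection:  {p_after:.3f}")
```

In this toy setup the injected "reef" embedding points in a similar direction to the "Australia" embedding, so the probability of "Australia" rises after injection. In a real Transformer the effect depends on which layer receives the injection and on the chosen magnitude.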

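Experimental results

Experiments show that a simple, efficient, and targeted memory injection into a key attention layer can often increase the probability of the desired next token in multi-hop tasks, by up to 424%.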
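To make the "up to 424%" figure concrete, the short sketch below computes a relative (percent) increase in probability. The starting probability of 0.05 is a hypothetical value chosen for illustration, not a number from the paper, and `percent_increase` is a helper name introduced here.

```python
def percent_increase(p_before, p_after):
    """Relative change from p_before to p_after, expressed in percent."""
    if p_before <= 0:
        raise ValueError("p_before must be positive")
    return (p_after - p_before) / p_before * 100.0


# Hypothetical baseline probability of the desired next token.
p_before = 0.05
# A 424% increase multiplies the baseline by 1 + 4.24 = 5.24.
p_after = p_before * (1 + 424 / 100)

print(f"before: {p_before:.3f}, after: {p_after:.3f}")
print(f"increase: {percent_increase(p_before, p_after):.1f}%")
```

Because probabilities cannot exceed 1, a 424% increase is only possible when the baseline probability is at most 1 / 5.24, or roughly 0.19.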

Citation

If you use this dataset, please cite the following paper:

@article{sakarvadia2023memory,
  title={Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models},
  author={Sakarvadia, Mansi and Ajith, Aswathy and Khan, Arham and Grzenda, Daniel and Hudson, Nathaniel and Bauer, Andr{\'e} and Chard, Kyle and Foster, Ian},
  journal={arXiv preprint arXiv:2309.05605},
  year={2023}
}
