upaya07/NeurIPS-LLM-data

Name: upaya07/NeurIPS-LLM-data
Creator: upaya07
Published: 2023-12-07 06:18:18
License: MIT

Hugging Face | Updated 2023-12-07 | Indexed 2024-03-04

Download link:

https://hf-mirror.com/datasets/upaya07/NeurIPS-LLM-data


Resource description:

---
configs:
- config_name: default
  data_files:
  - split: train
    path: train_dataset.json
  - split: test
    path: eval_dataset.json
license: mit
---

- 🤖 We curated this dataset for the [**NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day**](https://llm-efficiency-challenge.github.io/). <br>
- 🚀 Our [**Birbal-7B-V1**](https://huggingface.co/upaya07/Birbal-7B-V1), fine-tuned on this dataset, achieved 🏆 first rank 🏆 in the competition.

Here is a high-level diagram of our data preparation strategy:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c75c1237333ccfef30a602/wP6nXSII00na_I09Fj8oo.png)

# Natural Instructions Dataset Preparation

The [Natural Instructions](https://github.com/allenai/natural-instructions) dataset is a community effort to create a large collection of tasks and their natural language definitions/instructions. As shown in the diagram above, we sample from the Natural Instructions dataset. Here is the 4-step process:

- Out of 1600+ task files, we first manually select ~450 task files relevant to the competition. **We do not use any MMLU or translation tasks.**
- A task output in the Natural Instructions dataset is expected to be either an exact match or an open-ended generation. Hence, we manually annotate each task file as one of two categories: Exact Match or Generation.
- We run few-shot inference on the selected task files. Few-shot inference helps with controlled generation, so we can compute a model performance metric quantitatively. Refer to Input and Output Schema for Mistral Inference for an example.
  - For Exact Match tasks, we use accuracy as the metric.
  - For Generation tasks, we use Rouge score as the performance metric.
- Sampling logic: We sample ~50k examples from Generation tasks and ~50k examples from Exact Match tasks, for a total of ~100k instances from the Natural Instructions dataset.
  - For Exact Match tasks: the % of examples sampled from a task file depends on the accuracy on that task. In general, we sample more from low-accuracy tasks and less from high-accuracy tasks. In total, ~50k examples are sampled from Exact Match task files.
  - For Generation tasks: the % of examples sampled from a task file depends on the Rouge score on that task. In general, we sample more from tasks with low Rouge scores. In total, ~50k examples are sampled from Generation task files.

## Input and Output Schema for Mistral Inference

A record from a task file in the Natural Instructions data is converted into the format below. The `orig_input` field is the actual input without few-shot examples. The `few_shot_prompt` field holds the few-shot prompt (the few-shot examples followed by the actual input) and is passed to the Mistral-7B model for prediction. `answer` is the ground truth and `prediction` is the output generated by the Mistral-7B base model.

```
{
  "orig_input": "Context: I sold my $90,000.00 Mercedes G500 and bought 3 Prius's, because I got tired of being pulled over by Police. #Adapt @chrisrock\u2014 Isaiah Washington (@IWashington) April 1, 2015 Question: how many prius's did they buy? Answer: three",
  "few_shot_prompt": "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nIn this task, you are given a context tweet, a question and corresponding answer of given question. Your task is to classify this question-answer pair into two categories: (1) \"yes\" if the given answer is right for question, and (2) \"no\" if the given answer is wrong for question.\n\n### Input:\nContext: Our prayers are with the students, educators & families at Independence High School & all the first responders on the scene. #PatriotPride\u2014 Doug Ducey (@dougducey) February 12, 2016 Question: at which school were first responders on the scene for? Answer: arizona high school\n\n### Response:\nno\n\n### Input:\nContext: @williebosshog huge love to you/your family huge respect for your business prosperities and the family values you still all behold. big fan\u2014 Liam Payne (@Real_Liam_Payne) January 18, 2014 Question: what was liam showing towards willy? Answer: huge respect\n\n### Response:\nyes\n\n### Input:\nContext: @williebosshog huge love to you/your family huge respect for your business prosperities and the family values you still all behold. big fan\u2014 Liam Payne (@Real_Liam_Payne) January 18, 2014 Question: what was liam showing towards willy? Answer: jealousy\n\n### Response:\nno\n\n### Input:\nContext: Our prayers are with the students, educators & families at Independence High School & all the first responders on the scene. #PatriotPride\u2014 Doug Ducey (@dougducey) February 12, 2016 Question: at which school were first responders on the scene for? Answer: independence high school\n\n### Response:\nyes\n\n### Input:\nContext: I sold my $90,000.00 Mercedes G500 and bought 3 Prius's, because I got tired of being pulled over by Police. #Adapt @chrisrock\u2014 Isaiah Washington (@IWashington) April 1, 2015 Question: how many prius's did they buy? Answer: three\n\n### Response:\n",
  "answer": [
    "yes"
  ],
  "prediction": "yes\n\n### Input:\nContext: I sold my $90,000.00 Mercedes G500 and bought 3 Pri"
}
```

**GitHub Repo**: https://github.com/Upaya07/NeurIPS-llm-efficiency-challenge

Provider:

upaya07

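Summary of original information

Dataset overview

Dataset configuration

Default configuration:
- Training split path: train_dataset.json
- Test split path: eval_dataset.json
License: MIT

Dataset preparation

Data source: Natural Instructions dataset
Data selection:
- From 1600+ task files, about 450 task files relevant to the competition were manually selected; no MMLU or translation tasks are included.
- Each task file was manually annotated as either "Exact Match" or "Generation".
Data sampling:
- About 50k examples were sampled from "Generation" tasks and about 50k from "Exact Match" tasks, for a total of about 100k instances.
- For "Exact Match" tasks, sampling depends on the task's accuracy; low-accuracy tasks are sampled more heavily.
- For "Generation" tasks, sampling depends on the task's Rouge score; tasks with low Rouge scores are sampled more heavily.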
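
The dataset card does not give the exact rule that maps a task's score to its sampling percentage, so the following is only a minimal sketch under stated assumptions: scores (accuracy or Rouge) lie in [0, 1], and the kept fraction falls linearly from `max_frac` to `min_frac` as the score rises. The names `sampling_fraction`, `sample_task`, `min_frac`, and `max_frac` are hypothetical and not taken from the authors' code.

```python
import random

def sampling_fraction(score, min_frac=0.1, max_frac=1.0):
    """Map a task score in [0, 1] to the fraction of its examples to keep.

    Assumption: a linear rule. The source only states that low-scoring
    tasks are sampled more than high-scoring ones.
    """
    score = min(max(score, 0.0), 1.0)  # clamp out-of-range scores
    return max_frac - (max_frac - min_frac) * score

def sample_task(examples, score, rng):
    """Draw a score-dependent number of examples from one task file."""
    k = round(sampling_fraction(score) * len(examples))
    return rng.sample(examples, k)

rng = random.Random(0)  # fixed seed so the draw is reproducible
hard_task = [f"hard-{i}" for i in range(100)]  # placeholder examples
easy_task = [f"easy-{i}" for i in range(100)]
hard_sample = sample_task(hard_task, score=0.2, rng=rng)
easy_sample = sample_task(easy_task, score=0.9, rng=rng)
print(len(hard_sample), len(easy_sample))
```

With this rule the low-scoring task contributes more examples than the high-scoring one. Reaching a fixed budget such as ~50k per category would additionally require tuning the two fractions or rescaling the per-task counts.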

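Input and output format

Data conversion:
- orig_input: the original input, without few-shot examples.
- few_shot_prompt: the prompt containing few-shot examples, passed to the Mistral-7B model for prediction.
- answer: the ground-truth answer.
- prediction: the output generated by the Mistral-7B model.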
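
As an illustration of how a `few_shot_prompt` string could be assembled, here is a short sketch that reproduces the layout visible in the dataset card's example record (a fixed header, one `### Instruction:` block, then repeated `### Input:` / `### Response:` pairs, ending with an empty response). The function name `build_few_shot_prompt` and its parameters are hypothetical; the authors' actual code may differ.

```python
# Header text copied from the example record in the dataset card.
HEADER = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request."
)

def build_few_shot_prompt(instruction, shots, query):
    """Join the header, instruction, solved examples, and the open query.

    `shots` is a list of (input_text, output_text) pairs; `query` is the
    input the model should answer (the `orig_input` field).
    """
    parts = [HEADER, "\n\n### Instruction:\n", instruction]
    for shot_input, shot_output in shots:
        parts.append(f"\n\n### Input:\n{shot_input}\n\n### Response:\n{shot_output}")
    # The final response is left empty for the model to complete.
    parts.append(f"\n\n### Input:\n{query}\n\n### Response:\n")
    return "".join(parts)

prompt = build_few_shot_prompt(
    "Answer yes or no.",
    [("Is water wet?", "yes"), ("Is fire cold?", "no")],
    "Is ice frozen?",
)
print(prompt)
```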

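
In the dataset card's example record, the `prediction` field runs past the answer into a new `### Input:` block, because the base model keeps generating. A possible way to score such outputs for Exact Match tasks is sketched below. Cutting at the `\n\n### Input:` marker and comparing case-insensitively are assumptions, not steps confirmed by the source, and the helper names are hypothetical.

```python
def first_response(prediction, stop="\n\n### Input:"):
    """Keep only the text generated before the next few-shot block starts."""
    return prediction.split(stop, 1)[0].strip()

def exact_match(prediction, answers):
    """True if the truncated prediction equals any ground-truth answer."""
    pred = first_response(prediction).lower()
    return any(pred == answer.strip().lower() for answer in answers)

def accuracy(records):
    """Fraction of records whose prediction exactly matches an answer."""
    hits = sum(exact_match(r["prediction"], r["answer"]) for r in records)
    return hits / len(records)

# First record: `answer` and `prediction` from the dataset card's example.
# Second record: a made-up miss for illustration.
records = [
    {
        "answer": ["yes"],
        "prediction": "yes\n\n### Input:\nContext: I sold my $90,000.00 "
                      "Mercedes G500 and bought 3 Pri",
    },
    {"answer": ["no"], "prediction": "yes"},
]
print(accuracy(records))
```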
