Jingmiao/PUZZLEQA
Source: Hugging Face. Updated 2023-06-28; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Jingmiao/PUZZLEQA
Description:
---
language:
- en
license: apache-2.0
---
### Acknowledgements
PUZZLEQA was scraped from the [NPR Sunday Puzzle Official Website](https://www.npr.org/series/4473090/sunday-puzzle) and [NPR Puzzle Synopsis](https://groups.google.com/g/nprpuzzle),
a fan-run mailing list that distributed the questions and answers for each week’s puzzle.
The dataset authors cleaned the data and built multiple-choice questions from the questions and answers.
### Creation
The multiple-choice dataset is generated from the PUZZLEQA dataset using the following algorithm:
1. Read the fr_big_exp.tsv.tsv file.
2. Group the rule-question-answer triples from the same Sunday together (so every question in a group shares the same rules).
3. For each question, randomly select three other answers from the answers for the same Sunday. Shuffle the three selected answers with the correct answer to obtain four choices for the question.
4. Mark the correct answer for the question as the "gold" answer.
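
The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' actual script: the function name `build_multiple_choice`, the `(sunday, rule, question, answer)` row layout, the output keys, and the decision to skip Sundays with fewer than three distinct other answers are all assumptions, since the card does not document the column layout of the TSV file.

```python
import random
from collections import defaultdict


def build_multiple_choice(rows, seed=0):
    """Turn (sunday, rule, question, answer) rows into 4-way multiple-choice items.

    The row layout and the output keys are hypothetical; adapt them to the
    real columns of the TSV file.
    """
    rng = random.Random(seed)

    # Step 2: group the triples that belong to the same Sunday.
    by_sunday = defaultdict(list)
    for sunday, rule, question, answer in rows:
        by_sunday[sunday].append((rule, question, answer))

    items = []
    for sunday, group in by_sunday.items():
        for rule, question, answer in group:
            # Step 3: candidate distractors are the other answers from that Sunday.
            pool = sorted({a for _, _, a in group if a != answer})
            if len(pool) < 3:
                # Assumption: skip questions without enough distractors.
                continue
            choices = rng.sample(pool, 3) + [answer]
            rng.shuffle(choices)
            # Step 4: record the correct answer as the "gold" answer.
            items.append({
                "sunday": sunday,
                "rule": rule,
                "question": question,
                "choices": choices,
                "gold": answer,
            })
    return items
```

Rows for step 1 could be read with the standard `csv` module using `delimiter="\t"`, once the actual column order of the file is known. Seeding the random generator keeps the sampled distractors reproducible between runs.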
Recent.tsv contains the data based on the NPR Sunday Puzzles from 2023.
### Citation
@inproceedings{zhao2023solving,
title={Solving and Generating NPR Sunday Puzzles with Large Language Models},
author={Jingmiao Zhao and Carolyn Jane Anderson},
year={2023},
eprint={2306.12255},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Provider:
Jingmiao
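Summary of original information
Dataset overview
Data sources
- The PUZZLEQA dataset comes from the NPR Sunday Puzzle Official Website and NPR Puzzle Synopsis.
- A group of fans distributed the questions and answers for each week's puzzle through a mailing list.
Data processing
- The dataset authors cleaned the data and built multiple-choice questions from the questions and answers.
Data generation algorithm
- Read the fr_big_exp.tsv.tsv file.
- Group the question-rule-answer triples from the same Sunday together.
- For each question, randomly select three other answers from the same Sunday's answers and shuffle them with the correct answer to form four choices.
- Mark the correct answer for the given question as the "gold" answer.
Dataset files
Recent.tsv: the dataset based on the NPR Sunday Puzzles from 2023.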



