hails/agieval-lsat-rc
收藏Hugging Face2024-01-26 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/hails/agieval-lsat-rc
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
features:
- name: query
dtype: string
- name: choices
sequence: string
- name: gold
sequence: int64
splits:
- name: test
num_bytes: 1136305
num_examples: 269
download_size: 322728
dataset_size: 1136305
configs:
- config_name: default
data_files:
- split: test
path: data/test-*
---
# Dataset Card for "agieval-lsat-rc"
Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo, following dmayhem93/agieval-* datasets on the HF hub.
This dataset contains the contents of the LSAT reading comprehension subtask of AGIEval, as accessed in https://github.com/ruixiangcui/AGIEval/commit/5c77d073fda993f1652eaae3cf5d04cc5fd21d40 .
Citation:
```
@misc{zhong2023agieval,
title={AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models},
author={Wanjun Zhong and Ruixiang Cui and Yiduo Guo and Yaobo Liang and Shuai Lu and Yanlin Wang and Amin Saied and Weizhu Chen and Nan Duan},
year={2023},
eprint={2304.06364},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Please make sure to cite all the individual datasets in your paper when you use them. We provide the relevant citation information below:
```
@inproceedings{ling-etal-2017-program,
title = "Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems",
author = "Ling, Wang and
Yogatama, Dani and
Dyer, Chris and
Blunsom, Phil",
booktitle = "Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2017",
address = "Vancouver, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/P17-1015",
doi = "10.18653/v1/P17-1015",
pages = "158--167",
abstract = "Solving algebraic word problems requires executing a series of arithmetic operations{---}a program{---}to obtain a final answer. However, since programs can be arbitrarily complicated, inducing them directly from question-answer pairs is a formidable challenge. To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. Although rationales do not explicitly specify programs, they provide a scaffolding for their structure via intermediate milestones. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. Experimental results show that indirect supervision of program learning via answer rationales is a promising strategy for inducing arithmetic programs.",
}
@inproceedings{hendrycksmath2021,
title={Measuring Mathematical Problem Solving With the MATH Dataset},
author={Dan Hendrycks and Collin Burns and Saurav Kadavath and Akul Arora and Steven Basart and Eric Tang and Dawn Song and Jacob Steinhardt},
journal={NeurIPS},
year={2021}
}
@inproceedings{Liu2020LogiQAAC,
title={LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning},
author={Jian Liu and Leyang Cui and Hanmeng Liu and Dandan Huang and Yile Wang and Yue Zhang},
booktitle={International Joint Conference on Artificial Intelligence},
year={2020}
}
@inproceedings{zhong2019jec,
title={JEC-QA: A Legal-Domain Question Answering Dataset},
author={Zhong, Haoxi and Xiao, Chaojun and Tu, Cunchao and Zhang, Tianyang and Liu, Zhiyuan and Sun, Maosong},
booktitle={Proceedings of AAAI},
year={2020},
}
@article{Wang2021FromLT,
title={From LSAT: The Progress and Challenges of Complex Reasoning},
author={Siyuan Wang and Zhongkun Liu and Wanjun Zhong and Ming Zhou and Zhongyu Wei and Zhumin Chen and Nan Duan},
journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
year={2021},
volume={30},
pages={2201-2216}
}
```
提供机构:
hails
原始信息汇总
数据集信息
特征
- query: 数据类型为字符串。
- choices: 数据类型为字符串序列。
- gold: 数据类型为整数序列。
数据分割
- test: 包含269个样本,总字节数为1136305。
文件大小
- 下载大小: 322728字节。
- 数据集大小: 1136305字节。
配置
- default: 包含测试数据文件,路径为
data/test-*。
搜集汇总
数据集介绍

背景与挑战
背景概述
该数据集是AGIEval基准测试的一部分,专门针对LSAT阅读理解子任务,包含269个文本样本,涉及法律、文化、艺术等多个领域,用于评估人工智能模型在复杂文本上的阅读理解和推理能力。数据集以parquet格式存储,规模较小,源自学术研究项目,旨在提供人类中心化的评估标准。
以上内容由遇见数据集搜集并总结生成



