hails/agieval-lsat-rc

Name: hails/agieval-lsat-rc
Creator: hails
Published: 2024-01-26 18:45:21
License: No description available

Hugging Face. Updated 2024-01-26. Indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/hails/agieval-lsat-rc

Resource description:

---
dataset_info:
  features:
  - name: query
    dtype: string
  - name: choices
    sequence: string
  - name: gold
    sequence: int64
  splits:
  - name: test
    num_bytes: 1136305
    num_examples: 269
  download_size: 322728
  dataset_size: 1136305
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
---

# Dataset Card for "agieval-lsat-rc"

Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo, following dmayhem93/agieval-* datasets on the HF hub.

This dataset contains the contents of the LSAT reading comprehension subtask of AGIEval, as accessed in https://github.com/ruixiangcui/AGIEval/commit/5c77d073fda993f1652eaae3cf5d04cc5fd21d40 .

Citation:

```
@misc{zhong2023agieval,
  title={AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models},
  author={Wanjun Zhong and Ruixiang Cui and Yiduo Guo and Yaobo Liang and Shuai Lu and Yanlin Wang and Amin Saied and Weizhu Chen and Nan Duan},
  year={2023},
  eprint={2304.06364},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

Please make sure to cite all the individual datasets in your paper when you use them. We provide the relevant citation information below:

```
@inproceedings{ling-etal-2017-program,
  title = "Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems",
  author = "Ling, Wang and Yogatama, Dani and Dyer, Chris and Blunsom, Phil",
  booktitle = "Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
  month = jul,
  year = "2017",
  address = "Vancouver, Canada",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/P17-1015",
  doi = "10.18653/v1/P17-1015",
  pages = "158--167",
  abstract = "Solving algebraic word problems requires executing a series of arithmetic operations{---}a program{---}to obtain a final answer. However, since programs can be arbitrarily complicated, inducing them directly from question-answer pairs is a formidable challenge. To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. Although rationales do not explicitly specify programs, they provide a scaffolding for their structure via intermediate milestones. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. Experimental results show that indirect supervision of program learning via answer rationales is a promising strategy for inducing arithmetic programs.",
}

@inproceedings{hendrycksmath2021,
  title={Measuring Mathematical Problem Solving With the MATH Dataset},
  author={Dan Hendrycks and Collin Burns and Saurav Kadavath and Akul Arora and Steven Basart and Eric Tang and Dawn Song and Jacob Steinhardt},
  journal={NeurIPS},
  year={2021}
}

@inproceedings{Liu2020LogiQAAC,
  title={LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning},
  author={Jian Liu and Leyang Cui and Hanmeng Liu and Dandan Huang and Yile Wang and Yue Zhang},
  booktitle={International Joint Conference on Artificial Intelligence},
  year={2020}
}

@inproceedings{zhong2019jec,
  title={JEC-QA: A Legal-Domain Question Answering Dataset},
  author={Zhong, Haoxi and Xiao, Chaojun and Tu, Cunchao and Zhang, Tianyang and Liu, Zhiyuan and Sun, Maosong},
  booktitle={Proceedings of AAAI},
  year={2020},
}

@article{Wang2021FromLT,
  title={From LSAT: The Progress and Challenges of Complex Reasoning},
  author={Siyuan Wang and Zhongkun Liu and Wanjun Zhong and Ming Zhou and Zhongyu Wei and Zhumin Chen and Nan Duan},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  year={2021},
  volume={30},
  pages={2201-2216}
}
```

Provider:

hails

Summary of original information

Dataset information

Features

query: string.
choices: sequence of strings.
gold: sequence of integers (int64).
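The three features above can be sketched as a typed record together with a simple accuracy scorer. This is a minimal illustration, not the official AGIEval evaluation code. It assumes that `gold` holds zero-based indices into `choices`, which the listing does not state explicitly, and the sample records are hypothetical placeholders rather than real dataset rows.

```python
from typing import List, Sequence, TypedDict


class Record(TypedDict):
    """One row, mirroring the feature list: query, choices, gold."""
    query: str           # passage plus question text
    choices: List[str]   # answer options
    gold: List[int]      # ASSUMPTION: zero-based indices of correct options


def accuracy(records: Sequence[Record], predictions: Sequence[int]) -> float:
    """Fraction of records whose predicted index appears in `gold`."""
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    if not records:
        return 0.0
    correct = sum(1 for rec, pred in zip(records, predictions) if pred in rec["gold"])
    return correct / len(records)


# Hypothetical records for illustration only (not taken from the dataset).
sample: List[Record] = [
    {"query": "Passage A ... Which option follows?", "choices": ["(A) x", "(B) y"], "gold": [1]},
    {"query": "Passage B ... Which option follows?", "choices": ["(A) p", "(B) q"], "gold": [0]},
]

print(accuracy(sample, [1, 1]))
```

Treating `gold` as a list rather than a single integer keeps the scorer valid even if a row were to list more than one acceptable answer.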

Splits

test: 269 examples, 1136305 bytes in total.

File sizes

Download size: 322728 bytes.
Dataset size: 1136305 bytes.
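As a quick sanity check on these figures, the average example size and the ratio of download size to dataset size can be computed directly. The constants are copied from the listing. Reading the gap between the two sizes as the effect of compressed storage is an assumption.

```python
# Figures copied from the dataset listing.
NUM_EXAMPLES = 269
DATASET_SIZE_BYTES = 1_136_305
DOWNLOAD_SIZE_BYTES = 322_728

# Average size of one example in bytes.
avg_bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# Share of the dataset size that actually has to be downloaded.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

print(f"average bytes per example: {avg_bytes_per_example:.0f}")
print(f"download / dataset size: {download_ratio:.1%}")
```

Each example averages a little over 4 KB, consistent with rows that carry a full reading passage in `query`.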

Configurations

default: contains the test data files, at path data/test-*.
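A minimal sketch of loading this configuration with the Hugging Face `datasets` library is shown here, assuming the library is installed and the Hub (or a mirror) is reachable. The helper names `load_test_split` and `check_split` are hypothetical. The expected column names and row count come from the listing above.

```python
REPO_ID = "hails/agieval-lsat-rc"
CONFIG_NAME = "default"
SPLIT = "test"
EXPECTED_COLUMNS = {"query", "choices", "gold"}
EXPECTED_NUM_EXAMPLES = 269


def load_test_split():
    """Download and return the single test split (requires network access)."""
    # Imported lazily so the rest of this module works without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, CONFIG_NAME, split=SPLIT)


def check_split(split) -> None:
    """Raise ValueError if the split does not match the listed schema.

    `split` only needs a `column_names` attribute and a length.
    """
    columns = set(split.column_names)
    if columns != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {sorted(columns)}")
    if len(split) != EXPECTED_NUM_EXAMPLES:
        raise ValueError(f"expected {EXPECTED_NUM_EXAMPLES} rows, got {len(split)}")
```

A typical use would be `ds = load_test_split()` followed by `check_split(ds)`, after which `ds[0]["query"]` gives the first passage and question.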

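Collected summary

Dataset introduction

Background and challenges

Background overview

This dataset is part of the AGIEval benchmark and targets the LSAT reading comprehension subtask. It contains 269 text samples covering law, culture, the arts, and other domains, and is used to evaluate the reading comprehension and reasoning abilities of AI models on complex texts. The dataset is stored in parquet format, is small in scale, originates from an academic research project, and aims to provide a human-centric evaluation standard.

The content above was collected and summarized by 遇见数据集.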
