emozilla/sat-reading
Source: Hugging Face · Updated 2023-02-20 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/emozilla/sat-reading
Resource description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  - name: answer
    dtype: string
  - name: requires_line
    dtype: bool
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 1399648
    num_examples: 298
  - name: test
    num_bytes: 196027
    num_examples: 38
  - name: validation
    num_bytes: 183162
    num_examples: 39
  download_size: 365469
  dataset_size: 1778837
language:
- en
---
# Dataset Card for "sat-reading"
This dataset contains the passages and questions from the Reading part of ten publicly available SAT Practice Tests.
For more information, see the blog post [Language Models vs. The SAT Reading Test](https://jeffq.com/blog/language-models-vs-the-sat-reading-test).
Each question is preceded by the reading passage of the section it belongs to.
The question is then introduced with `Question #:` and followed by the four possible answers.
Each entry ends with `Answer:`.
Questions which reference a diagram, chart, table, etc. have been removed (typically three per test).
In addition, there is a boolean `requires_line` feature, which indicates whether the question references specific lines within the passage.
To maintain generalizability in finetuning scenarios, `SAT READING COMPREHENSION TEST` appears at the beginning of each entry; depending on your intentions, it may be desirable to remove it.
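As a minimal sketch of that preprocessing, the Python helpers below drop the leading `SAT READING COMPREHENSION TEST` marker and select entries by `requires_line`. The function names and the sample records are hypothetical: the records are heavily shortened stand-ins that only follow the field layout described in this card, and the exact whitespace, question numbering, and answer encoding of real entries are assumptions.

```python
HEADER = "SAT READING COMPREHENSION TEST"


def strip_header(text: str) -> str:
    """Remove the leading marker (and the whitespace after it) if it is present."""
    if text.startswith(HEADER):
        return text[len(HEADER):].lstrip()
    return text


def without_line_references(records):
    """Keep only entries whose question does not cite specific passage lines."""
    return [r for r in records if not r["requires_line"]]


# Hypothetical, heavily shortened records that mimic the described layout.
sample = [
    {"id": "example-1", "requires_line": False, "answer": "B",
     "text": HEADER + "\n\nPassage...\n\nQuestion 1:\n...\nAnswer:"},
    {"id": "example-2", "requires_line": True, "answer": "D",
     "text": HEADER + "\n\nPassage...\n\nQuestion 2:\n...\nAnswer:"},
]

# Drop line-dependent questions first, then strip the marker from the rest.
cleaned = [strip_header(r["text"]) for r in without_line_references(sample)]
```

Whether to filter on `requires_line` depends on whether the passages in your pipeline keep their original line numbering; the filter here is only one possible choice.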
Eight tests appear in the training split; one each in the validation and test splits.
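The splits can be loaded with the Hugging Face `datasets` library. The sketch below assumes that the library is installed and that the repository id `emozilla/sat-reading` from the download link is still available; the helper names are illustrative, and the expected counts are copied from the metadata in this card.

```python
# Per-split example counts as stated in the card's metadata.
EXPECTED_EXAMPLES = {"train": 298, "validation": 39, "test": 38}


def load_sat_reading():
    """Download the dataset; needs network access and the `datasets` package."""
    # Imported lazily so the comparison helper below also works offline.
    from datasets import load_dataset
    return load_dataset("emozilla/sat-reading")


def split_sizes_match(sizes: dict) -> bool:
    """Compare per-split example counts against the card's metadata."""
    return sizes == EXPECTED_EXAMPLES


def report_split_sizes() -> bool:
    """Load the dataset, print the size of each split, and check the counts."""
    dataset = load_sat_reading()
    sizes = {name: split.num_rows for name, split in dataset.items()}
    print(sizes)
    return split_sizes_match(sizes)
```

Calling `report_split_sizes()` triggers the download and returns `True` only if the hosted copy still matches the counts listed here.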
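Provided by:
emozilla
Summary of original information
Dataset overview
Dataset name
"sat-reading"
Dataset content
Contains the passages and questions from the Reading sections of ten publicly available SAT practice tests.
Dataset features
- text (string)
- answer (string)
- requires_line (boolean)
- id (string)
Dataset splits
- Training (train): 298 examples, 1399648 bytes
- Test (test): 38 examples, 196027 bytes
- Validation (validation): 39 examples, 183162 bytes
Dataset size
- Download size: 365469 bytes
- Total dataset size: 1778837 bytes
Language
- English (en)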



