ALTACambridge/KUPA-KEYS
Source: Hugging Face · Updated 2024-05-17 · Indexed 2024-06-11
Download link:
https://hf-mirror.com/datasets/ALTACambridge/KUPA-KEYS
Resource description:
---
license: cc-by-nc-sa-4.0
task_categories:
- text-classification
language:
- en
pretty_name: KUPA-KEYS
size_categories:
- 1K<n<10K
---
This repository hosts the dataset collected during the project, 'Deep Learning for Language Assessment', as detailed in the paper "Logging Keystrokes in Writing by English Learners", to appear in the proceedings of LREC-COLING 2024.
The dataset is named **KUPA-KEYS** (King's College London & Université Paris Cité Keys). It contains texts written by 1,006 participants in our crowdsourcing study, recruited on Prolific. Task 1 was a text-copy task; Task 2 was an essay written in response to a 'Just for Fun' prompt from [Write & Improve](https://writeandimprove.com/) (W&I), used with permission. The dataset includes keystroke data for these texts, as well as metadata and CEFR level grades for the free-text essays. Further details about the data collection process, annotation and analysis may be found in our LREC-COLING paper.
Contents:
* KUPA-KEYS-META.csv : information about each participant, including computing environment & keyboard layout, education & level of English, task 1 and task 2 statistics, the essay prompt for task 2, the final form of their task 2 essay, and the scores / CEFR levels received from human markers (h1, h2, h3) and the W&I automarker (a0).
* KUPA-KEYS-TASK-1.csv : all keystroke events for each participant undertaking task 1, the text-copy task.
* KUPA-KEYS-TASK-2.csv : all keystroke events for each participant undertaking task 2, the essay writing task.
For more information about the contents of the files, see our paper forthcoming at LREC-COLING 2024, or the [DatasetDescription](https://huggingface.co/datasets/ALTACambridge/KUPA-KEYS/blob/main/DatasetDescription.md) page.
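As a quick way to inspect the three files listed above, the following is a minimal sketch rather than an official loader. It assumes the CSVs have been downloaded into a local directory, are comma-delimited and UTF-8 encoded, and start with a header row; none of this is confirmed by the dataset card. The function names `summarise_csv` and `summarise_dataset` are hypothetical. Only the file names come from the card, and the column names are read from the files rather than assumed.

```python
import csv
from pathlib import Path

# File names as listed in the dataset card.
FILES = {
    "meta": "KUPA-KEYS-META.csv",
    "task1": "KUPA-KEYS-TASK-1.csv",
    "task2": "KUPA-KEYS-TASK-2.csv",
}


def summarise_csv(path):
    """Return (column_names, row_count) for one CSV file, streaming record by record."""
    # Assumption: comma-delimited, UTF-8, first record is the header.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return header, row_count


def summarise_dataset(data_dir):
    """Summarise each expected file that exists in data_dir (a hypothetical local copy)."""
    summary = {}
    for key, name in FILES.items():
        path = Path(data_dir) / name
        if path.exists():
            summary[key] = summarise_csv(path)
    return summary


if __name__ == "__main__":
    # "." assumes the CSVs were downloaded into the current directory.
    for key, (columns, rows) in summarise_dataset(".").items():
        print(f"{key}: {rows} rows, columns = {columns}")
```

Streaming with `csv.reader` avoids holding the keystroke logs in memory, and it counts CSV records rather than physical lines, so essay text containing quoted line breaks does not inflate the row count.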
_Georgios Velentzas, Andrew Caines, Rita Borgo, Erin Pacquetet, Clive Hamilton, Taylor Arnold, Diane Nicholls, Paula Buttery, Thomas Gaillat, Nicolas Ballier and Helen Yannakoudakis_
Provided by:
ALTACambridge
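Summary of original information
Dataset overview
Dataset name
- KUPA-KEYS (King's College London & Université Paris Cité Keys)
Dataset description
- Contains texts written by 1,006 participants recruited on Prolific.
- The dataset covers two tasks:
  - Task 1: a text-copy task
  - Task 2: essay writing in response to a "Just for Fun" prompt from Write & Improve
- The dataset includes keystroke data, metadata, and CEFR level grades for the free-text essays.
Dataset files
- KUPA-KEYS-META.csv: information about each participant, including computing environment, keyboard layout, education, level of English, task 1 and task 2 statistics, the essay prompt for task 2, the final form of the task 2 essay, and the scores / CEFR levels received from human markers (h1, h2, h3) and the W&I automarker (a0).
- KUPA-KEYS-TASK-1.csv: all keystroke events for each participant in task 1.
- KUPA-KEYS-TASK-2.csv: all keystroke events for each participant in task 2.
Dataset details
- Further details are available in the LREC-COLING 2024 paper or on the DatasetDescription page.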



