Inputlog Copy Task Corpus: Exploring and defining typing skills
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/5803400
Resource description:
Context
The keystroke logging program Inputlog (https://www.inputlog.net) includes a Copy Task component. It consists of a multi-layered set of tasks that measure a person's typing skill:
Tapping task: press the 'd' and 'k' keys alternately for 15 s
Sentence: copy a sentence for 30 s
Word combination 1: copy a combination of three words seven times
Word combination 2: copy a combination of three words seven times
Word combination 3: copy a combination of three words seven times
Word combination 4: copy a combination of three words seven times
Consonant groups: copy four blocks of six consonants once
The task is currently made available in twelve languages.
For more information: https://doi.org/10.5334/jors.234
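The task battery described above can be written down as a small data structure, which is handy when labelling or validating downloaded copy task files. This is only a minimal sketch: the class name `CopyTask`, its fields, and the list name `COPY_TASK_BATTERY` are hypothetical and are not part of Inputlog or its file formats.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CopyTask:
    """One task in the copy task battery (hypothetical representation)."""
    name: str
    instruction: str
    time_limit_s: Optional[int] = None  # set only for time-limited tasks
    repetitions: Optional[int] = None   # set only for repetition-based tasks


# Task order and parameters follow the description in the Context section.
COPY_TASK_BATTERY = [
    CopyTask("Tapping task", "press the 'd' and 'k' keys alternately", time_limit_s=15),
    CopyTask("Sentence", "copy a sentence", time_limit_s=30),
    *[
        CopyTask(f"Word combination {i}", "copy a combination of three words", repetitions=7)
        for i in range(1, 5)
    ],
    CopyTask("Consonant groups", "copy four blocks of six consonants", repetitions=1),
]

# Only the tapping and sentence tasks have a fixed duration.
timed_tasks = [t for t in COPY_TASK_BATTERY if t.time_limit_s is not None]
total_timed_s = sum(t.time_limit_s for t in timed_tasks)

for task in COPY_TASK_BATTERY:
    print(task.name, "|", task.instruction)
```

Keeping the time limit and the repetition count as separate optional fields reflects that the battery mixes time-limited and repetition-based tasks.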
Interactive Dashboard
Visit the webpage with an interactive dashboard to explore, filter, and download the +5K copy task corpus.
website: https://www.inputlog.net/copy-task/
dashboard: https://inputlog-analysis.uantwerpen.be/expert
Corpus
We are happy to make a multilingual corpus available (open access) that currently consists of more than 5000 copy tasks.
The +5K corpus is carefully cleaned and fully anonymized.
The Shiny interface allows users to filter the corpus based on about 10 variables.
The selection can be downloaded in different formats and levels of aggregation (from raw idfx to synthesized analysis).
The selection can be explored using different interactive graph visualizations.
Researchers can upload their own corpus (or single copy task file) and compare it to the (selected) corpus.
An extra webpage is designed for laypersons who want to take a copy task to test their typing skills. They get dashboard feedback in a user-friendly and attractive way and can compare their performance with that of (age-related) participants in the corpus. This page is specially designed to further expand the corpus.
Facts and Figures
Some facts and figures about the corpus' composition:
Languages:
Dutch 3130 files
English 1163 files
German 281 files
French 201 files
Other 378 files
Gender:
Female: 3495 files
Male: 1276 files
X or missing: 382 files
Age:
15- 439 files
16-20 1591 files
21-25 2427 files
26-35 478 files
36-45 126 files
46+ 230 files
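The counts listed above can be turned into percentage shares with a few lines of Python. The sketch below only restates the figures from this page; the dictionary names and the `shares` helper are hypothetical. Summing each breakdown is also a quick consistency check: as listed, the language and gender counts each total 5153 files, while the age counts total 5291.

```python
# File counts copied from the "Facts and Figures" lists on this page.
languages = {"Dutch": 3130, "English": 1163, "German": 281, "French": 201, "Other": 378}
gender = {"Female": 3495, "Male": 1276, "X or missing": 382}
age = {"15-": 439, "16-20": 1591, "21-25": 2427, "26-35": 478, "36-45": 126, "46+": 230}


def shares(counts):
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


breakdowns = {"languages": languages, "gender": gender, "age": age}
totals = {name: sum(counts.values()) for name, counts in breakdowns.items()}

for name, counts in breakdowns.items():
    print(name, "total:", totals[name])
    for label, pct in shares(counts).items():
        print(f"  {label}: {pct}%")
```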
A subset of the total corpus has been uploaded here. The subset contains a dataset of about 500 tests (English | 21-25-year-olds).
Created:
2022-01-03



