Dataset of Human-written and Synthesized Samples of Free-Text Keystroke Dynamics to Evaluate Liveness Detection Methods

Indexed by NIAID Data Ecosystem on 2026-03-14

Download link:

https://data.mendeley.com/datasets/mzm86rcxxd

Description:

This dataset comprises human-written samples of free-text keystroke dynamics, in the form of natural-language sentences, together with same-text counterparts synthesized using a variety of methods and varying degrees of partial knowledge of the legitimate user’s behavior. The human-written samples originate from three publicly available datasets that have been used in several keystroke dynamics studies. The corresponding synthesized samples, which share the same keystroke sequences, were forged using the methods presented in Nahuel González, Enrique P. Calot, Jorge S. Ierache, Waldo Hasperué, Towards liveness detection in keystroke dynamics: Revealing synthetic forgeries, Systems and Soft Computing, Volume 4, 2022, 200037, ISSN 2772-9419, https://doi.org/10.1016/j.sasc.2022.200037.

The source datasets for the human-written samples are those of Killourhy and Maxion [2], González and Calot [3], and Banerjee et al. [4]. The first was used to determine whether composition and transcription tasks produce equivalent results when verifying the identity of the user. The second was used to evaluate a free-text keystroke dynamics authentication method. The third was used to find clues of deceptive intent by analyzing variations in typing patterns.

For each human-written sample of each source dataset, synthetic samples sharing the same keystroke sequence were created with five different methods and included in this dataset. The objective was to evaluate a liveness detection method that could tell the legitimate human apart from a synthetic forgery of their behavior [1]. For each method, five user profiles were used to create the forgeries, representing the amount of partial knowledge of the legitimate user’s keystroke dynamics an attacker might have: one between-subject profile, in which only samples from users other than the target were available to the attacker, and four within-subject profiles ranging in size from only 100 keystrokes to all the past samples of the legitimate user.

NOTE FOR VERSION 2: The dataset is the same as version 1, but the compression and archiving format was changed at the request of the editors of Data in Brief. The original archive was a RAR file; it was reuploaded as a ZIP file because RAR is not an open access format.
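
The forgery conditions described above form a grid of five methods by five attacker profiles. The following Python sketch enumerates that grid. The description gives only the counts, so every label here (`method_1`, `within_subject_1`, and so on) is a hypothetical placeholder, and one forgery per (method, profile) pair is an assumption rather than something the description states. The names do not reflect the actual file layout of the archive.

```python
from itertools import product

# Hypothetical labels: the description states how many methods and
# profiles exist, but not what they are called in the archive.
METHODS = [f"method_{i}" for i in range(1, 6)]

# One between-subject profile plus four within-subject profiles of
# increasing size (from 100 keystrokes up to all past samples).
PROFILES = ["between_subject"] + [f"within_subject_{i}" for i in range(1, 5)]


def forgery_conditions():
    """Return every (method, profile) pair applied to one human sample."""
    return list(product(METHODS, PROFILES))


conditions = forgery_conditions()
print(len(conditions))  # 5 methods x 5 profiles
```

Under these assumptions, each human-written sample would be paired with up to 25 synthesized counterparts that share its keystroke sequence.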

Created:

2022-09-13
