Timing distributions in free text keystroke dynamics profiles
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://data.mendeley.com/datasets/sjk7kz35nh
Description:
Dataset used in the article "On the shape of timing distributions in free text keystroke dynamics profiles". It contains CSV files with the timing features (hold times and flight times) of every keypress in three free text datasets used in previous studies: one by the author (LSIA) and two by unrelated groups (KM and PROSODY, the latter subdivided into GAY, GUN, and REVIEW). The timing features are grouped by dataset, user, task, virtual key code, and feature. Two languages are represented: Spanish in LSIA, and English in KM and PROSODY.
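As a rough illustration of the two timing features, the sketch below derives hold and flight times from a short list of key events and groups them by virtual key code and feature. The event tuples, the function name `timing_features`, and the convention used here (flight time measured from the previous key's release to the current key's press, attributed to the key being pressed) are assumptions made for the example; the description does not specify the dataset's CSV column names or its exact flight time definition.

```python
from collections import defaultdict

# Hypothetical raw events, in typing order:
# (virtual_key_code, press_time_ms, release_time_ms)
events = [
    (72, 0, 95),     # H
    (79, 180, 260),  # O
    (76, 340, 430),  # L
    (65, 520, 600),  # A
]


def timing_features(events):
    """Group hold and flight times by (virtual key code, feature).

    Hold time   = release - press of the same key.
    Flight time = press of this key - release of the previous key
                  (one common convention, assumed here).
    """
    grouped = defaultdict(list)
    previous_release = None
    for vk, press, release in events:
        grouped[(vk, "hold")].append(release - press)
        if previous_release is not None:
            grouped[(vk, "flight")].append(press - previous_release)
        previous_release = release
    return dict(grouped)


features = timing_features(events)
for (vk, feature), values in sorted(features.items()):
    print(vk, feature, values)
```

The first key in a sequence has no preceding release, so it contributes a hold time but no flight time.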
The original KM dataset was used to compare anomaly-detection algorithms for keystroke dynamics in the article "Comparing anomaly-detection algorithms for keystroke dynamics" by Killourhy, K.S. and Maxion, R.A. The original PROSODY dataset was used to find cues of deceptive intent by analyzing variations in typing patterns in the article "Keystroke patterns as prosody in digital writings: A case study with deceptive reviews and essays" by Banerjee, R., Feng, S., Kang, J.S., and Choi, Y.
After evaluating seven distributions, with two-parameter and three-parameter forms considered separately, the results confirm the established use in the research community of the log-normal distribution, in its two- and three-parameter variations, as an excellent choice for modeling the shape of timing histograms in free text keystroke dynamics profiles. However, the log-logistic distribution emerges as a clear winner among all two- and three-parameter candidates, consistently surpassing the log-normal and all the other candidates under the three evaluation criteria for both hold and flight times. It was also shown that tasks and topics do not influence the shape of timing histograms enough to distinguish them, even though the values of their parameters can, as shown in the article by Banerjee, R. et al.
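To make the comparison between candidate distributions concrete, the following minimal sketch fits two-parameter log-normal and log-logistic distributions to a sample of timings and compares their log-likelihoods. Several things here are assumptions rather than the article's method: the data are synthetic (drawn from a log-normal, not read from the dataset), the log-logistic parameters are estimated by matching moments on the log scale rather than by maximum likelihood, and log-likelihood is used only as an illustrative criterion, since this description does not name the three evaluation criteria.

```python
import math
import random
import statistics


def fit_lognormal(samples):
    """Maximum-likelihood fit: mean and population std of log(x)."""
    logs = [math.log(x) for x in samples]
    return statistics.fmean(logs), statistics.pstdev(logs)


def fit_loglogistic(samples):
    """Moment-matching fit on the log scale (an assumption, not MLE).

    If log(x) ~ Logistic(mu, s), then Var[log x] = (s * pi)^2 / 3.
    """
    logs = [math.log(x) for x in samples]
    mu = statistics.fmean(logs)
    s = statistics.pstdev(logs) * math.sqrt(3) / math.pi
    return mu, s


def lognormal_loglik(samples, mu, sigma):
    total = 0.0
    for x in samples:
        z = (math.log(x) - mu) / sigma
        total += -math.log(x * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z
    return total


def loglogistic_loglik(samples, mu, s):
    total = 0.0
    for x in samples:
        # The logistic density is symmetric in z, so |z| keeps exp() stable.
        z = abs((math.log(x) - mu) / s)
        total += -z - math.log(x * s) - 2 * math.log1p(math.exp(-z))
    return total


# Synthetic "hold times" in milliseconds, for illustration only.
random.seed(0)
samples = [random.lognormvariate(4.5, 0.3) for _ in range(2000)]

ln_mu, ln_sigma = fit_lognormal(samples)
ll_mu, ll_s = fit_loglogistic(samples)

print("log-normal   log-likelihood:", lognormal_loglik(samples, ln_mu, ln_sigma))
print("log-logistic log-likelihood:", loglogistic_loglik(samples, ll_mu, ll_s))
```

With real profiles, the same comparison would be run per user and per feature on the hold and flight times from the CSV files; a higher log-likelihood indicates a better fit for models with the same number of parameters.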
Created:
2021-03-03



