
BDSHWA: Bengali–English Online Handwriting Dataset for Forensic Biometric Analysis

Source: DataCite Commons. Updated 2026-04-28; indexed 2026-05-04.
Download link:
https://data.mendeley.com/datasets/99t9jhvksv/1
Description:
We hypothesize that online handwriting kinematics (pressure, tilt, velocity, temporal dynamics) encode individual-specific neuromuscular patterns that are considerably more discriminative than static images alone. Fusing these kinematic features with deep CNN visual representations achieves forensic-grade performance under true participant-level generalization on writer identification, gender prediction, and age-group classification. BDSHWA comprises over 1,338 samples from 29 participants, recorded on a Wacom Cintiq 22 at 100 Hz.

Key results: writer identification reaches 95.7% accuracy (macro AUC 0.999), with the Topic + Freehand category combination performing best. Gender prediction reaches 91.7% under leave-one-participant-out cross-validation (CV), with the best predictions on sentence-based samples and pressure statistics (kurtosis, mean, max/min, IQR) among the reported features. Age-group classification reaches 80.0% under leave-one-out CV, where CNN-combined features alone suffice and sentence-based samples again give the best predictions. Pen pressure statistics consistently rank as the top discriminative features across all tasks, which is consistent with forensic examination principles.

BDSHWA is presented as the first bilingual Bengali–English online handwriting dataset, collected in a single acquisition. Participant-level CV indicates a gap of roughly 20–25 accuracy points relative to sample-level gender studies. An exhaustive 63-combination search yields the optimal category set for each task. Early fusion performs best for writer identification and gender prediction, whereas CNN-combined features alone suffice for age, suggesting that age-related differences are visually salient while identity and gender cues lie in subtler kinematic detail. Statistical significance is supported by permutation tests (p < 0.001) and bootstrap confidence intervals.

The BDSHWA dataset was collected from 29 university students (17 male, 12 female; 59% male / 41% female; ages 20–26, mean 22.5 years; 28 right-handed, 1 left-handed). Two age groups are used in the primary classification experiments: 20–22 (n = 14) and 23–25 (n = 11), covering the 25 participants retained for the age-group task after quality filtering.
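The participant-level evaluation described above can be illustrated with a minimal sketch of leave-one-participant-out splitting: all samples from one writer are held out together, so no writer appears in both training and test data. The function name, the participant IDs, and the toy sample list are hypothetical and are not taken from the dataset; scikit-learn's `LeaveOneGroupOut` provides equivalent splitting if that library is available.

```python
from collections import defaultdict


def leave_one_participant_out(participant_ids):
    """Yield (held_out_id, train_indices, test_indices), one fold per participant.

    participant_ids[i] is the writer of sample i (hypothetical layout).
    """
    by_participant = defaultdict(list)
    for index, pid in enumerate(participant_ids):
        by_participant[pid].append(index)

    for held_out in sorted(by_participant):
        # Every sample of the held-out writer goes to the test set.
        test_idx = list(by_participant[held_out])
        # Samples of all other writers form the training set.
        train_idx = [
            i
            for pid, indices in by_participant.items()
            if pid != held_out
            for i in indices
        ]
        yield held_out, train_idx, test_idx


# Hypothetical toy data: six samples written by three participants.
sample_participants = ["P01", "P01", "P02", "P02", "P03", "P03"]
folds = list(leave_one_participant_out(sample_participants))

for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Splitting by sample instead would let the same writer's strokes appear on both sides of the split, which is the usual reason sample-level and participant-level accuracies differ.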
The gender split (59% male / 41% female) gives roughly balanced representation for binary classification (chance = 50%), and the two-group age partition likewise defines a binary task with chance = 50%. A custom Python 3.10/Tkinter application was written from scratch to collect data on a Wacom Cintiq 22 (DTK-2260) with an Intuos Pro Pen (KP-503E). Each participant completed structured writing tasks in six categories, in both Bengali and English: (1) Sentence: five different fixed sentences; (2) Topic: three spontaneous composition tasks; (3) Freehand: five free-form phrases; (4) Word: isolated vocabulary items; (5) Shape: geometric tracing patterns; (6) Wave: rhythmic stroke patterns.
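The 63-combination search mentioned in the description matches the number of non-empty subsets of the six task categories (2^6 − 1 = 63). That reading is an inference rather than something the source states, and the sketch below only enumerates the candidate subsets; how each subset was scored is not described here, so no evaluation step is shown.

```python
from itertools import combinations

# The six task categories listed in the description.
CATEGORIES = ["Sentence", "Topic", "Freehand", "Word", "Shape", "Wave"]


def category_subsets(categories):
    """Return every non-empty subset of the categories as a tuple."""
    subsets = []
    for size in range(1, len(categories) + 1):
        subsets.extend(combinations(categories, size))
    return subsets


all_subsets = category_subsets(CATEGORIES)
print(len(all_subsets))  # 63 = 2**6 - 1
```

Under this assumption, a per-task search would train and evaluate once per subset and keep the best-scoring one, for example `("Topic", "Freehand")` for writer identification.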
Provider: Mendeley Data
Created: 2026-04-28