A Dataset of Inertial Measurement Units for Handwritten Punjabi Alphabets
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/jpzz4gch7z
Description:
The dataset consists of Inertial Measurement Unit (IMU) data corresponding to the 41 characters of the Punjabi alphabet. The data was collected using an MPU-6050 IMU sensor attached to a marker held by the participant while writing. The sensor records linear acceleration along three axes (X, Y, Z) and angular velocity about the same three axes, giving six channels that together describe the motion involved in writing each character.
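As a rough illustration of the six channels described above, one reading can be modeled as a small record. This is only a sketch: the field names, their order, and the units are assumptions, since the description does not specify the file layout.

```python
import math
from typing import NamedTuple


class ImuSample(NamedTuple):
    """One six-axis reading. Field names and order are hypothetical."""
    ax: float  # acceleration along X
    ay: float  # acceleration along Y
    az: float  # acceleration along Z
    gx: float  # angular velocity about X
    gy: float  # angular velocity about Y
    gz: float  # angular velocity about Z

    def accel_magnitude(self) -> float:
        # Euclidean norm of the three acceleration components
        return math.sqrt(self.ax ** 2 + self.ay ** 2 + self.az ** 2)

    def gyro_magnitude(self) -> float:
        # Euclidean norm of the three angular-velocity components
        return math.sqrt(self.gx ** 2 + self.gy ** 2 + self.gz ** 2)


# Made-up values, not taken from the dataset
sample = ImuSample(3.0, 4.0, 0.0, 0.0, 0.0, 2.0)
print(sample.accel_magnitude(), sample.gyro_magnitude())  # 5.0 2.0
```

Magnitudes like these are orientation-independent summaries, which may be convenient when comparing recordings made with different sensor placements.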
Data Collection Process:
Twenty students participated in the data collection. Each student wrote all 41 Punjabi characters under two sensor placements: once with the IMU sensor attached to the upper part of the marker and once with it attached to the lower part. The two placements make it possible to examine whether the sensor's location affects how distinctive the recorded motion patterns are for each character. Each student therefore contributed 82 character-placement combinations (41 characters × 2 sensor positions).
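The counts above can be checked with a short enumeration. The sketch below reproduces the figure of 82 per student and extends it to all 20 students. The description also states that each character was repeated 250 times; the total instance count shown here assumes that figure applies to each sensor position separately, which the text does not say explicitly. The position names "upper" and "lower" are illustrative labels.

```python
from itertools import product

N_STUDENTS = 20
N_CHARACTERS = 41
SENSOR_POSITIONS = ("upper", "lower")  # illustrative names
REPETITIONS = 250  # assumption: 250 repetitions per character per position

# Every (student, character label, sensor position) combination
recordings = list(
    product(
        range(1, N_STUDENTS + 1),
        range(1, N_CHARACTERS + 1),
        SENSOR_POSITIONS,
    )
)

per_student = N_CHARACTERS * len(SENSOR_POSITIONS)
total_sets = len(recordings)
total_instances = total_sets * REPETITIONS

print(per_student, total_sets, total_instances)  # 82 1640 410000
```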
The data collection experiment ran over four months. During each session, the students wrote the characters on a whiteboard, repeating each character 250 times. To mark the start and end of each writing instance, students were given a button, which they pressed just before beginning a character and released on completing it. The IMU sensor recorded all 250 instances of each character written by each student, so the dataset contains many repetitions of every character, which makes it well suited to analysis and modeling.
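The button press and release described above suggest a simple way to cut a continuous recording into individual writing instances. The following is a minimal sketch, assuming the button state is available as one truthy or falsy value per IMU sample; the description does not say how the button signal is stored, so `segment_by_button` and its input format are hypothetical.

```python
from typing import List, Sequence, Tuple


def segment_by_button(button: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices, end exclusive, for each pressed run."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, pressed in enumerate(button):
        if pressed and start is None:
            start = i  # button just went down: instance begins
        elif not pressed and start is not None:
            segments.append((start, i))  # button released: instance ends
            start = None
    if start is not None:
        # Button still held when the stream ends: close the last segment
        segments.append((start, len(button)))
    return segments


# Made-up button trace, not taken from the dataset
button = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
segments = segment_by_button(button)
print(segments)  # [(1, 4), (6, 8), (9, 10)]
```

Each returned pair can then be used to slice the six IMU channels for one writing instance, and `end - start` gives its duration in samples.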
Labeling:
The data is labeled according to the sequence of the Punjabi alphabet, with each character assigned a unique integer label: the first character is labeled '1', the second '2', and so on up to '41'. This labeling allows for easy identification and classification of the characters within the dataset. For example, '1' represents the first letter, "ੳ" (Ura), and '2' represents the second letter, "ਅ" (Aira).
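Because the labels run from 1 to 41, while many machine learning libraries expect class indices starting at 0, a small conversion step is often useful. The helpers below are a sketch of that conversion; the function names are illustrative and not part of the dataset.

```python
N_CLASSES = 41  # number of labeled Punjabi characters in the dataset


def label_to_index(label: int) -> int:
    """Map a dataset label (1..41) to a zero-based class index (0..40)."""
    if not 1 <= label <= N_CLASSES:
        raise ValueError(f"label must be in 1..{N_CLASSES}, got {label}")
    return label - 1


def index_to_label(index: int) -> int:
    """Map a zero-based class index (0..40) back to a dataset label (1..41)."""
    if not 0 <= index < N_CLASSES:
        raise ValueError(f"index must be in 0..{N_CLASSES - 1}, got {index}")
    return index + 1


print(label_to_index(1), label_to_index(41))  # 0 40
```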
Data Interpretation and Usage:
This dataset can be used to develop and train machine learning models, particularly for pattern recognition and handwriting recognition. Researchers and developers can use this data for:
Character Recognition: Train models to recognize and classify Punjabi characters based on the IMU data.
Sensor Analysis: Study the effect of sensor positioning on the accuracy of character recognition and explore methods to compensate for these variations.
Handwriting Dynamics: Analyze the dynamics of handwriting, such as speed, pressure, and motion trajectories, as recorded by the IMU sensor.
This dataset provides a valuable resource for exploring the use of inertial sensors in handwriting recognition, specifically for the Punjabi alphabet.
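As one possible starting point for the character recognition use case, the sketch below extracts simple per-channel statistics from a writing instance and classifies it with a nearest-centroid rule. This is a baseline chosen for brevity, not a method described by the dataset authors. The toy instances are synthetic, and the representation of an instance as a list of six-value rows is an assumption.

```python
import math
from statistics import fmean, pstdev
from typing import Dict, List, Sequence

Instance = Sequence[Sequence[float]]  # T rows, each with 6 channel values


def extract_features(instance: Instance) -> List[float]:
    """Per-channel mean and population standard deviation (12 values)."""
    channels = list(zip(*instance))  # transpose rows into 6 channel series
    return [stat(ch) for ch in channels for stat in (fmean, pstdev)]


def fit_centroids(features: List[List[float]], labels: List[int]) -> Dict[int, List[float]]:
    """Average the feature vectors of each label into one centroid."""
    grouped: Dict[int, List[List[float]]] = {}
    for vec, label in zip(features, labels):
        grouped.setdefault(label, []).append(vec)
    return {
        label: [fmean(col) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids: Dict[int, List[float]], feature: List[float]) -> int:
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], feature))


def toy_instance(level: float) -> List[List[float]]:
    # Synthetic data for illustration only: 5 time steps, 6 identical channels
    return [[level + 0.1 * t] * 6 for t in range(5)]


train = [toy_instance(0.0), toy_instance(0.2), toy_instance(5.0), toy_instance(5.2)]
labels = [1, 1, 2, 2]
centroids = fit_centroids([extract_features(x) for x in train], labels)
prediction = predict(centroids, extract_features(toy_instance(4.8)))
print(prediction)  # 2
```

The same pipeline could be trained on one sensor position and evaluated on the other to probe the sensor-placement question raised above. In practice, time-series models would likely outperform summary statistics like these.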
Created:
2024-08-13



