mohibkhansherwani/DynamicWordLevelPakistanSignLanguageDataset

Name: mohibkhansherwani/DynamicWordLevelPakistanSignLanguageDataset
Creator: mohibkhansherwani
Published: 2026-03-27 13:17:59
License: No description provided

Hugging Face | Updated 2026-03-27 | Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/mohibkhansherwani/DynamicWordLevelPakistanSignLanguageDataset

Description:

---
license: apache-2.0
task_categories:
- tabular-classification
- feature-extraction
language:
- ur
- en
tags:
- sign-language
- mediapipe
- landmarks
- pakistan
- gesture-recognition
pretty_name: Dynamic Word Level Pakistan Sign Language (PSL) Dataset
size_categories:
- 1K<n<10K
---

# Dynamic Word Level Pakistan Sign Language (PSL) Dataset

## Overview

This dataset contains **MediaPipe hand landmark sequences** for 60+ words in Pakistan Sign Language (PSL). It is designed to support research into dynamic, word-level gesture recognition. Unlike existing datasets that focus on static finger spelling or small vocabularies, this project provides a research-ready landmark collection for word-level sign language translation.

**Kaggle Link** https://www.kaggle.com/datasets/mohib123456/dynamic-word-level-pakistan-sign-language-dataset/data

**Github Project Link** https://github.com/MohibUllahKhanSherwani/SignSpeak_FYP

### Dataset Statistics

- **Total Signs:** 60+ unique word classes.
- **Samples per Sign:** 70 samples per sign in total (50 in `MP_Data` plus 20 in `MP_Data_mobile`).
- **Sequence Length:** Each gesture is fixed at 60 frames for temporal consistency.
- **Format:** Landmark coordinate data (X, Y, Z) extracted via MediaPipe.

## Dataset Structure & Generalization

The data is organized into two primary subsets so that models can be trained across different hardware and camera types:

1. **MP_Data:** contains 50 samples per sign recorded using standard webcams and fixed desktop cameras.
2. **MP_Data_mobile:** contains 20 samples per sign recorded using mobile phone cameras to introduce varied angles, motion blur, and lighting.

By training on both sets, models are less likely to overfit to a specific lens or environment, making them more suitable for real-world mobile applications like **SignSpeak**.

## Motivation & Rationale

Most existing PSL datasets are limited to static signs or small vocabularies. We collected this word-level dataset specifically for the SignSpeak project.

- **Privacy:** No raw video recordings were saved, to protect participant identity. Only anonymized MediaPipe landmarks were kept.
- **Scale:** A larger vocabulary (60+ words) to move beyond simple finger-spelling.

## Usage

This data is suited to training sequence-based architectures such as **LSTMs, GRUs, or Transformers** for temporal classification. The primary use case is building real-time sign language translation tools for the Deaf community in Pakistan.

### Loading the Data

For details on how to use the dataset, along with sample loading and training scripts, please visit: https://www.kaggle.com/datasets/mohib123456/dynamic-word-level-pakistan-sign-language-dataset/data

## Authors

This dataset was curated by the **SignSpeak** team as part of our Final Year Project at **COMSATS University Islamabad, Abbottabad Campus**.

- **Main Author:** Mohib Ullah Khan Sherwani
- **Repository:** https://github.com/MohibUllahKhanSherwani/SignSpeak-FYP

## License

This dataset is released under the **Apache License 2.0**. You are free to use, modify, and distribute this data for both research and commercial purposes, provided that attribution is given to the authors.
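
The card describes two subsets (`MP_Data` and `MP_Data_mobile`) with 60-frame sequences, but it does not spell out the on-disk layout. The following is a minimal sketch of how the two subsets might be merged into one training array, under a loudly labeled assumption: each subset is laid out as `<subset>/<sign>/<sample>/<frame>.npy`, with one flat landmark vector per frame. The function names `load_subset` and `load_combined` are hypothetical, and the layout should be checked against the Kaggle loading script before relying on it.

```python
from pathlib import Path

import numpy as np

SEQUENCE_LENGTH = 60  # frames per gesture, as stated in the dataset card


def load_subset(root, sequence_length=SEQUENCE_LENGTH):
    """Load one subset into (sequences, sign_names).

    ASSUMPTION: files are stored as root/<sign>/<sample>/<frame>.npy,
    where <frame> runs from 0 to sequence_length - 1. This layout is
    not confirmed by the dataset card.
    """
    root = Path(root)
    sequences, names = [], []
    for sign_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for sample_dir in sorted(p for p in sign_dir.iterdir() if p.is_dir()):
            # Stack the per-frame landmark vectors into (frames, features).
            frames = [np.load(sample_dir / f"{i}.npy") for i in range(sequence_length)]
            sequences.append(np.stack(frames))
            names.append(sign_dir.name)
    return np.asarray(sequences, dtype=np.float32), np.asarray(names)


def load_combined(roots):
    """Concatenate several subsets and encode sign names as integer labels."""
    xs, ys = zip(*(load_subset(r) for r in roots))
    X = np.concatenate(xs)  # shape: (samples, frames, features)
    classes, y = np.unique(np.concatenate(ys), return_inverse=True)
    return X, y, classes


# Example call (paths are placeholders for wherever the data was extracted):
# X, y, classes = load_combined(["MP_Data", "MP_Data_mobile"])
```

The resulting `X` has shape `(samples, 60, features)`, which is the usual input shape for the sequence models mentioned under Usage. Keeping the subset roots as a list makes it straightforward to train on one subset and evaluate on the other when testing cross-camera generalization.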

Provider:

mohibkhansherwani
