POS Tagging on Handwritten Sindhi Sentences
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/phk66sgmp5
Description:
This dataset consists of high-resolution images of handwritten Sindhi sentences, curated for tasks such as Part-of-Speech (POS) tagging and Named Entity Recognition (NER). It is intended to support research and development in natural language processing (NLP) and optical character recognition (OCR) for low-resource languages such as Sindhi.
Key Features:
Language: Sindhi (script-based with unique linguistic characteristics).
Dataset Size: More than 1,000 labeled images covering diverse handwriting styles.
Annotations: Each image is manually annotated for both POS tagging and NER.
Applications: Suitable for training and evaluating machine learning models in NLP, OCR, and language understanding.
Diversity: Includes variations in sentence length, word structure, and handwriting styles to mimic real-world scenarios.
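The listing does not document the annotation file format, so the following is only a minimal sketch of how one annotated sentence might be represented once loaded. The class name `AnnotatedSentence`, its field names, the image path, the Universal-Dependencies-style POS tags, and the BIO-style NER tags are all assumptions made for illustration; the tokens are placeholders, not real dataset content.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class AnnotatedSentence:
    """One handwritten-sentence image with token-level labels (hypothetical layout)."""
    image_path: str        # assumed: path to the scanned sentence image
    tokens: list[str]      # assumed: transcribed words of the sentence
    pos_tags: list[str]    # assumed: one POS tag per token
    ner_tags: list[str]    # assumed: one BIO-style NER tag per token

    def __post_init__(self) -> None:
        # Token-level labels are only usable if every token has exactly
        # one POS tag and one NER tag.
        if not (len(self.tokens) == len(self.pos_tags) == len(self.ner_tags)):
            raise ValueError(f"{self.image_path}: tokens and tags differ in length")


def pos_tag_counts(records: list[AnnotatedSentence]) -> Counter:
    """Count how often each POS tag occurs across a list of records."""
    counts: Counter = Counter()
    for record in records:
        counts.update(record.pos_tags)
    return counts


# Placeholder record; tokens and tags are illustrative only.
sample = AnnotatedSentence(
    image_path="images/sentence_0001.png",
    tokens=["tok1", "tok2", "tok3"],
    pos_tags=["PROPN", "NOUN", "VERB"],
    ner_tags=["B-PER", "O", "O"],
)
print(pos_tag_counts([sample]))
```

Checking that tokens and tags have equal length at load time catches misaligned annotations before they reach a POS or NER model.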
Created:
2024-12-30



