
POS Tagging on Handwritten Sindhi Sentences

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/phk66sgmp5
Description:
This dataset consists of high-resolution images of handwritten Sindhi sentences, meticulously curated for tasks such as Part-of-Speech (POS) tagging and Named Entity Recognition (NER). The dataset aims to facilitate research and development in natural language processing (NLP) and optical character recognition (OCR) for low-resource languages like Sindhi.

Key Features:
Language: Sindhi (script-based with unique linguistic characteristics).
Dataset Size: Contains 1000+ labeled images with diverse handwriting styles.
Annotations: Each image is manually annotated for POS tagging and NER tasks, ensuring high accuracy.
Applications: Suitable for training and evaluating machine learning models in NLP, OCR, and language understanding.
Diversity: Includes variations in sentence length, word structure, and handwriting styles to mimic real-world scenarios.
Created:
2024-12-30