
Sindhi WordNet Dataset

Source: IEEE (indexed 2026-04-17)
Download link:
https://ieee-dataport.org/documents/sindhi-wordnet-dataset
Resource description:
Developed by the Abdul Majid Bhurgri Institute of Language Engineering (AMBILE), Hyderabad, under the administrative control of the Culture, Tourism, Antiquities & Archives Department, Government of Sindh.

Overview

The Sindhi WordNet Tagging Dataset contains a collection of Sindhi words, annotated with various linguistic features such as categories, tenses, synonyms, antonyms, and more. This dataset is designed for natural language processing (NLP) tasks, particularly for tasks such as word sense disambiguation, semantic analysis, and syntactic tagging for the Sindhi language.

Dataset Structure

The dataset is provided in CSV format and contains the following columns:

- word_id: A unique identifier for each word entry.
- word: The Sindhi word.
- category: The part of speech or syntactic category of the word (e.g., noun, verb, adjective).
- gender: The gender associated with the word (if applicable).
- invariants: Information about whether the word is invariant (e.g., whether it has plural or singular forms).
- tags: The syntactic or semantic tag associated with the word (e.g., conjunction, preposition).
- tenses: The tense information for the word (if applicable).
- hyp: Any hypernyms associated with the word.
- antonyms: Antonyms for the word (if available).
- synonyms: Synonyms for the word (if available).

Example

word_id | word | category | gender | invariants | tags | tenses | hyp | antonyms | synonyms
1 | ۽ | - | - | - | con | - | - | - | -
2 | ۾ | - | - | - | pp | - | - | - | -
3 | اَبَدُ | - | - | singular | noun,adv | - | - | - | -
4 | اَبَدِي | - | - | singular | adj | - | - | - | -
5 | اَبَدِيت | - | - | singular | noun | - | - | - | -

Features

- Comprehensive word annotations for various linguistic categories.
- Multiple syntactic and semantic features such as tense, gender, synonyms, and antonyms.
- Designed for Sindhi NLP tasks, helping improve language processing for the Sindhi language.

Usage

This dataset can be used for a wide range of NLP tasks such as:

- Part-of-speech tagging.
- Word sense disambiguation.
- Semantic analysis.

You can load and process this dataset using any standard CSV reader in your preferred programming language (e.g., Python's pandas).

Acknowledgments

Special thanks to the AMBILE team for their efforts in data curation, cleaning, formatting, and tagging.

Data Source

The dataset is sourced from the AMBILE WordNet project.

License

This dataset is released under the Creative Commons Attribution-NonCommercial 4.0 License. It is intended for educational and research purposes only.

Contact

For any queries, collaboration opportunities, or contributions, please contact:

Email: datasets@sindh.ai
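As a rough illustration of the loading step mentioned under Usage, the sketch below reads rows with the columns listed in the description and filters them by tag. It uses Python's built-in csv module rather than pandas so that it has no dependencies. Several things here are assumptions, not facts from the dataset itself: that the CSV has a header row with these column names, that "-" marks an empty cell, and that multi-valued cells such as "noun,adv" are quoted and comma-separated. The sample rows are copied from the example in the description; the helper names `load_entries` and `words_with_tag` are hypothetical.

```python
import csv
import io

# Sample rows copied from the example in the description. The exact CSV layout
# (header row, "-" for empty cells, quoted comma-separated tags) is an assumption.
SAMPLE_CSV = """word_id,word,category,gender,invariants,tags,tenses,hyp,antonyms,synonyms
1,۽,-,-,-,con,-,-,-,-
2,۾,-,-,-,pp,-,-,-,-
3,اَبَدُ,-,-,singular,"noun,adv",-,-,-,-
4,اَبَدِي,-,-,singular,adj,-,-,-,-
5,اَبَدِيت,-,-,singular,noun,-,-,-,-
"""

# Columns assumed to hold zero or more comma-separated values.
MULTI_VALUED = ("tags", "hyp", "antonyms", "synonyms")


def load_entries(handle):
    """Read rows, mapping "-" to None (or an empty list) and splitting list-like columns."""
    entries = []
    for row in csv.DictReader(handle):
        entry = {}
        for key, value in row.items():
            value = (value or "").strip()
            if value in ("", "-"):
                entry[key] = [] if key in MULTI_VALUED else None
            elif key in MULTI_VALUED:
                entry[key] = [part.strip() for part in value.split(",")]
            else:
                entry[key] = value
        entry["word_id"] = int(entry["word_id"])
        entries.append(entry)
    return entries


def words_with_tag(entries, tag):
    """Return the words whose tags list contains the given tag."""
    return [e["word"] for e in entries if tag in e["tags"]]


entries = load_entries(io.StringIO(SAMPLE_CSV))
nouns = words_with_tag(entries, "noun")
print(len(entries), nouns)
```

To read the downloaded file instead of the inline sample, the same function can be given a file opened with `encoding="utf-8"` and `newline=""`; the actual file name and encoding should be checked against the download.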
Provider:
Abdul Majid Bhurgri Institute of Language Engineering Hyderabad