POS Hindi English
Source: India Data. Updated: 2025-07-23. Indexed: 2026-05-16.
Download link:
https://india-data.org/dataset-details/b3425958-3c96-4aa9-a61b-62d35ef0a9ea
Resource description:
Each token in the dataset is labeled with a language code such as 'en' for English, 'hi' for Hindi, and 'rest' for other or unclassified tokens. The corresponding entity tags follow the BIO tagging format, marking the beginning (B), inside (I), or outside (O) of named entities such as PERSON, PLACE, or ORGANISATION. The data is sourced from social media platforms such as Twitter and includes mentions, hashtags, emojis, and links, reflecting real-world language complexity.
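The labeling scheme described above can be illustrated with a minimal Python sketch. The listing does not specify the file layout, so the in-memory format here (a list of token, language code, BIO tag triples), the sample sentence, and the function names `extract_entities` and `language_counts` are all assumptions made for illustration, not part of the dataset.

```python
from collections import Counter
from typing import List, Tuple

Row = Tuple[str, str, str]  # (token, language code, BIO entity tag)

# Hypothetical sample rows; invented for illustration, not taken from the dataset.
SAMPLE: List[Row] = [
    ("@rahul", "rest", "O"),
    ("kal", "hi", "O"),
    ("New", "en", "B-PLACE"),
    ("Delhi", "en", "I-PLACE"),
    ("ja", "hi", "O"),
    ("raha", "hi", "O"),
    ("hai", "hi", "O"),
    ("#travel", "rest", "O"),
]


def extract_entities(rows: List[Row]) -> List[Tuple[str, str]]:
    """Group BIO tags into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    etype = ""
    for token, _lang, tag in rows:
        if tag.startswith("B-"):
            # A B- tag always opens a new entity; close any open one first.
            if tokens:
                entities.append((" ".join(tokens), etype))
            tokens, etype = [token], tag[2:]
        elif tag.startswith("I-") and tokens and tag[2:] == etype:
            # An I- tag extends the open entity only if the type matches.
            tokens.append(token)
        else:
            # O tags (and stray I- tags with no matching open entity)
            # close whatever entity is open and are otherwise ignored.
            if tokens:
                entities.append((" ".join(tokens), etype))
            tokens, etype = [], ""
    if tokens:
        entities.append((" ".join(tokens), etype))
    return entities


def language_counts(rows: List[Row]) -> Counter:
    """Count tokens per language code ('en', 'hi', 'rest', ...)."""
    return Counter(lang for _token, lang, _tag in rows)


if __name__ == "__main__":
    print(extract_entities(SAMPLE))
    print(language_counts(SAMPLE))
```

In this sketch, mentions and hashtags carry the 'rest' language code and an O tag, which matches the description's treatment of unclassified tokens; how the actual dataset tags such tokens should be checked against the downloaded files.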
Provider:
Natural Language Processing (NLP)
Created:
2025-06-03



