mteb/HinDialectClassification
Source: Hugging Face. Updated: 2025-05-06. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/mteb/HinDialectClassification
Description:
HinDialectClassification is a dataset of 26 Hindi-related languages and dialects of the Indic Continuum in North India. The dataset is expert-annotated and is licensed under CC BY-SA 4.0. It is a monolingual dataset with features including text and label, and is split into train and test sets. The dataset is associated with the MTEB benchmark and can be evaluated using the MTEB framework.
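As a rough illustration of the structure described above, the following sketch loads the train and test splits and counts examples per label. It assumes the Hugging Face `datasets` package is installed and that the repository name `mteb/HinDialectClassification` resolves on the Hub with `text` and `label` columns, as the description states. The helper names `load_splits` and `label_counts` are hypothetical and are not part of the dataset or of MTEB.

```python
from collections import Counter

# Repository name taken from the listing; assumed to resolve on the Hugging Face Hub.
DATASET_ID = "mteb/HinDialectClassification"


def label_counts(rows):
    """Count examples per label in an iterable of records that have a "label" key."""
    return Counter(row["label"] for row in rows)


def load_splits(dataset_id=DATASET_ID):
    """Download the dataset and return its (train, test) splits.

    The split names "train" and "test" follow the description above.
    """
    # Imported lazily so that label_counts can be used without the `datasets` package.
    from datasets import load_dataset

    dataset = load_dataset(dataset_id)
    return dataset["train"], dataset["test"]
```

With network access, `train, test = load_splits()` followed by `label_counts(train)` would give a per-dialect example count, which is a quick way to check class balance before running an evaluation.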
Provider:
mteb



