HiTZ/EuskanolDS
Hugging Face | Updated 2025-07-17 | Indexed 2025-08-09
Download link:
https://hf-mirror.com/datasets/HiTZ/EuskanolDS
Description:
EuskañolDS is a naturally sourced corpus for Basque-Spanish code-switching. It is built by filtering code-switched texts out of publicly available Basque and Spanish corpora with language identification models, followed by manual validation. It comprises two subsets: the automatically filtered Silver subset and the manually validated Gold subset.
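To make the filtering idea concrete, the following is a minimal sketch of how a language-identification pass could flag candidate code-switched sentences for a Silver-style subset. It is an illustration under stated assumptions, not the HiTZ pipeline: the toy lexicon, the function names (`toy_token_lid`, `is_code_switched`, `build_silver`), and the `min_tokens_per_lang` threshold are all hypothetical, and a real system would use a trained language identification model instead of a word list.

```python
from collections import Counter
from typing import Callable, Iterable, List

# Hypothetical toy lexicon standing in for a trained language identification model.
TOY_LEXICON = {
    "bihar": "eu", "etxera": "eu", "noa": "eu", "eta": "eu", "naiz": "eu",
    "pero": "es", "tengo": "es", "que": "es", "trabajar": "es", "mañana": "es",
}

def toy_token_lid(token: str) -> str:
    """Return 'eu', 'es', or 'unk' for one token (illustrative only)."""
    return TOY_LEXICON.get(token.lower().strip(".,;:!?¿¡"), "unk")

def is_code_switched(
    sentence: str,
    lid: Callable[[str], str] = toy_token_lid,
    min_tokens_per_lang: int = 2,  # assumed threshold, not taken from the source
) -> bool:
    """Flag a sentence that has enough tokens in both Basque and Spanish."""
    counts = Counter(lid(tok) for tok in sentence.split())
    return (
        counts["eu"] >= min_tokens_per_lang
        and counts["es"] >= min_tokens_per_lang
    )

def build_silver(sentences: Iterable[str]) -> List[str]:
    """Keep automatically flagged sentences; a Gold-style subset would then be checked by hand."""
    return [s for s in sentences if is_code_switched(s)]

candidates = [
    "Bihar etxera noa pero tengo que trabajar.",  # mixed Basque and Spanish
    "Bihar etxera noa.",                          # Basque only
    "Tengo que trabajar mañana.",                 # Spanish only
]
silver = build_silver(candidates)
```

In this sketch only the first candidate is kept, since it is the only sentence with at least two tokens identified in each language. Requiring a minimum token count per language is one simple way to avoid flagging sentences that merely contain a single borrowed word or named entity.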
Provider:
HiTZ



