Michielo/Merged-LID-20
Source: Hugging Face; updated 2025-01-17; indexed 2025-02-15
Download link:
https://hf-mirror.com/datasets/Michielo/Merged-LID-20
Description:
Merged-LID-20 is a collection of datasets optimized for building and training language identification models. It comprises language-specific datasets for 20 languages. Each one pairs text samples with their corresponding language labels, making the collection suited to language identification in multilingual natural language processing tasks.
Provider:
Michielo



