BeitTigreAI/tigre-data-fasttext

Name: BeitTigreAI/tigre-data-fasttext
Creator: BeitTigreAI
Published: 2025-11-19 01:34:49
License: not specified

Hugging Face, updated 2025-11-19, indexed 2025-11-30

Download link:

https://hf-mirror.com/datasets/BeitTigreAI/tigre-data-fasttext
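A minimal sketch of fetching the repository from the link above with Python, assuming the `huggingface_hub` package is installed and recent enough for `snapshot_download` to accept an `endpoint` argument. The helper names `repo_id_from_dataset_url` and `download_dataset` and the default `local_dir` are hypothetical, chosen for illustration. The listing does not describe the files inside the repository, so the sketch downloads everything rather than naming a specific file.

```python
from urllib.parse import urlparse

# Mirror link taken from this listing.
MIRROR_URL = "https://hf-mirror.com/datasets/BeitTigreAI/tigre-data-fasttext"


def repo_id_from_dataset_url(url: str) -> str:
    """Extract "<owner>/<name>" from a ".../datasets/<owner>/<name>" URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def download_dataset(url: str, local_dir: str = "tigre-data-fasttext") -> str:
    """Download every file in the dataset repository; return the local path."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    parsed = urlparse(url)
    return snapshot_download(
        repo_id=repo_id_from_dataset_url(url),
        repo_type="dataset",  # the link points at a dataset repo, not a model repo
        local_dir=local_dir,
        # Assumption: send requests to the mirror host that appears in the link.
        endpoint=f"{parsed.scheme}://{parsed.netloc}",
    )
```

Calling `download_dataset(MIRROR_URL)` would then place the repository contents in `local_dir`. If the installed `huggingface_hub` version does not support `endpoint`, setting the `HF_ENDPOINT` environment variable to the mirror host before importing the library is the usual alternative.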

Resource description:

The Tigre Word Embedding Models (FastText) dataset is the first comprehensive public collection of resources for Tigre, an under-studied and under-resourced South Semitic language in the Afro-Asiatic family. It covers multiple modalities (text and speech) and provides baseline models for several core NLP tasks, including language modeling, automatic speech recognition (ASR), and machine translation. These models can serve as a starting point for downstream NLP work, particularly for tasks involving this low-resource language.

Provided by:

BeitTigreAI
