zenless-lab/wikicorpus
Source: Hugging Face. Updated: 2024-12-18. Indexed: 2024-12-21.
Download link:
https://hf-mirror.com/datasets/zenless-lab/wikicorpus
Description:
This dataset contains English and Japanese text and is intended mainly for natural language processing tasks such as machine translation and bilingual text analysis. It is split into training, validation, and test sets of 28,061, 3,787, and 3,588 samples, respectively.
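
As a quick sanity check on the split sizes quoted above, the following minimal sketch computes the total sample count and each split's share. The split labels `train`, `validation`, and `test` are illustrative names chosen here, not confirmed identifiers from the dataset itself.

```python
# Split sizes as stated in the dataset description.
# The keys are assumed labels, not confirmed split names from the dataset.
split_sizes = {"train": 28061, "validation": 3787, "test": 3588}

total = sum(split_sizes.values())

# Fraction of all samples that falls into each split.
split_fractions = {name: size / total for name, size in split_sizes.items()}

for name, size in split_sizes.items():
    print(f"{name:<10} {size:>6}  {split_fractions[name]:.1%}")
print(f"{'total':<10} {total:>6}")
```

The counts sum to 35,436 samples, which works out to a split of roughly 79% / 11% / 10%.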
Provider:
zenless-lab



