britllm/TransWebEdu
Source: Hugging Face | Updated: 2025-04-22 | Indexed: 2025-08-09
Download link:
https://hf-mirror.com/datasets/britllm/TransWebEdu
Description:
TransWebEdu is a pretraining-scale multilingual parallel corpus covering ten languages: Arabic, Welsh, German, English, Spanish, French, Indonesian, Italian, Russian, and Swahili. It is used specifically to pretrain the TransWebLLM model from scratch and focuses on multilingual, web-based educational content.
Provider:
britllm



