BlackKakapo/sentences-ro
Source: Hugging Face | Updated: 2025-09-25 | Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/BlackKakapo/sentences-ro
Description:
A large-scale corpus of more than 700 million Romanian sentences, intended for masked language modeling, continual pretraining, semantic textual similarity, and other NLP tasks. The sentences were produced by processing and splitting publicly available text datasets, with the stated goal of consistent sentence quality and standardization.
Provider:
BlackKakapo
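
With more than 700 million sentences, the corpus is probably easier to inspect by streaming than by downloading it in full. The following is a minimal sketch, not the provider's documented usage: it assumes the Hugging Face `datasets` package is installed, that the dataset exposes a `train` split, and that each row stores its sentence in a column named `text`. Neither the split name nor the column name is confirmed by this listing, and `take_sentences` and `stream_corpus` are hypothetical helper names.

```python
from itertools import islice
from typing import Iterable, Iterator, Mapping


def take_sentences(
    rows: Iterable[Mapping[str, object]], n: int, field: str = "text"
) -> Iterator[str]:
    """Yield up to n non-empty, whitespace-trimmed sentences from row dicts.

    The default column name "text" is an assumption, not taken from the listing.
    """
    def non_empty() -> Iterator[str]:
        for row in rows:
            value = row.get(field)
            # Skip rows where the field is missing, not a string, or blank.
            if isinstance(value, str) and value.strip():
                yield value.strip()

    return islice(non_empty(), n)


def stream_corpus(repo_id: str = "BlackKakapo/sentences-ro", split: str = "train"):
    """Open the dataset in streaming mode (the split name is an assumption)."""
    # Imported lazily so take_sentences works without the `datasets` package.
    from datasets import load_dataset

    # streaming=True iterates over the remote files lazily
    # instead of downloading the whole corpus first.
    return load_dataset(repo_id, split=split, streaming=True)


if __name__ == "__main__":
    # Requires network access and the `datasets` package.
    for sentence in take_sentences(stream_corpus(), 5):
        print(sentence)
```

If the actual column or split names differ, pass them through the `field` and `split` arguments rather than editing the helpers.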



