fredxlpy/LuxAlign_v1
Source: Hugging Face. Updated 2024-12-17. Indexed 2024-12-21.
Download link:
https://hf-mirror.com/datasets/fredxlpy/LuxAlign_v1
Overview:
LuxAlign is a parallel dataset of Luxembourgish-English and Luxembourgish-French sentence pairs, sourced from news articles published by the Luxembourgish news platform RTL.lu. It is designed to align the Luxembourgish embedding space with those of other languages, thereby improving cross-lingual sentence representations for Luxembourgish. The sentence pairs are not always exact translations; they reflect high semantic similarity instead, so use caution when training machine translation models on this dataset.
Provider:
fredxlpy
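
Loading example:
The following is a minimal sketch of how the dataset might be loaded with the Hugging Face `datasets` library, using the repository id from the download link above. The helper name `load_luxalign` is hypothetical, and the listing does not state the configuration names, split names, or column names, so both arguments default to `None`. If the repository defines separate configurations (for example one per language pair), the call may fail until a configuration name is passed.

```python
# Repository id taken from the download link in this listing.
DATASET_ID = "fredxlpy/LuxAlign_v1"


def load_luxalign(config_name=None, split=None):
    """Load LuxAlign from the Hugging Face Hub (hypothetical helper).

    config_name and split default to None because the listing does not
    name them; inspect the returned object to see what is available.
    """
    # Imported lazily so the module can be read without the dependency;
    # requires `pip install datasets` and network access when called.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, name=config_name, split=split)
```

Calling `load_luxalign()` and printing the result should show the available splits and columns, which is a sensible first check before relying on any field names. Because the pairs are semantically similar rather than exact translations, they are a more natural fit for embedding alignment objectives than for supervised translation training.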



