techiaith/bydtermcymru-tm-en-cy
Source: Hugging Face. Updated 2025-04-07; indexed 2025-04-08.
Download link:
https://hf-mirror.com/datasets/techiaith/bydtermcymru-tm-en-cy
Description:
The BydTermCymru Translation Memories dataset consists of English-Welsh sentence pairs extracted from BydTermCymru translation memories. The data has passed through a text pre-processing pipeline comprising sentence splitting, deduplication, normalization, and language identification (LID) filtering. It is listed as suitable for a range of natural language processing tasks, including language modeling, sentence parsing, semantic segmentation, semantic similarity classification and scoring, and sentiment analysis and classification.
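To give a feel for what two of the pre-processing steps named above involve, here is a minimal Python sketch of normalization and deduplication applied to English-Welsh pairs. It is an illustration only, not the pipeline actually used to build the dataset: the function names `normalize` and `dedupe_pairs`, the choice of NFC normalization with whitespace collapsing, and the sample sentences are all assumptions made for this example. Sentence splitting and LID filtering are left out because they would need external tools.

```python
import unicodedata


def normalize(text: str) -> str:
    """Apply Unicode NFC normalization and collapse runs of whitespace.

    Assumption: the real pipeline's normalization rules are not documented
    here, so this is only one plausible, simple choice.
    """
    return " ".join(unicodedata.normalize("NFC", text).split())


def dedupe_pairs(pairs):
    """Return normalized (en, cy) pairs, keeping the first copy of each.

    Pairs with an empty side after normalization are dropped.
    """
    seen = set()
    result = []
    for en, cy in pairs:
        key = (normalize(en), normalize(cy))
        if not key[0] or not key[1]:
            continue  # skip pairs with a missing source or target sentence
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


# Hypothetical sample pairs, not taken from the dataset.
sample = [
    ("Good  morning.", "Bore da."),
    ("Good morning.", "Bore da. "),  # duplicate once whitespace is normalized
    ("Thank you.", "Diolch."),
    ("", "Diolch."),                 # empty English side, dropped
]

cleaned = dedupe_pairs(sample)
print(cleaned)
```

Normalizing before comparing is what lets the second pair be recognized as a duplicate of the first, since they differ only in spacing.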
Provider:
techiaith



