techiaith/bydtermcymru-tm-en-cy

Source: Hugging Face. Updated: 2025-04-07. Indexed: 2025-04-08.
Download link:
https://hf-mirror.com/datasets/techiaith/bydtermcymru-tm-en-cy
Description:
The BydTermCymru Translation Memories dataset consists of English-Welsh sentence pairs extracted from the BydTermCymru translation memories. The data has been run through a text pre-processing pipeline comprising sentence splitting, deduplication, normalization, and language identification (LID) filtering. It is suitable for a range of natural language processing tasks, including language modeling, parsing, semantic segmentation, semantic similarity classification and scoring, and sentiment analysis and classification.
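The normalization and deduplication steps mentioned in the description could look roughly like the sketch below. This is an illustration only, not the pipeline techiaith actually used: the function names `normalize` and `clean_pairs`, the choice of Unicode NFC plus whitespace collapsing, and the sample pairs are all assumptions. Sentence splitting and LID filtering are left out.

```python
import unicodedata


def normalize(text: str) -> str:
    """Apply Unicode NFC and collapse whitespace runs (an assumed normalization)."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def clean_pairs(pairs):
    """Normalize (en, cy) pairs, drop pairs with an empty side,
    and remove exact duplicates while keeping first-seen order."""
    seen = set()
    cleaned = []
    for en, cy in pairs:
        pair = (normalize(en), normalize(cy))
        if not all(pair) or pair in seen:
            continue
        seen.add(pair)
        cleaned.append(pair)
    return cleaned


# Hypothetical sample data, not taken from the dataset.
sample = [
    ("Good  morning", "Bore da"),
    ("Good morning", "Bore  da "),  # duplicate once whitespace is normalized
    ("Thank you", "Diolch"),
    ("", "Diolch"),                 # dropped: empty English side
]

print(clean_pairs(sample))
```

Deduplicating after normalization, rather than before, means that pairs differing only in spacing or Unicode composition are treated as the same pair.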
Provider:
techiaith