jangel97/en-es-tatoeba
Source: Hugging Face. Updated: 2025-11-25. Indexed: 2025-11-30.
Download link:
https://hf-mirror.com/datasets/jangel97/en-es-tatoeba
Description:
This dataset contains 276,265 parallel Spanish ↔ English sentence pairs, intended for experiments in machine translation and sequence-to-sequence fine-tuning. The sentences come from short conversational contexts and represent everyday, informal language. This version has been filtered to short sentences (3–15 words), deduplicated, and formatted as TSV.
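The preprocessing described above (length filter, deduplication, TSV output) could look roughly like the sketch below. It is an illustration, not the dataset author's actual pipeline. The listing does not say how words are counted, whether the 3–15 limit applies to both languages, or what the column order is, so whitespace tokenization, applying the limit to both sides, and an English-then-Spanish column order are all assumptions here. The function names and sample sentences are hypothetical.

```python
import csv
import io


def word_count(sentence):
    # Assumption: "words" are whitespace-separated tokens.
    return len(sentence.split())


def filter_pairs(pairs, min_words=3, max_words=15):
    """Keep pairs whose sentences both fall within the word limits,
    dropping exact duplicate pairs while preserving order."""
    seen = set()
    kept = []
    for en, es in pairs:
        en, es = en.strip(), es.strip()
        # Assumption: the length limit applies to both sides of the pair.
        if not (min_words <= word_count(en) <= max_words):
            continue
        if not (min_words <= word_count(es) <= max_words):
            continue
        key = (en, es)
        if key in seen:
            continue
        seen.add(key)
        kept.append(key)
    return kept


def to_tsv(pairs):
    """Serialize pairs as tab-separated text, one pair per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(pairs)
    return buf.getvalue()


# Hypothetical sample data, not taken from the dataset.
sample = [
    ("I am very tired today.", "Estoy muy cansado hoy."),
    ("Hi!", "¡Hola!"),  # too short, dropped
    ("I am very tired today.", "Estoy muy cansado hoy."),  # duplicate, dropped
    ("Where is the train station?", "¿Dónde está la estación de tren?"),
]

kept = filter_pairs(sample)
tsv_text = to_tsv(kept)
```

Here `kept` holds the two surviving pairs and `tsv_text` is the TSV serialization, one pair per line. A real pipeline might also normalize case or punctuation before deduplicating; the listing gives no details on that.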
Provider:
jangel97



