gplsi/ES-VA_translation_test
Source: Hugging Face · Updated: 2025-12-19 · Indexed: 2024-06-12
Download link:
https://hf-mirror.com/datasets/gplsi/ES-VA_translation_test
Description:
This dataset was built from 200,000 sentences extracted from Common Voice. The sentences were rigorously filtered to keep those with the greatest linguistic richness, so that the result is useful for applications that need linguistic diversity and complexity. The selected sentences were then translated from Spanish into Catalan and Valencian by an expert in Catalan philology, ensuring linguistic quality and cultural accuracy. The dataset contains 1,960 examples, each with a unique ID, the original Spanish sentence, and the corresponding Valencian translation. It is intended to promote machine translation between Spanish and Valencian, to support multilingual NLP research, and to facilitate the development of translation systems for this language pair.
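The record layout described above (a unique ID, a Spanish source sentence, a Valencian translation) can be sanity-checked with a few lines of Python. The following is a minimal sketch, not the dataset authors' tooling: the field names `id`, `es`, and `va`, the `TranslationPair` class, and the `validate_pairs` helper are all hypothetical, and the sample rows are made up for illustration. Check the actual column names on the dataset page before adapting it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationPair:
    # Field names are assumptions for illustration; the real column
    # names in the dataset may differ.
    id: str
    es: str  # original Spanish sentence
    va: str  # Valencian translation


def validate_pairs(pairs):
    """Return a list of problems: duplicate IDs or empty text fields."""
    problems = []
    seen = set()
    for p in pairs:
        if p.id in seen:
            problems.append(f"duplicate id: {p.id}")
        seen.add(p.id)
        if not p.es.strip():
            problems.append(f"empty Spanish text: {p.id}")
        if not p.va.strip():
            problems.append(f"empty Valencian text: {p.id}")
    return problems


# Made-up sample rows, not taken from the dataset.
sample = [
    TranslationPair("0", "Buenos días.", "Bon dia."),
    TranslationPair("1", "Gracias.", "Gràcies."),
    TranslationPair("1", "Hasta luego.", ""),
]
problems = validate_pairs(sample)
```

Here the third sample row reuses an ID and has an empty translation, so `problems` holds two entries. A check like this can run on rows from any loader once they are mapped into `TranslationPair` objects.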
Provider:
gplsi
Original information summary
Dataset overview
License information
- License type: CC-BY-4.0



