FrancophonIA/International_Judicial_Cooperation_Civil_Matters
Source: Hugging Face. Updated: 2025-03-30. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/FrancophonIA/International_Judicial_Cooperation_Civil_Matters
Description:
This dataset was created within the European Language Resource Coordination (ELRC) project. It contains documents in Greek, English, and French relating to the 1980 Hague Convention, the 1980 Luxembourg European Convention, and EU Regulations EC 2201/2003 and EC 1393/2007. Document types include requests, certificates, summaries, and notices. Processing covered text extraction, document-level and sentence-level alignment, and generation of a TMX file for each language pair. In post-processing, filters were applied to refine the alignments so that they can be used to train machine translation systems. The dataset comprises 288 English-Greek translation units (TUs), 287 English-French TUs, and 358 French-Greek TUs.
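Since the translation units are distributed as TMX files, one per language pair, a short reader may help when preparing them for machine translation training. The following is a minimal sketch using only the Python standard library. The function name `read_tmx_pairs` and the inline `SAMPLE_TMX` snippet are hypothetical illustrations, not part of the dataset, and the actual file names in the repository are not assumed here.

```python
import io
import xml.etree.ElementTree as ET

# xml:lang is stored under the reserved XML namespace by ElementTree.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented two-segment example for illustration only; NOT taken from the dataset.
SAMPLE_TMX = b"""<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Request</seg></tuv><tuv xml:lang="fr"><seg>Demande</seg></tuv></tu>
</body></tmx>"""


def read_tmx_pairs(source, src_lang, tgt_lang):
    """Return (source, target) segment pairs from a TMX path or binary file object."""
    pairs = []
    root = ET.parse(source).getroot()
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; some older files use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the primary subtag, so "en-GB" is matched by "en".
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        # Skip units that lack either side of the requested pair.
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs


if __name__ == "__main__":
    for src, tgt in read_tmx_pairs(io.BytesIO(SAMPLE_TMX), "en", "fr"):
        print(src, "|", tgt)
```

To read a downloaded file instead, pass its path as the first argument. Language codes are matched on the primary subtag, which assumes the files tag Greek as `el`.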
Provider:
FrancophonIA



