PaMuCo: Parallel Multilingual Corpus: Polish, German and Danish parallel corpus
Source: SSH Open MarketPlace | Updated: 2025-01-30 | Indexed: 2025-02-01
Download link:
https://marketplace.sshopencloud.eu/dataset/rE09H6
Resource description:
Parallel Multilingual Corpus
The PaMuCo resource offers segments of film dialogues, fiction (novels and fairy tales) and EU legal texts in Polish, German and Danish, annotated by a team of neophilologists and linked by translation relationships. Texts are selected for genre diversity and for content that is socially, historically and culturally relevant internationally. The texts are placed in the MANTEL portal, which lets users assign linguistic equivalence relations to them: simple, merge / split, deletion, crossing, composition, compression, general paraphrase, reduction, expansion, substitution and disjunction. Corpora with translational relations are then uploaded to the PaMuCo portal, a search engine for these relations. It works with existing or new corpora and supports comprehensive searches for segments and relations in corpora, as well as comparisons.
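To make the idea of relation-labelled segment pairs concrete, here is a minimal Python sketch. It is not the PaMuCo or MANTEL data model or API: the `AlignedPair` record, its field names, the `find_by_relation` helper and the placeholder segments are all hypothetical. Only the relation names come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # Relation names as listed in the resource description.
    SIMPLE = "simple"
    MERGE_SPLIT = "merge / split"
    DELETION = "deletion"
    CROSSING = "crossing"
    COMPOSITION = "composition"
    COMPRESSION = "compression"
    GENERAL_PARAPHRASE = "general paraphrase"
    REDUCTION = "reduction"
    EXPANSION = "expansion"
    SUBSTITUTION = "substitution"
    DISJUNCTION = "disjunction"


@dataclass(frozen=True)
class AlignedPair:
    # Hypothetical record layout, assumed for illustration only.
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str
    relation: Relation


def find_by_relation(pairs, relation, source_lang=None, target_lang=None):
    """Return pairs carrying the given relation, optionally filtered by language."""
    return [
        p for p in pairs
        if p.relation is relation
        and (source_lang is None or p.source_lang == source_lang)
        and (target_lang is None or p.target_lang == target_lang)
    ]


# Placeholder segments, not real corpus content.
pairs = [
    AlignedPair("pl", "de", "<Polish segment 1>", "<German segment 1>", Relation.SIMPLE),
    AlignedPair("pl", "da", "<Polish segment 2>", "<Danish segment 2>", Relation.EXPANSION),
    AlignedPair("de", "da", "<German segment 3>", "<Danish segment 3>", Relation.SIMPLE),
]

# Search for "simple" relations whose source segment is Polish.
simple_from_polish = find_by_relation(pairs, Relation.SIMPLE, source_lang="pl")
```

The actual portal presumably offers richer queries. This sketch only illustrates filtering aligned segments by relation type and language pair.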
Created:
2025-01-30