HELFI: A Hebrew-Greek-Finnish Parallel Bible Corpus with Cross-Lingual Morpheme Alignment
Source: arXiv. Updated: 2020-03-17. Indexed: 2024-06-21.
Download link:
https://github.com/amikael/HELFI
Description:
HELFI is a parallel Bible corpus in Hebrew, Greek, and Finnish, created by the Department of Digital Humanities at the University of Helsinki to provide fine-grained cross-lingual morpheme alignment. It covers Finnish translations of 39 Hebrew books and 27 Greek books, 66 books in total. The dataset was built by first constructing an original database and then reconstructing it from free text versions and annotations, which ensures that the data is openly available. Intended uses span linguistics, theology, translation studies, and language engineering. The corpus targets fine-grained alignment in Bible translation and supports the evaluation and development of automatic alignment algorithms.
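As a rough illustration of how a gold-standard alignment can be used to evaluate an automatic aligner, the sketch below scores predicted links against reference links with precision, recall, and F1. The link representation (pairs of source and target morpheme indices), the function name `alignment_scores`, and the toy data are assumptions made for this example; they do not reflect the actual file format of HELFI.

```python
from typing import Set, Tuple

# Hypothetical representation: (source morpheme index, target morpheme index).
Link = Tuple[int, int]


def alignment_scores(gold: Set[Link], predicted: Set[Link]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted links against gold links."""
    correct = len(gold & predicted)  # links present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy, made-up links for one verse; not taken from HELFI.
gold = {(0, 0), (1, 2), (2, 1), (3, 3)}
predicted = {(0, 0), (1, 2), (2, 2)}

precision, recall, f1 = alignment_scores(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Treating links as a set keeps the comparison independent of link order. A real evaluation would first need to map the corpus's own identifiers onto such pairs.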
Provider:
Department of Digital Humanities, University of Helsinki
Created:
2020-03-17



