Error analysis of surname rendering in Finnish-to-English machine translation
Source: NIAID Data Ecosystem. Indexed: 2026-05-01
Download link:
https://zenodo.org/record/7936744
Description:
This surname list is part of the dataset I used in my master’s thesis, “Error analysis of surname rendering in Finnish-to-English machine translation”. The thesis is deposited at the Helsinki University Library.
The surname list was extracted from the Finnish News Agency Archive 2019–2021 (stt-fi-2019-2021-src). Permission to access the corpus can be obtained through the Language Bank of Finland. Even with permission, the corpus cannot be shared with third parties.
I cannot share the full dataset, as it contains sentences from the corpus.
I am sharing only the list of surnames that I analyse in my master’s thesis. The list contains 4,000 surnames extracted from the corpus, together with the identification numbers of the news articles from which they were extracted and other metadata. Anyone with permission to use the Finnish News Agency Archive 2019–2021 corpus can use the ID numbers to identify the news articles and recreate the dataset.
Reference:
STT. (2022). Finnish News Agency Archive 2019-2021, source [text corpus]. Kielipankki. Retrieved March 10, 2023, from http://urn.fi/urn:nbn:fi:lb-2022030202
Created:
2023-05-18



