
Error analysis of surname rendering in Finnish-to-English machine translation

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7936744
Description:
This surname list is part of the dataset that I used in my master's thesis, “Error analysis of surname rendering in Finnish-to-English machine translation”. The thesis is deposited at the Helsinki University Library. The surname list was extracted from The Finnish News Agency Archive 2019–2021 (stt-fi-2019-2021-src). Permission to access the corpus can be obtained through the Language Bank of Finland. The corpus cannot be shared with third parties even if permission is granted. I therefore cannot share the full dataset, as it contains sentences from the corpus; I am sharing only the list of surnames that I analyse in my master's thesis. The list contains 4,000 surnames extracted from the corpus, together with the identification numbers of the news articles from which the surnames were extracted and other metadata. Anyone with permission to use the Finnish News Agency Archive 2019–2021 corpus can use the ID numbers to identify the news articles and recreate the dataset. Reference: STT. (2022). Finnish News Agency Archive 2019-2021, source [text corpus]. Kielipankki. Retrieved March 10, 2023, from http://urn.fi/urn:nbn:fi:lb-2022030202
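The recreation step described above (matching the article ID numbers in the list against a licensed copy of the corpus) might look roughly like the following sketch. The listing does not document the file layout, so the column names `surname` and `article_id`, the CSV format, and the toy articles are all assumptions made purely for illustration.

```python
import csv
import io
from collections import defaultdict


def load_surname_index(csv_text, id_column="article_id", surname_column="surname"):
    """Map each article ID to the set of surnames listed for it.

    The column names are hypothetical; adjust them to the real file.
    """
    index = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        index[row[id_column]].add(row[surname_column])
    return index


def match_articles(index, articles):
    """Yield (article_id, surname, text) for listed articles found in `articles`.

    `articles` maps article ID to article text and stands in for whatever
    the licensed corpus provides. Plain substring matching is used, so
    inflected forms of a surname (common in Finnish) are not caught.
    """
    for article_id, text in articles.items():
        for surname in sorted(index.get(article_id, ())):
            if surname in text:
                yield article_id, surname, text


# Toy stand-ins for the surname list and the corpus (invented, not real data).
surname_csv = "surname,article_id\nVirtanen,a1\nKorhonen,a2\n"
articles = {
    "a1": "Virtanen voitti kisan.",
    "a2": "Korhonen kommentoi asiaa.",
    "a3": "Ei nimiä.",
}
matches = list(match_articles(load_surname_index(surname_csv), articles))
```

Indexing by article ID first keeps the lookup to a single pass over the corpus; extracting the individual sentences would need a further sentence-splitting step that is not shown here.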
Created:
2023-05-18