The Finnish Sub-corpus of the Newspaper and Periodical Corpus of the National Library of Finland version 2, Korp
Source: Mendeley Data. Updated: 2024-01-31. Indexed: 2024-06-29.
Download link:
https://etsin.fairdata.fi/dataset/1d51a6cc-3843-4b4b-b286-3c4449e198d9
Resource description:
This resource is available via Korp in Kielipankki – the Language Bank of Finland. The corpus consists of Finnish newspapers and magazines from 1771 to 2021, compiled by the National Library of Finland.

For this new version, the data of the previous version (Finnish and Swedish) was checked with the HeLI-OTS language identifier. Parts of texts that do not contain Finnish were removed from this corpus. Conversely, texts from the Swedish part of KLK that contain Finnish were added to this corpus.

The new version consists of text elements in which at least one sentence element was identified as being in Finnish, drawn from these three sources:
- KLK-fi, version 1 (http://urn.fi/urn:nbn:fi:lb-2016050302)
- KLK-sv, version 1 (http://urn.fi/urn:nbn:fi:lb-2016050301)
- new data from the National Library (not previously available in the Language Bank; it may cover any time period and was simply OCR'd more recently)

The text elements are enriched with a 'version_added' attribute, which identifies the source.

List of the newspapers and magazines that KLK-fi version 1 contains: https://www.kielipankki.fi/wp-content/uploads/klk-lehdet-fi.pdf
List of the newspapers and magazines that KLK-fi version 2 additionally contains: https://www.kielipankki.fi/wp-content/uploads/finclarin_2019.pdf
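The selection rule described above (keep a text element if at least one of its sentence elements was identified as Finnish) can be illustrated with a minimal sketch. Only the 'version_added' attribute name comes from the description; the XML layout, the `lang` attribute, the language codes, and the 'version_added' values below are hypothetical placeholders, not the corpus's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data. Apart from the 'version_added' attribute name,
# every element name, attribute name, and value here is an assumption.
SAMPLE = """
<corpus>
  <text title="A" version_added="source-1">
    <sentence lang="fin">Tämä on lause.</sentence>
    <sentence lang="swe">Detta är en mening.</sentence>
  </text>
  <text title="B" version_added="source-2">
    <sentence lang="swe">Bara svenska här.</sentence>
  </text>
  <text title="C" version_added="source-3">
    <sentence lang="fin">Uusi aineisto.</sentence>
  </text>
</corpus>
"""


def select_finnish_texts(xml_string, finnish_code="fin"):
    """Return text elements that have at least one sentence tagged as Finnish."""
    root = ET.fromstring(xml_string)
    kept = []
    for text in root.iter("text"):
        # Keep the whole text element if any of its sentences is Finnish.
        if any(s.get("lang") == finnish_code for s in text.iter("sentence")):
            kept.append(text)
    return kept


def count_by_source(texts):
    """Count kept text elements per 'version_added' value."""
    counts = {}
    for t in texts:
        src = t.get("version_added", "unknown")
        counts[src] = counts.get(src, 0) + 1
    return counts


kept = select_finnish_texts(SAMPLE)
print([t.get("title") for t in kept])
print(count_by_source(kept))
```

In this sketch, text "B" is dropped because none of its sentences is tagged as Finnish, while "A" is kept even though it also contains a Swedish sentence. The actual corpus is accessed through Korp, so this only mirrors the stated rule on toy data.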
Created:
2024-01-31



