IGC-Laws-21.05 (The Icelandic Gigaword Corpus: Law, bills and proposals)
Source: SSH Open MarketPlace | Updated: 2025-07-04 | Indexed: 2025-07-05
Download link:
https://marketplace.sshopencloud.eu/dataset/MXvjAM
Description:
IGC-Laws is a subcorpus of [The Icelandic Gigaword Corpus](http://hdl.handle.net/20.500.12537/192) (see also [CLARIN reference corpora](https://www.clarin.eu/resource-families/reference-corpora)). IGC-Laws contains 1) Icelandic laws, 2) explanatory reports and observations extracted from bills submitted to Althingi, and 3) parliamentary proposals and resolutions. The corpus comes in two formats: one contains the texts untokenized and untagged, while the other has been tokenized, PoS-tagged, and lemmatized.
The corpus is available for download from the CLARIN-IS repository.
Created:
2025-07-04



