CROCorp: Corpus of Parliamentary Debates in Croatia
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/6521372
Description:
The repository contains a cleaned and pre-processed corpus of parliamentary debates from the Croatian Parliament (Sabor). The corpus is accompanied by metadata on elected representatives and their political parties. It covers the period 2003-2020 (five complete terms) and contains over 500,000 speeches.
If you use the dataset, please cite: Mochtak, Michal, Josip Glaurdić, and Christophe Lesschaeve (2022): CROCorp: Corpus of Parliamentary Debates in Croatia (v1.1.1), https://doi.org/10.5281/zenodo.6521372.
v1.1.1 (latest version)
- added the concept DOI to codebooks (DOI was generated only after the repository was published)
v1.1.0
- improved coding of the dummy variable "moderator" (using a less error-prone algorithm for detecting the moderator role)
- fixed an issue with agenda points that were concatenated while preserving a unique web link
- recoded agenda point tags using a better ML model (transformer architecture)
v1.0.0
- originally posted in the GESIS repository (migrated to Zenodo due to limitations concerning the concept DOI)
Created:
2024-07-16



