CROCorp: Corpus of Parliamentary Debates in Croatia

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/6521372
Description:
The repository contains a cleaned and pre-processed corpus of parliamentary debates from the Croatian Parliament (Sabor). The corpus is accompanied by metadata on elected representatives and their political parties. It covers the period 2003-2020 (five complete terms) and contains over 500 thousand speeches.

If you use the dataset, please cite: Mochtak, Michal, Josip Glaurdić, and Christophe Lesschaeve (2022): CROCorp: Corpus of Parliamentary Debates in Croatia (v1.1.1), https://doi.org/10.5281/zenodo.6521372.

v1.1.1 (latest version)
- added the concept DOI to the codebooks (the DOI was generated only after the repository was published)

v1.1.0
- improved coding of the dummy variable "moderator" (using a less error-prone algorithm for detecting the moderator role)
- fixed an issue with agenda points that are concatenated while preserving a unique web link
- recoded agenda point tags using a better ML model (transformer architecture)

v1.0.0
- originally posted on the GESIS repository (migrated to ZENODO due to limitations concerning the concept DOI)
Created:
2024-07-16