A Greek Parliament Proceedings Dataset for Computational Linguistics and Political Analysis

Name: A Greek Parliament Proceedings Dataset for Computational Linguistics and Political Analysis
Creator: OpenDataLab
Published: 2026-05-31 12:30:37
License: No description available

OpenDataLab | Updated 2026-05-31 | Indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/A_Greek_Parliament_Proceedings_Dataset_etc

Resource description:

Large diachronic datasets of political discourse are scarce, especially for low-resource languages such as Greek. In this paper, we introduce a curated dataset of Greek parliamentary proceedings covering 1989 to 2020. It comprises more than 1 million speeches with extensive metadata, extracted from 5,355 parliamentary record files. We explain how the dataset was constructed and the challenges we had to overcome. The dataset can be used for computational linguistics and for political analysis, ideally combining the two. We present one such application, showing how the dataset can be used to study changes in word usage (i) over time and (ii) across major historical events and political parties, (iii) by evaluating and applying algorithms for detecting semantic shift.
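As a rough illustration of the first use case named above (tracking word usage over time), the sketch below computes the relative frequency of a word per year. It is only a minimal example under stated assumptions: the `speeches` records, their `(year, text)` layout, and the function name `relative_frequency_by_year` are hypothetical stand-ins, since this page does not describe the dataset's actual file format or column names, and naive whitespace tokenization replaces whatever preprocessing the authors used.

```python
from collections import Counter

# Hypothetical records: (year, speech text). The real dataset's schema is
# not described on this page, so these tuples are illustrative only.
speeches = [
    (1989, "η οικονομία και η ανάπτυξη"),
    (1989, "η παιδεία και η οικονομία"),
    (2010, "η κρίση και το μνημόνιο"),
    (2010, "η κρίση η κρίση και η οικονομία"),
]


def relative_frequency_by_year(records, word):
    """Return {year: occurrences of `word` / total tokens spoken that year}."""
    hits = Counter()
    totals = Counter()
    for year, text in records:
        # Naive whitespace tokenization; an assumption, not the authors' method.
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    # Skip years with no tokens to avoid dividing by zero.
    return {
        year: hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


crisis = relative_frequency_by_year(speeches, "κρίση")
for year, freq in crisis.items():
    print(year, round(freq, 3))
```

Normalizing by the total token count per year keeps years with more recorded speeches from dominating the comparison. Detecting semantic shift, as opposed to frequency change, would require a different technique that is not sketched here.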

Provided by:

OpenDataLab

Created:

2022-11-18

Collection summary

Dataset introduction