hannahsteinbach/bundestag-20
Source: Hugging Face. Updated 2026-03-26; indexed 2026-03-29.
Download link: https://hf-mirror.com/datasets/hannahsteinbach/bundestag-20
Description:
# Bundestag 20th Legislative Period Dataset
This dataset contains structured information about **verbal and nonverbal interjections** in the German Bundestag during the **20th legislative period**. It was compiled as part of a Master’s thesis analyzing how members of the **Traffic Light Coalition** (SPD, FDP, Greens) express alignment and conflict through interjections in parliamentary debates.
It includes **tokenized speech paragraphs, interjection types, speaker metadata, and contextual information**, enabling research on parliamentary discourse, alignment, and conflict patterns.
---
## Dataset Structure
* **File format:** CSV (`all_output_20.csv`)
* **Columns:**
| Category | Columns | Description |
| ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **General Debate** | `Filename`, `Period`, `Date`, `Speech #`, `Paragraph #` | Identifiers for each speech and paragraph, including file, legislative period, debate date, speech number, and paragraph ID. |
| **Speaker** | `Speaker`, `Role`, `Gender`, `Party` | Metadata about the person delivering the speech: name, role (optional), gender, and party. |
| **Paragraph** | `Paragraph`, `Paragraph tokens`, `paragraph_token_count`, `Quote` | The speech paragraph text, tokenized version, token count, and whether it is a quotation. |
| **Interjection** | `Interjection`, `Interjection Text`, `Verbal interjection`, `Nonverbal interjection`, `Interjection type`, `Interjection tokens`, `interjection_token_count`, `Directed at (Person)`, `Directed at (Party)` | Information about interjections, including type, verbal/nonverbal distinction, text, tokenized text, token count, and target (person or party). Empty for non-interjections. |
| **Interjector** | `Interjector`, `Interjector Gender`, `Interjector Party` | Metadata about who is reacting to a speech. Defaults to `"unknown"` if not identified. For interjections by multiple members (e.g., applause), `Interjector` can be `"all"` or `"some"`. Empty for non-interjections. |
| **Context** | `Agenda Item`, `Context`, `Supplementary Context`, `Previous Paragraphs`, `Previous Interjections` | Surrounding context for each paragraph or interjection. Includes agenda item, supplementary context, up to two previous paragraphs by the speaker, and prior interjections with type, text, and party info. |
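
Because the interjection and interjector columns are empty on plain speech rows, one way to split the two row types is to test for missing values. The sketch below is only an illustration under stated assumptions: the rows are made up to mimic the documented column layout, and it assumes pandas reads the empty fields as `NaN` (its default for empty CSV cells). It also separates individually named interjectors from the `"unknown"`, `"all"`, and `"some"` placeholders.

```python
import io
import pandas as pd

# Made-up rows that only mimic the documented column layout; not real data.
toy_csv = io.StringIO(
    "Speaker,Party,Interjector,Interjector Party\n"
    "Speaker A,SPD,,\n"
    "Speaker A,SPD,Member B,FDP\n"
    "Speaker A,SPD,unknown,CDU/CSU\n"
    "Speaker A,SPD,all,SPD\n"
)
df = pd.read_csv(toy_csv)

# Assumption: empty interjection fields are parsed as NaN by pandas.
is_interjection = df["Interjector"].notna()
interjections = df[is_interjection]
speech_paragraphs = df[~is_interjection]

# Keep only interjections attributed to an individually named member.
placeholders = {"unknown", "all", "some"}
named = interjections[~interjections["Interjector"].isin(placeholders)]

print(len(speech_paragraphs), len(interjections), len(named))
```

If the real file encodes empty fields differently (for example as empty strings after custom parsing), the `notna()` test would need to be adapted.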
---
## Dataset Size
* **Legislative Period:** 20th (Bundestag)
* **Total rows:** 721,900
* **Scope:** All speeches and interjections during the 20th legislative period (opening and procedural remarks at the start of each session were excluded)
---
## Usage
Load in Python:
```python
import pandas as pd
df = pd.read_csv("all_output_20.csv")
```
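With several hundred thousand rows and long text columns, it may help to load only the columns an analysis needs and to store repeated labels as categoricals. The following is a minimal sketch on made-up data: the column names come from the table above, while the date format and the `applause`/`heckle` values are placeholders, not values confirmed by the dataset.

```python
import io
import pandas as pd

# Made-up stand-in for the real file; values are placeholders.
toy_csv = io.StringIO(
    "Date,Party,Interjector Party,Interjection type,Paragraph\n"
    "2022-01-12,SPD,FDP,applause,Some text\n"
    "2022-01-12,FDP,SPD,heckle,More text\n"
)

wanted = ["Party", "Interjector Party", "Interjection type"]
subset = pd.read_csv(
    toy_csv,  # replace with "all_output_20.csv" for the real file
    usecols=wanted,  # skip the long text columns entirely
    dtype={name: "category" for name in wanted},  # compact storage for repeated labels
)

print(subset.dtypes)
```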
Example: Filter verbal interjections from SPD members (interjector party) towards FDP members (speaker party):
```python
spd_fdp_interjections = df[(df["Interjector Party"] == "SPD") & (df["Party"] == "FDP") & (df["Verbal interjection"] == True)]
```
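The same kind of filter can be extended to all party pairs at once. As a hedged sketch, the example below cross-tabulates interjector party against speaker party for verbal interjections. The rows are invented for illustration, and it assumes, like the filter above, that `Verbal interjection` holds boolean values.

```python
import pandas as pd

# Invented rows using the documented column names; not real data.
toy = pd.DataFrame({
    "Party": ["FDP", "FDP", "SPD", "FDP"],
    "Interjector Party": ["SPD", "SPD", "FDP", "SPD"],
    "Verbal interjection": [True, False, True, True],
})

# Keep verbal interjections only (assumes a boolean column).
verbal = toy[toy["Verbal interjection"] == True]

# Rows: party of the interjector; columns: party of the speaker.
counts = pd.crosstab(verbal["Interjector Party"], verbal["Party"])

print(counts)
```

Reading a cell as `counts.loc["SPD", "FDP"]` gives the number of verbal interjections by SPD members during speeches by FDP members in the toy data.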
---
## Citation
If you use this dataset, please cite:
**Text version:**
> Steinbach, Hannah (2026). *You Are Coalition Partners! — Alignment and Conflict in the Traffic Light Coalition Through Interjections and Policy Topics*. 20th Bundestag Dataset. HuggingFace Dataset. CC-BY-NC-4.0. [https://huggingface.co/datasets/hannahsteinbach/bundestag-20](https://huggingface.co/datasets/hannahsteinbach/bundestag-20)
**BibTeX version:**
```bibtex
@misc{steinbach2026coalition,
author = {Hannah Steinbach},
title = {You Are Coalition Partners! — Alignment and Conflict in the Traffic Light Coalition Through Interjections and Policy Topics},
year = {2026},
howpublished = {HuggingFace Dataset},
note = {20th Bundestag Dataset, CC-BY-NC-4.0, \url{https://huggingface.co/datasets/hannahsteinbach/bundestag-20}}
}
```
---
## Note on Usage
This dataset is based on files from [Bundestag Open Data](https://www.bundestag.de/services/opendata) and is publicly available under **CC-BY-NC-4.0** for **non-commercial research and educational purposes**.
For questions or clarifications about using the dataset, please **contact the author** at **[hannahsteinbach0312@gmail.com](mailto:hannahsteinbach0312@gmail.com)**.
Provided by:
hannahsteinbach



