UK Hansard 1935-2014
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7348818
Description:
The file "uk_hansard_1935_2014_BvW_2022.tsv" is a metadata-enriched version of the Hansard corpus. It is one of the outputs of Betto van Waarden's Marie Skłodowska-Curie project "Presenting Parliament: Parliamentarians’ visions of the communication and role of parliament within the mediated democracies of Britain, Belgium and the Netherlands, 1844-1995".
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 897761.
The project was carried out in collaboration with Mathias Johansson from the DigitalHistory@Lund platform, Lund University.
Base corpora
The attached file combines two preexisting versions of the UK Hansard corpus: Political Mashup, provided to us by Kaspar Beelen, and a .tsv file we received from Ludovic Rheault, which is itself based on the Political Mashup corpus.
This corpus is based on the Rheault version and is reproduced with his written permission. The original corpus contained the following columns:
cabinet
function
parliament
party
party_in_power
speaker_id
speech_id
speech_text
topic
year
To which we have added the following columns:
date
times_in_house
seniority
district
district_class
We also renamed the original 'year' column to 'date', since it contained the full ISO 8601-formatted date, and kept only the year in the 'year' column.
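The column split described above can be sketched as follows. This is a minimal illustration, not the project's actual processing code: the helper `split_date` and the sample row are hypothetical, and it assumes the original 'year' values are plain ISO 8601 calendar dates such as "1979-05-09".

```python
from datetime import date
from typing import Tuple


def split_date(iso_date: str) -> Tuple[str, int]:
    """Return the ISO 8601 date string and its year as an integer."""
    parsed = date.fromisoformat(iso_date)  # raises ValueError on malformed input
    return parsed.isoformat(), parsed.year


# Hypothetical row in the original layout, where 'year' holds a full date.
row = {"speech_id": "s1", "year": "1979-05-09"}

# After the split, 'date' keeps the full date and 'year' keeps only the year.
row["date"], row["year"] = split_date(row["year"])
```

Parsing the value rather than slicing the first four characters means malformed dates fail loudly instead of producing a wrong year.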
Career
The Political Mashup data records which offices each speaker has held. Using this information, we calculated how many terms a speaker had held office at the time of a speech (times_in_house). From this we derived a column called seniority: speakers who have been in office at most once before the current parliament/round are classified as junior, and everyone else as senior.
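One way to read the seniority rule is sketched below. It rests on assumptions that the description does not settle: terms are represented as hypothetical (start, end) date pairs, the count covers only terms that had already ended when the speech was given, and the threshold of one prior term is taken from the wording "at most one time before". The function names and the sample dates are invented for illustration.

```python
from datetime import date
from typing import List, Tuple

# Hypothetical representation of one term in office: (start, end).
Term = Tuple[date, date]


def prior_terms(terms: List[Term], speech_date: date) -> int:
    """Count terms that had already ended when the speech was given."""
    return sum(1 for _start, end in terms if end < speech_date)


def classify_seniority(times_before: int, junior_max: int = 1) -> str:
    """Label a speaker 'junior' with at most junior_max prior terms, else 'senior'."""
    return "junior" if times_before <= junior_max else "senior"


# Invented career with three terms, used only to exercise the rule.
terms = [
    (date(1964, 10, 15), date(1966, 3, 10)),
    (date(1966, 3, 31), date(1970, 5, 29)),
    (date(1970, 6, 18), date(1974, 2, 8)),
]

early = classify_seniority(prior_terms(terms, date(1966, 6, 1)))  # one prior term
late = classify_seniority(prior_terms(terms, date(1972, 1, 1)))   # two prior terms
```

If times_in_house in the published file also counts the current term, the threshold would need to shift by one; the column itself should be checked before relying on this reading.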
Districts
By cross-referencing the speaker_ids against the Political Mashup data, we extracted the district each speaker was representing. We then mapped these districts against a classification of UK districts (Baker, 2018, accessed on 2021-07-12) that uses six classes, which we reduced to three for simplicity:
Original            Reduced
Core City           city
Other City          city
Large Town          town
Medium Town         town
Small Town          village
Village or Smaller  village
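The reduction from six classes to three amounts to a simple lookup, sketched here. The dictionary follows the pairing as read from the table above; the names `REDUCED_CLASS` and `reduce_class` are hypothetical, and returning `None` for an unknown class is an illustrative choice rather than the project's documented behaviour.

```python
from typing import Optional

# Six original classes mapped to the three reduced classes.
REDUCED_CLASS = {
    "Core City": "city",
    "Other City": "city",
    "Large Town": "town",
    "Medium Town": "town",
    "Small Town": "village",
    "Village or Smaller": "village",
}


def reduce_class(original: Optional[str]) -> Optional[str]:
    """Return the reduced class, or None when the district has no known class."""
    if original is None:
        return None
    return REDUCED_CLASS.get(original.strip())


examples = [reduce_class(c) for c in ("Core City", "Medium Town", "Small Town", None)]
```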
Districts were mostly mapped to one of the three classes automatically, by matching district names against the list; districts that have split or merged over time were processed manually. Even so, not all districts could be resolved satisfactorily, which leaves 43,541 speeches without a district classification and gives a coverage of 98.7%.
class    count
town     1,253,330
city     1,184,605
village  908,606
N/A      43,541
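The coverage figure can be checked directly from the counts above. The short sketch below only restates that arithmetic; the variable names are illustrative.

```python
# Speech counts per district class, as listed in the table above.
counts = {
    "town": 1_253_330,
    "city": 1_184_605,
    "village": 908_606,
    "N/A": 43_541,
}

total = sum(counts.values())          # all speeches in the corpus
classified = total - counts["N/A"]    # speeches with a district class

# Share of speeches with a district classification, in percent.
coverage_pct = round(100 * classified / total, 1)
```

With these counts the total is 3,390,082 speeches, and the classified share rounds to the 98.7% stated in the description.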
Created:
2022-11-24



