
UK Hansard 1935-2014

Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7348818
Description:
UK Hansard 1935-2014

The "uk_hansard_1935_2014_BvW_2022.tsv" file is a metadata-enriched version of the Hansard corpus. It is one of the outputs of Betto van Waarden's Marie Skłodowska-Curie project "Presenting Parliament: Parliamentarians’ visions of the communication and role of parliament within the mediated democracies of Britain, Belgium and the Netherlands, 1844-1995". This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 897761. The project was carried out in collaboration with Mathias Johansson from the DigitalHistory@Lund platform, Lund University.

Base corpora

The attached file combines two preexisting versions of the UK Hansard corpus: Political Mashup, provided to us by Kaspar Beelen, and a .tsv file we received from Ludovic Rheault, which is itself based on the Political Mashup corpus. This corpus is based on the Rheault version and is reproduced with his written permission.

The original corpus contained the following columns:
cabinet function parliament party party_in_power speaker_id speech_id speech_text topic year

To which we have added the following columns:
date times_in_house seniority district district_class

We also renamed the original 'year' column to 'date', as it contained the ISO 8601-formatted date, and left only the year in the 'year' column.

Career

The Political Mashup data records which offices each speaker has held. Using this data, we calculated how many terms a speaker had held office at the time of a speech (times_in_house). From this we derived a column we call seniority: speakers who have been in office at most once before the current parliament/round are classified as junior; everyone else is classified as senior.

Districts

By cross-referencing the speaker_ids against the Political Mashup data, we extracted the district each speaker was representing. We then mapped these districts against a classification of UK districts (Baker, 2018, accessed on 2021-07-12) that uses six classes, which we reduced to three for simplicity's sake:

Original              Reduced
Core City             city
Other City            city
Large Town            town
Medium Town           town
Small Town            village
Village or Smaller    village

Most districts were mapped to one of the three classes automatically by matching district names against the list; districts that have split or merged over time were processed manually. Still, not all districts were resolved satisfactorily, leaving 43,541 speeches without a district classification, for a coverage of 98.7%.

class      count
town       1,253,330
city       1,184,605
village    908,606
N/A        43,541
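The derived columns described above can be illustrated with a minimal Python sketch. It is not the project's actual processing code. The function names (reduce_class, seniority, split_date) are hypothetical, the input to seniority is assumed to be the number of terms served before the current parliament (the description does not say whether times_in_house counts the current term), and the example date is arbitrary. The class mapping and the speech counts are copied from the tables above.

```python
from datetime import date

# Six-to-three class reduction, as listed in the description above.
CLASS_REDUCTION = {
    "Core City": "city",
    "Other City": "city",
    "Large Town": "town",
    "Medium Town": "town",
    "Small Town": "village",
    "Village or Smaller": "village",
}


def reduce_class(original_class):
    """Map a six-class label to its reduced class; None if unresolved."""
    return CLASS_REDUCTION.get(original_class)


def seniority(prior_terms):
    """Junior if in office at most once before the current parliament.

    Assumption: prior_terms excludes the current term.
    """
    return "junior" if prior_terms <= 1 else "senior"


def split_date(iso_date):
    """Return (date, year) from an ISO 8601 date string."""
    d = date.fromisoformat(iso_date)
    return d.isoformat(), d.year


# Speech counts per reduced class, copied from the table above.
counts = {
    "town": 1_253_330,
    "city": 1_184_605,
    "village": 908_606,
    "N/A": 43_541,
}
total = sum(counts.values())
coverage = 100 * (total - counts["N/A"]) / total

print(reduce_class("Medium Town"), seniority(1), split_date("1935-11-26"))
print(f"{total:,} speeches, {coverage:.1f}% classified")
```

Recomputing the coverage from the per-class counts gives a total of 3,390,082 speeches and reproduces the stated 98.7%.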
Created:
2022-11-24