ECPC Corpus (European Comparable and Parallel Corpora of Parliamentary Speeches Archive) – set 1
Source: DataCite Commons | Updated: 2022-06-01 | Indexed: 2025-04-15
Download link:
https://live.european-language-grid.eu/catalogue/corpus/906
Resource description:
The European Comparable and Parallel Corpora of Parliamentary Speeches Archive (ECPC), compiled at the Universitat Jaume I (Spain), is a collection of XML metatextually tagged corpora containing speeches from three European chambers (the European Parliament, the British House of Commons, and the Spanish Congreso de los Diputados). It is a bilingual, bidirectional written corpus in English and Spanish described by Zanettin (2012). This first set (ECPC_EP-05) consists of (1) a "clean" version in XML of the European Parliament's 2005 daily sessions; (2) a POS-tagged version of the 2005 daily sessions; and (3) a sentence-based aligned version of the 2005 daily sessions. In its raw format, ECPC_EP-05 contains 3,668,476 tokens/words (excluding tagging) in English distributed over 60 UTF-8 files and 3,993,867 tokens/words (excluding tagging) in Spanish distributed over 60 UTF-8 files. ECPC_EP-05 by María Calzada Pérez (as coordinator of the ECPC Research Group, Universitat Jaume I, Spain) is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC-BY-NC-SA 4.0: http://creativecommons.org/licenses/by-nc-sa/4.0). All corpora in the ECPC Archive have been funded by: Universitat Jaume I (UJI-B2017-25 P1·1B2012-64); Generalitat Valenciana (AICO/2017/082); Ministerio de Educación, Cultura y Deporte (FFI2008-01610/FILO; HUM2005-03756/FILO).
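The description reports token counts "excluding tagging". As a rough illustration of what such a count involves, the sketch below counts whitespace-delimited tokens in the text content of an XML string while ignoring the markup. The element and attribute names in the sample (`session`, `speech`, `speaker`) are hypothetical, and whitespace splitting is an assumption; the published ECPC figures may come from a different tokenizer and schema.

```python
import xml.etree.ElementTree as ET


def count_tokens(xml_text: str) -> int:
    """Count whitespace-delimited tokens in the text content, ignoring tags."""
    root = ET.fromstring(xml_text)
    # itertext() yields only character data, so element names and
    # attribute values are never counted as tokens.
    return sum(len(chunk.split()) for chunk in root.itertext())


# Hypothetical sample; the real ECPC element names may differ.
sample = (
    "<session>"
    "<speech speaker='A'>Thank you, Mr President.</speech>"
    "<speech speaker='B'>I agree.</speech>"
    "</session>"
)
print(count_tokens(sample))
```

For files on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`, and summing the result over the 60 files per language would give a corpus-level total under the same assumptions.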
Provider:
ELG
Created:
2022-06-01



