
Annotated Reference Corpus of Scottish Gaelic (ARCOSG)

Source: DataCite Commons. Updated: 2023-04-27. Indexed: 2025-04-17.
Download link:
https://datashare.ed.ac.uk/handle/10283/2011
Description:
A representative, tagged corpus of Scottish Gaelic, divided into 8 registers (4 spoken, 4 written) of approximately 10,000 words each. The corpus is distributed as individual .txt files. It was hand-tagged by Lamb, Arbuthnot and Naismith and separately verified by them. It uses Brown-format tag separators ('/', e.g. 'agus/Cc') and an annotation scheme derived from the Irish PAROLE tagset (see Uí Dhonnchadha, E. and van Genabith, J. 2006. A Part-of-Speech tagger for Irish using finite state morphology and constraint grammar disambiguation. Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), 2241-2244). The annotation scheme is described in a PDF included with the data: Lamb, W. and Naismith, S. (2014) Scottish Gaelic Part-of-Speech Annotation Guidelines. This work was funded by Bòrd na Gàidhlig and the Carnegie Trust for the Universities of Scotland.
Provider:
University of Edinburgh. School of Literatures, Languages and Cultures. Celtic and Scottish Studies
Created:
2016-05-25
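
The Brown-style 'token/tag' format mentioned in the description (e.g. 'agus/Cc') could be read with something like the minimal Python sketch below. It assumes that items within a line are separated by whitespace, which the listing does not state, so check the corpus files and the annotation guidelines before relying on it. The function name `parse_tagged_line` and the second sample item ('foo/X') are hypothetical and used only for illustration; only 'agus/Cc' comes from the description.

```python
def parse_tagged_line(line, sep="/"):
    """Split a line of whitespace-separated 'token/TAG' items into (token, tag) pairs.

    Assumption: items are whitespace-separated; this is not confirmed by the listing.
    """
    pairs = []
    for item in line.split():
        # rpartition splits on the LAST separator, so a token that itself
        # contains '/' keeps everything before the final separator.
        token, found, tag = item.rpartition(sep)
        if not found:
            # No separator present: keep the item as an untagged token.
            pairs.append((item, None))
        else:
            pairs.append((token, tag))
    return pairs


# 'agus/Cc' is the example given in the description; 'foo/X' is a made-up placeholder.
sample = "agus/Cc foo/X"
for token, tag in parse_tagged_line(sample):
    print(token, tag)
```

Splitting on the last separator rather than the first is a design choice made here so that tokens containing a slash are not truncated; whether such tokens occur in the corpus is not stated in the listing.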