
Map task corpus of heritage BCMS 1.0

Source: SSH Open MarketPlace. Updated 2025-04-02; indexed 2025-04-05.
Download link:
https://marketplace.sshopencloud.eu/dataset/IU19im
Resource description:
This corpus of heritage Bosnian/Croatian/Montenegrin/Serbian (BCMS) consists of elicited conversations (map tasks) by 29 second-generation BCMS speakers originating from different regions of the former Yugoslavia and living in German-speaking Switzerland. The corpus is suited for researchers of heritage BCMS, as well as students and teachers of BCMS living in the diaspora. The corpus contains 30 turn-aligned transcripts with an average length of 6 minutes. The texts are annotated with the [CLASSLA pipeline](https://github.com/clarinsi/classla) at the levels of lemmatisation, MULTEXT-East Version 6 morphosyntactic descriptions, and Universal Dependencies part-of-speech tags and morphological features. The corpus is enriched with corpus-specific annotations of truncations, elongations, stutters and code-switches. It is distributed in the source TEI format and in a derived vertical format. The corpus is available for download from CLARIN.SI and can be queried through the noSketchEngine and KonText concordancers.
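For readers who want to work with the derived vertical format, the following is a minimal sketch of reading such a file in Python. It assumes the common concordancer convention of one token per line with tab-separated attributes and XML-like structural tags on their own lines. The column order in `COLUMNS`, the function name `parse_vertical`, and the sample lines are illustrative assumptions, not taken from the corpus documentation, so check the distributed files before relying on them.

```python
from typing import Dict, List

# Hypothetical column layout; the real corpus may order or name these differently.
COLUMNS = ["form", "lemma", "msd", "upos", "feats"]


def parse_vertical(text: str, columns: List[str] = COLUMNS) -> List[List[Dict[str, str]]]:
    """Group token lines of vertical-format text into units.

    A line that starts with "<" and ends with ">" is treated as a structural
    tag. Every closing tag ends the current group of tokens. A literal "<"
    token followed by attributes ending in ">" would be misread as a tag;
    this sketch does not handle that case.
    """
    units: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("<") and line.endswith(">"):
            if line.startswith("</") and current:
                units.append(current)
                current = []
            continue  # opening and empty tags carry no tokens
        values = line.split("\t")
        current.append(dict(zip(columns, values)))
    if current:
        units.append(current)  # tokens after the last closing tag
    return units


# Invented sample for illustration only; not an excerpt from the corpus.
sample = (
    '<u who="A">\n'
    "idem\tići\tVmr1s\tVERB\tMood=Ind\n"
    "lijevo\tlijevo\tRgp\tADV\tDegree=Pos\n"
    "</u>\n"
)

units = parse_vertical(sample)
for token in units[0]:
    print(token["form"], token["lemma"], token["upos"])
```

Returning plain dictionaries keeps the sketch independent of any corpus library; the speaker and turn attributes on the structural tags are ignored here and would need separate handling.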
Created:
2025-04-02