The CLASSLA-Stanza model for morphosyntactic annotation of non-standard Serbian 2.1

SSH Open MarketPlace · Updated 2025-07-04 · Indexed 2025-07-05

Download link:

https://marketplace.sshopencloud.eu/dataset/QJj0Xl

Description:

This model for morphosyntactic annotation of non-standard Serbian was built with the [CLASSLA-Stanza tool](https://github.com/clarinsi/classla). It was trained on the [SETimes.SR training corpus](http://hdl.handle.net/11356/1200), combined with the [Serbian non-standard training corpus ReLDI-NormTagNER-sr](http://hdl.handle.net/11356/1794) and the [hr500k training corpus](http://hdl.handle.net/11356/1792), and uses the [CLARIN.SI-embed.sr word embeddings](http://hdl.handle.net/11356/1789). To handle text with missing diacritics, the corpora were additionally augmented by repeating parts of them with the diacritics removed. The model simultaneously produces UPOS, FEATS and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~92.64. The model is available for download from the CLARIN.SI repository.
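The diacritic augmentation mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' actual procedure: the character mapping (in particular `đ` to `dj`), the sampling `fraction`, and the helper names `strip_diacritics` and `augment` are all assumptions made for illustration, since the description does not say which parts of the corpora were repeated or how the diacritics were removed.

```python
import random

# Assumed mapping for Serbian Latin diacritics. The source does not specify
# the replacement scheme; "đ" -> "dj" is one common informal convention.
DIACRITIC_MAP = str.maketrans({
    "č": "c", "ć": "c", "š": "s", "ž": "z", "đ": "dj",
    "Č": "C", "Ć": "C", "Š": "S", "Ž": "Z", "Đ": "Dj",
})


def strip_diacritics(text):
    """Return text with Serbian Latin diacritics replaced by plain ASCII."""
    return text.translate(DIACRITIC_MAP)


def augment(sentences, fraction=0.5, seed=0):
    """Append dediacritized copies of a random sample of sentences.

    sentences: list of sentences, each a list of (form, tag) pairs.
    fraction:  hypothetical share of sentences to repeat without diacritics.
    The tags are copied unchanged, so the tagger sees the same labels for
    both the original and the dediacritized surface forms.
    """
    rng = random.Random(seed)
    sample_size = round(len(sentences) * fraction)
    sampled = rng.sample(sentences, sample_size)
    copies = [
        [(strip_diacritics(form), tag) for form, tag in sentence]
        for sentence in sampled
    ]
    return sentences + copies


if __name__ == "__main__":
    # Toy corpus with illustrative tags, not taken from the real corpora.
    corpus = [[("Đorđe", "PROPN"), ("šeta", "VERB")], [("kuća", "NOUN")]]
    for sentence in augment(corpus, fraction=1.0):
        print(sentence)
```

Keeping the original sentences alongside the dediacritized copies means a model trained this way would still see fully diacritized text, which matches the description's wording that parts of the corpora were repeated rather than replaced.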

Created:

2025-07-04
