
Austrian Baroque Corpus

SSH Open Marketplace | Updated 2025-08-26 | Indexed 2025-08-30
Download link:
https://marketplace.sshopencloud.eu/dataset/5eNsQ1
Description:
This historical corpus contains sermons from 1650 to 1750. For linguistic annotation, each token was automatically assigned a morphosyntactic word class using the [TreeTagger](https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/) software, with the 54-tag Stuttgart-Tübingen TagSet ([STTS](https://homepage.ruhr-uni-bochum.de/Stephen.Berman/Korpuslinguistik/Tagsets-STTS.html)) as the classification system. For lemmatization, each token was assigned a normalized base form, with the [Duden](http://www.duden.de/) and the [German dictionary by Jacob and Wilhelm Grimm](http://www.dwb.uni-trier.de/) serving as reference works. The part-of-speech tagging and lemmatization were then manually checked. The corpus is available through a dedicated concordancer.
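As a rough illustration of the annotation layers described above (token, STTS tag, lemma), the sketch below parses TreeTagger-style output, which is typically one token per line with three tab-separated columns. The sample sentence, the `Token` type, and the `parse_tagged_lines` helper are hypothetical, and the corpus's actual export format is not stated here, so this should be read as an assumption rather than a description of how the corpus is distributed.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    """One annotated token: surface form, STTS tag, lemma."""
    form: str
    pos: str
    lemma: str


def parse_tagged_lines(text: str) -> list[Token]:
    """Parse tab-separated 'form<TAB>tag<TAB>lemma' lines (assumed layout)."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed layout
        tokens.append(Token(*parts))
    return tokens


# Hypothetical sample, not taken from the corpus.
SAMPLE = (
    "Die\tART\tdie\n"
    "Predigt\tNN\tPredigt\n"
    "war\tVAFIN\tsein\n"
    "lang\tADJD\tlang\n"
    ".\t$.\t.\n"
)

tokens = parse_tagged_lines(SAMPLE)
pos_counts = Counter(t.pos for t in tokens)  # frequency of each STTS tag
```

Counting tags this way gives a quick view of the word-class distribution; the same `Token` records could equally be grouped by `lemma` to collapse historical spelling variants onto their normalized forms.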
Created:
2025-08-26