
Data and models for automatic scansion experiment Dutch Song Database

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/3243661
Description:
This release contains the data used in an experiment on automatic scansion for historical Dutch song texts. Aside from the data, two models are included in this release as well. One model is essential for running the code that is part of this experiment (model_s), while the other is an example of an acquired automatic scansion model (best_model).

Item descriptions:

meertens-meter-songs.zip → collection of 23,197 historic Dutch songs (xml-format). These files (and the gathered meta-data) stem from a collaboration project between the Dutch Song Database and the Digital Library for Dutch Literature. All files contain meta-data on the number of beats present in individual verse lines.
Snippet: Een Meysken op een Rivierken sadt, So schoon zy was, Sy sadt en verbeyde haer soete Lief, Int groene gras.

model_s → model used for syllabification and lexical stress assignment of (historic) Dutch words. The development of this model was part of a previous project.

stress_xml.zip → collection of 23,197 historic Dutch songs (xml-format). These are the same songs as in meertens-meter-songs, but with their individual words syllabified and annotated for lexical stress. The songs in this folder are used as input during the training process.
Snippet: een meys ken op een ri vier ken sadt

gold_scan.zip → 198 Dutch song files (xml-format). These files have been annotated by an expert for line stress.

eval_splits.zip → contains the splits made from gold_scan. These are the splits used in the automatic scansion experiment: a development set of 98 songs (used during training) and a test set of 99 songs (used for evaluating the best model after training).

best_model.zip → contains the files of an acquired model for automatic Dutch song scansion.
Created: 2020-01-24