
DWUG DE: Diachronic Word Usage Graphs for German

Source: Zenodo. Updated: 2025-04-23. Indexed: 2026-05-25.
Download link:
https://zenodo.org/doi/10.5281/zenodo.7386402
Description:
This data collection contains diachronic Word Usage Graphs (WUGs) for German. A description of the data format, code to process the data, and further datasets are available on the WUGsite.

Additional data is provided under misc/:

dwug_de_sense: a subset of DWUG DE was annotated with classical word sense definitions (DWUG DE Sense, see data/*/judgments_senses.csv). This folder provides clusterings and change scores for this subset. The clusters and statistics under maj_2 and maj_3 are derived from the sense annotation by removing instances on which fewer than 2 or 3 annotators, respectively, agree on the label. Note that the scores EARLIER, LATER and COMPARE are not calculated from human semantic proximity judgments (as for other WUG data sets), but from binary labels ('0' for different sense, '1' for same sense) derived from the sense annotation. The code deriving the clusters and change scores is in the WUG repository. The data set is described in more detail in Schlechtweg (2022).

See previous versions for additional testsets. More information on the provided data is in the paper referenced below.

Version: 2.2.0, 30.11.2022. Contains additional clusterings and change scores derived from DWUG DE Sense.

Important: Version 2.0.0 extends previous versions with one more annotation round and new clusterings.

Reference

Dominik Schlechtweg, Nina Tahmasebi, Simon Hengchen, Haim Dubossarsky, Barbara McGillivray. 2021. DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

Dominik Schlechtweg. 2022. Human and Computational Measurement of Lexical Semantic Change. PhD thesis. University of Stuttgart.
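The majority filtering and the binary same-sense labels described for dwug_de_sense can be illustrated with a minimal sketch. This is not the code from the WUG repository: the function names, the use identifiers (`u1`, ...) and the sense labels (`s1`, ...) are hypothetical, and the input is a toy in-memory dictionary rather than the actual layout of data/*/judgments_senses.csv, which is not assumed here.

```python
from collections import Counter
from itertools import combinations


def majority_label(labels, min_agree):
    """Return the most frequent label if at least `min_agree` annotators
    chose it, otherwise None. Ties at or above the threshold are resolved
    arbitrarily, which is a simplification of this sketch."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None


def binary_pair_labels(annotations, min_agree=2):
    """Drop uses without sufficient agreement, then label every remaining
    pair of uses with 1 (same sense) or 0 (different sense)."""
    kept = {}
    for use_id, labels in annotations.items():
        label = majority_label(labels, min_agree)
        if label is not None:
            kept[use_id] = label

    pairs = {}
    for a, b in combinations(sorted(kept), 2):
        pairs[(a, b)] = 1 if kept[a] == kept[b] else 0
    return kept, pairs


# Hypothetical toy annotation: use identifier -> sense labels from 3 annotators.
annotations = {
    "u1": ["s1", "s1", "s1"],
    "u2": ["s1", "s1", "s2"],
    "u3": ["s2", "s2", "s2"],
    "u4": ["s1", "s2", "s3"],
}

kept2, pairs2 = binary_pair_labels(annotations, min_agree=2)  # maj_2-style filter
kept3, pairs3 = binary_pair_labels(annotations, min_agree=3)  # maj_3-style filter
print(pairs2)
print(pairs3)
```

With the stricter threshold, fewer uses survive the filter, so fewer pairs receive a binary label. How the EARLIER, LATER and COMPARE scores are then computed from such labels is defined in the WUG repository and is not reproduced here.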
Provider:
Zenodo
Created:
2022-12-01