
WikiMorph: Learning to Decompose Words into Morphological Structures

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/5172856
Description:
WikiMorph is a JSON dataset of word breakdowns for English words. Each breakdown consists primarily of morphological compounds (drawn both from English and from the word's etymology), along with the definition associated with each compound. The dataset also includes other potentially useful fields, such as syllables and part-of-speech tags. It covers 355,782 unique words in 505,033 total entries. The data collection process is described in the paper "WikiMorph: Learning to Decompose Words into Morphological Structures", with some additional updates made after publication.

Example entry:

    {
      "Word": "abduction",
      "PoS": "Noun",
      "Syllables": [ "ab", "duc", "tion" ],
      "Definition": "The act of abducing or abducting; a drawing apart; the movement which separates a limb or other part from the axis, or middle line, of the body.",
      "Morphemes": [
        {
          "Affix": "abduct",
          "Language": "en",
          "PoS": "Verb",
          "Meaning": "To draw away, as a limb or other part, from the median axis of the body.",
          "Etymology Compounds": [
            { "Affix": "ab", "Language": "la", "Decoded": "ab", "PoS": null, "Meaning": "away" },
            { "Affix": "duco", "Language": "la", "Decoded": "duco", "PoS": null, "Meaning": "to lead" }
          ]
        },
        {
          "Affix": "-ion",
          "Language": "en",
          "PoS": "Suffix",
          "Meaning": "an action or process, or the result of an action or process",
          "Etymology Compounds": [
            { "Affix": "-iō", "Language": "la", "Decoded": "-io", "PoS": "Suffix", "Meaning": "Used to form abstract nouns from verbs." }
          ]
        }
      ]
    }
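To make the entry layout concrete, here is a minimal Python sketch that parses the "abduction" example from the description and lists each morpheme together with its etymology compounds. It rests on some assumptions: the field names are taken only from that one sample entry, the helper name `walk_morphemes` is hypothetical, and only the single level of nesting visible in the sample is handled. How entries are arranged in the downloaded file (one array, one object per line, or something else) is not stated here, so the sketch works on a single entry rather than on the file.

```python
import json

# Sample entry copied from the dataset description.
# The "Definition" field is omitted here for brevity.
SAMPLE_ENTRY = json.loads("""
{
  "Word": "abduction",
  "PoS": "Noun",
  "Syllables": ["ab", "duc", "tion"],
  "Morphemes": [
    {"Affix": "abduct", "Language": "en", "PoS": "Verb",
     "Meaning": "To draw away, as a limb or other part, from the median axis of the body.",
     "Etymology Compounds": [
       {"Affix": "ab", "Language": "la", "Decoded": "ab", "PoS": null, "Meaning": "away"},
       {"Affix": "duco", "Language": "la", "Decoded": "duco", "PoS": null, "Meaning": "to lead"}
     ]},
    {"Affix": "-ion", "Language": "en", "PoS": "Suffix",
     "Meaning": "an action or process, or the result of an action or process",
     "Etymology Compounds": [
       {"Affix": "-iō", "Language": "la", "Decoded": "-io", "PoS": "Suffix",
        "Meaning": "Used to form abstract nouns from verbs."}
     ]}
  ]
}
""")


def walk_morphemes(entry):
    """Yield (depth, affix, language, meaning) tuples for one entry.

    Depth 0 is a morpheme of the word itself; depth 1 is one of that
    morpheme's etymology compounds. Missing or null lists are treated
    as empty (an assumption; the sample always includes both keys).
    """
    for morpheme in entry.get("Morphemes") or []:
        yield 0, morpheme["Affix"], morpheme["Language"], morpheme["Meaning"]
        for part in morpheme.get("Etymology Compounds") or []:
            yield 1, part["Affix"], part["Language"], part["Meaning"]


rows = list(walk_morphemes(SAMPLE_ENTRY))
for depth, affix, language, meaning in rows:
    # Indent etymology compounds under the morpheme they belong to.
    print("  " * depth + f"{affix} [{language}]: {meaning}")
```

Flattening to tuples keeps the English-level morphemes and their Latin sources in one ordered sequence, which is convenient for filtering by language or building a lookup table. Note that "PoS" can be null in the sample, so any code that reads it should allow for a missing value.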
Created:
2021-08-10