Ge’ez Language Homophonic Words Sense Disambiguation (WSD) Dataset : Contextual Word Disambiguates of Ge’ez Language with Homophonic Using Machine Learning
Source: Mendeley Data (indexed 2026-05-21)
Download link:
https://data.mendeley.com/datasets/3m878pzf7j
Description:
This dataset has three columns: "text", "class", and "homophonic word". The text column contains 1010 text samples covering 10 pairs of homophonic Ge'ez words: ነስሐ and ነስኀ, ሐየሰ and ኀየሰ, ጸመመ and ፀመመ, ቀሰመ and ቀሠመ, ሐደመ and ሀደመ, መሀረ and መሐረ, ኀለየ and ሐለየ, መልአ and መልዐ, ፈጸመ and ፈፀመ, and ሠርሐ and ሰርሐ. Each sample is a sentence that contains one of these homophonic words. The class column gives the contextual meaning (sense) of the homophonic word in that sample. The homophonic word column gives the homophonic word identified in that sample. The contextual meanings were determined using the Akalewold Kiflie dictionary.
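The three-column layout described above can be checked and summarized with a short script. The following is a minimal sketch, assuming the data is exported as UTF-8 CSV with a header row matching the column names in the description; the file format, the helper names `load_samples` and `count_by`, and the placeholder rows in the demo are assumptions for illustration and are not taken from the dataset itself.

```python
import csv
import io
from collections import Counter

# Column names as given in the dataset description.
EXPECTED_COLUMNS = ["text", "class", "homophonic word"]


def load_samples(handle):
    """Read rows from an open CSV handle and verify the expected columns exist."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def count_by(rows, column):
    """Count how many samples share each value of the given column."""
    return Counter(row[column] for row in rows)


if __name__ == "__main__":
    # Placeholder rows (NOT real dataset content): the sentences and the
    # sense labels are invented stand-ins; only the two Ge'ez words come
    # from the description.
    demo = io.StringIO(
        "text,class,homophonic word\n"
        "placeholder sentence one,sense_a,ነስሐ\n"
        "placeholder sentence two,sense_b,ነስኀ\n"
    )
    rows = load_samples(demo)
    print(count_by(rows, "homophonic word"))
```

For a real file, the handle would typically be opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded file was saved. Counting by "homophonic word" or by "class" is a quick way to see how the 1010 samples are distributed before training a classifier.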
Created:
2024-03-13



