Data from: Using network approaches to enhance the analysis of cross-linguistic polysemies

Source: DataONE. Updated 2013-04-26. Indexed 2024-06-27.

Download link:

https://search.dataone.org/view/null

Description:

It has long been noted that cross-linguistically recurring polysemies can serve as an indicator of conceptual relations, and quite a few approaches to modeling and analyzing such data have been proposed in the recent past. Although, given the nature of the data, it seems natural to model and analyze them with network techniques, only a few approaches make explicit use of such techniques. In this paper, we show how the strict application of weighted network models helps to get more out of cross-linguistic polysemies than would be possible with approaches based only on item-to-item comparison. For our study we use a large dataset consisting of 1252 semantic items translated into 195 different languages covering 44 different language families. By analyzing the community structure of the network reconstructed from the data, we find that a majority of the concepts (68%) can be separated into 104 large communities consisting of five or more nodes. These large communities almost exclusively constitute meaningful groupings of concepts into conceptual fields. They provide a valid starting point for deeper analyses of various topics in historical semantics, such as cognate detection, etymological analysis, and semantic reconstruction.
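The idea of a weighted polysemy network can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the choice to weight an edge by the number of language families attesting it, and the grouping step (thresholding followed by connected components, a crude stand-in for a real community detection algorithm) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (language, family, word form, concept).
# Two concepts sharing one word form in a language count as one polysemy.
RECORDS = [
    ("lang_a", "fam_1", "w1", "ARM"),
    ("lang_a", "fam_1", "w1", "HAND"),
    ("lang_b", "fam_2", "w2", "ARM"),
    ("lang_b", "fam_2", "w2", "HAND"),
    ("lang_c", "fam_3", "w3", "HAND"),
    ("lang_c", "fam_3", "w3", "FINGER"),
    ("lang_a", "fam_1", "w4", "TREE"),
    ("lang_a", "fam_1", "w4", "WOOD"),
    ("lang_d", "fam_2", "w5", "TREE"),
    ("lang_d", "fam_2", "w5", "WOOD"),
]


def build_network(records):
    """Return {(concept_a, concept_b): weight}.

    Assumed weighting: the number of distinct language families in which
    the two concepts are expressed by the same word form.
    """
    concepts_by_word = defaultdict(set)
    for language, family, form, concept in records:
        concepts_by_word[(language, family, form)].add(concept)

    families_by_edge = defaultdict(set)
    for (_, family, _), concepts in concepts_by_word.items():
        for a, b in combinations(sorted(concepts), 2):
            families_by_edge[(a, b)].add(family)
    return {edge: len(fams) for edge, fams in families_by_edge.items()}


def communities(weights, min_weight):
    """Group concepts linked by edges with weight >= min_weight.

    Connected components on the thresholded graph; a simplification,
    not a full community detection algorithm.
    """
    adjacency = defaultdict(set)
    for (a, b), weight in weights.items():
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


weights = build_network(RECORDS)
groups = communities(weights, min_weight=2)
print(weights)
print(groups)
```

Counting families rather than individual languages is one way to keep closely related languages from inflating an edge. In this toy data the HAND/FINGER link is attested in a single family and so falls below the threshold.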

Created:

2013-04-26
