Constructing and Cleaning Identity Graphs in the LOD Cloud

Source: www.doi.org, indexed 2025-03-25
Download link:
https://www.doi.org/10.11922/sciencedb.j00104.00056
Description:
Five tables and six figures from the paper. Table 1 shows the evaluation of 200 owl:sameAs links, with 40 links randomly chosen from each error-degree range. The percentages in parentheses are calculated without considering the links evaluated as "can't tell". Table 2 presents the accuracy of the approach on the four manually evaluated samples, based on a threshold of 0.99. Links evaluated as "can't tell" by the judges are discarded. Table 3 shows precision, recall and accuracy for the Barack Obama identity set, based on two thresholds (0.99 and 0.4). Links evaluated as "can't tell" by the judges are discarded. Table 4 compares the original identity network closure and closures (b) and (c) with the gold standard. Table 5 shows the precision, recall and accuracy of the three closures.

Figure 1 is the workflow of identity network extraction, compaction and closure. Figure 2 depicts the distribution of identity set cardinality in Gim. The x-axis lists all 48,999,148 non-singleton identity sets. Figure 3 shows the error degree distribution of all owl:sameAs statements in LOD-a-lot. Figure 4 compares the original identity network and its transitive closure with the two newly constructed identity subgraphs. Figure 5 shows the 'Barack Obama' identity cluster. Figure 6 presents the community structure of the 'Barack Obama' identity cluster.
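The note on Table 1 about percentages that leave out "can't tell" judgments can be illustrated with a minimal Python sketch. The label names `same` and `different`, the helper `label_percentages`, and the sample counts are all hypothetical and are not taken from the dataset; only the "can't tell" label comes from the description above.

```python
from collections import Counter

CANT_TELL = "can't tell"  # label named in the description


def label_percentages(judgments, exclude_cant_tell=False):
    """Return {label: percentage} for a list of judgment labels.

    With exclude_cant_tell=True, "can't tell" judgments are dropped
    before the percentages are computed.
    """
    if exclude_cant_tell:
        judgments = [j for j in judgments if j != CANT_TELL]
    counts = Counter(judgments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Made-up sample of 40 judged links; "same"/"different" are assumed labels.
sample = ["same"] * 30 + ["different"] * 6 + [CANT_TELL] * 4

raw = label_percentages(sample)                               # over all 40 links
adjusted = label_percentages(sample, exclude_cant_tell=True)  # over the 36 decided links
```

Here `raw` divides by all 40 judgments, while `adjusted` divides by the 36 links that received a definite judgment, which is the kind of figure the description says is reported in parentheses.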

Provider:
www.doi.org