QAW: A Quality Assurance Workflow for Ontologies based on Detecting Semantic Regularities - Dataset on SNOMED

Name: QAW: A Quality Assurance Workflow for Ontologies based on Detecting Semantic Regularities - Dataset on SNOMED
Creator: figshare
Published: 2020-09-05 02:00:56
License: Not specified

Source: DataCite Commons. Updated: 2020-09-05. Indexed: 2024-07-25.

Download link:

https://figshare.com/articles/dataset/QAW_A_Quality_Assurance_Workflow_for_Ontologies_based_on_Detecting_Semantic_Regularities/701284/8

Description:

This page contains supplementary material for the ISWC 2013 submission titled "QAW: A Quality Assurance Workflow for Ontologies based on Detecting Semantic Regularities" by Eleni Mikroyannidi, Manuel Quesada-Martínez, Dmitry Tsarkov, Jesualdo Tomas Fernandez Breis, Robert Stevens, and Ignazio Palmisano. Regularities were detected with the Regularity Inspector for Ontologies (RIO) framework. The project is open source and can be downloaded from the links below. The fileset contains the data for the qualitative and quantitative analyses presented in the paper.

In the qualitative analysis, six lexical patterns (keywords) were processed: "chronic", "acute", "absent", "present", "right", and "left". For each of these, the reader can browse and download the following data:

1. XML files with the generic name "keyword"_syntactic_usage.xml, which contain the syntactic regularities detected in the referencing asserted axioms of the entities whose label contains the corresponding keyword. There should be 6 files in total (one per keyword).
2. XML files with the generic name "keyword"_semantic_usage.xml, which contain the semantic regularities detected in the referencing asserted axioms of the entities whose label contains the keyword.
3. Text files with the generic name "keyword"_syntactic_usage_readable.txt, which present the syntactic regularities in a more readable format with label rendering.
4. Text files with the generic name "keyword"_semantic_usage_readable.txt, which present the semantic regularities in a more readable format with label rendering.

In the quantitative analysis, 308 lexical patterns were processed, and the corresponding syntactic and semantic regularities were detected. The dataset available to the reader contains the following:

1. LexAnal_Snomed_2013_NoSensitiveAnalysis_Cov_0.1_100.0.xml, which contains all lexical patterns that could be detected in the January 2013 version of SNOMED-CT.
2. Snomed_2013_LexAnal_Full_0.1-0.4Perc_.xml, which contains all lexical patterns within the 0.1%-0.4% lexical pattern threshold.
3. syntactic_regularities_dataset.zip, which contains 308 XML files with the syntactic regularities generated by RIO.
4. semantic_regularities_dataset.zip, which contains 308 XML files with the semantic regularities generated by RIO.
5. quantitative_syntactic_regularity_analysis.csv, which contains the statistical analysis of the syntactic regularities for the 308 processed cases.
6. quantitative_semantic_regularity_analysis.csv, which contains the statistical analysis of the semantic regularities for the 308 processed cases.
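Readers who download the fileset may want to check that it is complete. The following is a minimal Python sketch, not part of the original material, that lists the file names implied by the qualitative naming scheme and counts the XML members of one of the quantitative archives. It assumes that the literal keyword replaces the quoted "keyword" placeholder in lowercase, without quotes, and that the files sit in a local directory of the reader's choosing; the helper names are hypothetical.

```python
import zipfile
from pathlib import Path

# The six keywords and four file types named in the description.
KEYWORDS = ["chronic", "acute", "absent", "present", "right", "left"]
SUFFIXES = [
    "_syntactic_usage.xml",
    "_semantic_usage.xml",
    "_syntactic_usage_readable.txt",
    "_semantic_usage_readable.txt",
]


def expected_qualitative_files(keywords=KEYWORDS, suffixes=SUFFIXES):
    """Return the names implied by the "keyword"_<suffix> scheme.

    Assumption: the keyword is substituted verbatim, without quotes.
    """
    return [f"{kw}{suffix}" for kw in keywords for suffix in suffixes]


def missing_files(directory, names):
    """List the expected names that are not present in `directory`."""
    root = Path(directory)
    return [name for name in names if not (root / name).is_file()]


def count_xml_entries(zip_path):
    """Count the .xml members of an archive, ignoring directory entries."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(
            1
            for info in archive.infolist()
            if not info.is_dir() and info.filename.lower().endswith(".xml")
        )


if __name__ == "__main__":
    names = expected_qualitative_files()
    print(len(names))  # 6 keywords x 4 file types = 24
```

If the description is accurate, count_xml_entries applied to syntactic_regularities_dataset.zip or semantic_regularities_dataset.zip should return 308, and missing_files should return an empty list for a directory holding all the qualitative files.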

Provider: figshare

Created: 2016-01-11
