
N3C-Formatted OMOP2OBO Mappings

Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7249165
Resource description:
OMOP2OBO Mappings - N3C OMOP to OBO Working Group

This repository stores OMOP2OBO mappings that have been processed and specifically formatted for use within the National COVID Cohort Collaborative (N3C) Enclave.

N3C OMOP to OBO Working Group: https://covid.cd2h.org/ontology

Accessing the N3C-Formatted Mappings

You can access the three OMOP2OBO HPO mapping files in the Enclave from the Knowledge store using the following link: https://unite.nih.gov/workspace/compass/view/ri.compass.main.folder.1719efcf-9a87-484f-9a67-be6a29598567

The mapping set includes three files, but you only need to merge the following two with existing data in the Enclave to create the concept sets:

  OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv
  OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv

The first file, OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv, contains columns for the OMOP concept ids and codes, and specifies whether the OMOP concept's descendants should be included when deriving the concept sets (defaults to FALSE). The second file, OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv, contains the mapping's label (i.e., the HPO CURIE and label, in the concept_set_id field) and its provenance/evidence (in the column named intention).

Creating Concept Sets

Merge these files on the column named codeset_id, then join the result with existing Enclave tables like concept and condition_occurrence to populate the actual concept sets. The name of each concept set is stored as a string in the concept_set_id column of the OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv file.
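The merge-then-join step described under Creating Concept Sets can be sketched in plain Python. This is only an illustration, not the working group's Enclave code: the inline CSV text is a cut-down stand-in for the two real files, the `concept_id` column name follows the column list published for the expression items file, and the `CONDITION_OCCURRENCE` rows, the shortened `intention` value, and all function names are hypothetical. Inside the Enclave the same logic would more likely be written as SQL or Spark joins.

```python
import csv
import io

# Cut-down stand-ins for the two Enclave CSV files (the real files have more columns).
EXPRESSION_ITEMS_CSV = """codeset_id,concept_id,code,codeSystem,includeDescendants
900000000,23868,69771008,SNOMED,False
"""
VERSION_CSV = """codeset_id,concept_set_id,intention
900000000,[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology,Mixed
"""
# Hypothetical rows shaped like the OMOP condition_occurrence table.
CONDITION_OCCURRENCE = [
    {"person_id": 1, "condition_concept_id": 23868},
    {"person_id": 2, "condition_concept_id": 999},
]


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_codeset_id(items, versions):
    """Inner-join expression items with version rows on codeset_id."""
    versions_by_id = {row["codeset_id"]: row for row in versions}
    merged = []
    for item in items:
        version = versions_by_id.get(item["codeset_id"])
        if version is not None:
            merged.append({**version, **item})
    return merged


def tag_conditions(conditions, merged):
    """Attach the concept set name to each condition row with a mapped concept.

    Descendant expansion is not attempted here (includeDescendants defaults to FALSE).
    """
    names_by_concept = {}
    for row in merged:
        names_by_concept.setdefault(int(row["concept_id"]), []).append(row["concept_set_id"])
    return [
        {**condition, "concept_set_id": name}
        for condition in conditions
        for name in names_by_concept.get(condition["condition_concept_id"], [])
    ]


merged = merge_on_codeset_id(read_rows(EXPRESSION_ITEMS_CSV), read_rows(VERSION_CSV))
tagged = tag_conditions(CONDITION_OCCURRENCE, merged)
print(tagged)
```

Only the condition row whose concept id appears in the mapping (23868) is kept. A concept that belongs to several concept sets would yield one output row per set.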
Obtaining the HPO CURIE and label requires applying a regex to the concept_set_id column. This is not ideal, but it is currently the best approach given the fields available in the Enclave.

An example mapping is shown below (highlighting some of the most useful columns):

  codeset_id: 900000000
  concept_set_id: [OMOP2OBO] hp_0002031-abnormal_esophagus_morphology
  concept: 23868
  code: 69771008
  codeSystem: SNOMED
  includeDescendants: False
  intention: Mixed - This mapping was created using the OMOP2OBO mapping algorithm (https://github.com/callahantiff/OMOP2OBO). The Mapping Category and Evidence supporting the mappings are provided below, by OMOP concept:
    23868
    *******
    Mapping Category: Automatic Exact - Concept
    ------------------------------------------------
    Mapping Provenance
    ------------------
    OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_69771008 | OBO_DbXref-OMOP_CONCEPT_SOURCE_CODE:snomed_69771008 | CONCEPT_SIMILARITY:HP_0002031_0.713

Release Notes - v2.0.0

Preparation

In order to import data into the Enclave, the following items are needed:

  Obtain API Token, which will be included in the authorization header (stored as GitHub Secret)
  Obtain username hash from the Enclave
  OMOP2OBO Mappings (v1.5.0)

Data

  Concept Set Container (concept_set_container): CreateNewConceptSet
  Concept Set Version (code_sets): CreateNewDraftOMOPConceptSetVersion
  Concept Set Expression Items (concept_set_version_item): addCodeAsVersionExpression

Script

  n3c_mapping_conversion.py

Generated Output

The codeset_id values must be self-generated (ideally, from a conserved range) before beginning any of the API steps. The current list of assigned identifiers is stored in the file named omop2obo_enclave_codeset_id_dict_v2.0.0.json. Note that, in order to accommodate the 1:Many mappings, the codeset ids were regenerated and, rather than being mapped to HPO concepts, they are now mapped to SNOMED-CT concepts. This creates a cleaner mapping and will easily scale to future mapping builds.
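The regex needed to recover the HPO CURIE and label from concept_set_id might look like the sketch below. It assumes every value follows the shape of the single example above (`[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology`). The pattern, the function name `parse_concept_set_id`, and the choice to uppercase the prefix and replace underscores with spaces are illustrative assumptions, not part of the released files.

```python
import re

# Assumed shape: "[OMOP2OBO] <prefix>_<digits>-<label_with_underscores>"
CONCEPT_SET_ID_PATTERN = re.compile(
    r"^\[OMOP2OBO\]\s+(?P<prefix>[a-z]+)_(?P<local_id>\d+)-(?P<label>.+)$"
)


def parse_concept_set_id(value):
    """Return (curie, label) for a concept_set_id string, or None if it does not match."""
    match = CONCEPT_SET_ID_PATTERN.match(value.strip())
    if match is None:
        return None
    curie = f"{match['prefix'].upper()}:{match['local_id']}"
    label = match["label"].replace("_", " ")
    return curie, label


print(parse_concept_set_id("[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology"))
# prints ('HP:0002031', 'abnormal esophagus morphology')
```

Returning None for non-matching values makes it easy to spot rows whose concept_set_id departs from the assumed format instead of silently producing a wrong CURIE.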
To be consistent with OMOP tools, specifically Atlas, we have also created Atlas-formatted JSON files for each mapping, which are stored in the zipped directory named atlas_json_files_v2.0.0.zip. As mentioned above, to enable the representation of 1:Many mappings, the files are no longer named after HPO concepts; they are now named with the OMOP concept_id and label. Additional fields have also been added within the JSON files for the HPO ids, labels, mapping category, mapping logic, and mapping evidence.

File 1: concept_set_container

Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_container.csv
Columns: concept_set_id, concept_set_name, intention, assigned_informatician, assigned_sme, project_id, status, stage, n3c_reviewer, alias, archived, created_by, created_at

File 2: concept_set_expression_items

Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv
Columns: codeset_id, concept_id, code, codeSystem, ontology_id, ontology_label, mapping_category, mapping_logic, mapping_evidence, isExcluded, includeDescendants, includeMapped, item_id, annotation, created_by, created_at

File 3: concept_set_version

Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv
Columns: codeset_id, concept_set_id, concept_set_version_title, project, source_application, source_application_version, created_at, atlas_json, most_recent_version, comments, intention, limitations, issues, update_message, status, has_review, reviewed_by, created_by, provenance, atlas_json_resource_url, parent_version_id, is_draft

Generated Output:

  OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_container.csv
  OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv
  OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv
  atlas_json_files_v2.0.0.zip
  omop2obo_enclave_codeset_id_dict_v2.0.0.json
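Before merging, it can help to confirm that a downloaded CSV carries the columns listed for it above. The following is a minimal sketch of such a check. The column names are copied from the three lists in this description, while the dictionary `EXPECTED_COLUMNS`, the function `missing_columns`, and the truncated `sample` header are hypothetical and only serve the illustration.

```python
import csv
import io

# Column names as listed for each generated file in this description.
EXPECTED_COLUMNS = {
    "concept_set_container": [
        "concept_set_id", "concept_set_name", "intention", "assigned_informatician",
        "assigned_sme", "project_id", "status", "stage", "n3c_reviewer", "alias",
        "archived", "created_by", "created_at",
    ],
    "concept_set_expression_items": [
        "codeset_id", "concept_id", "code", "codeSystem", "ontology_id",
        "ontology_label", "mapping_category", "mapping_logic", "mapping_evidence",
        "isExcluded", "includeDescendants", "includeMapped", "item_id", "annotation",
        "created_by", "created_at",
    ],
    "concept_set_version": [
        "codeset_id", "concept_set_id", "concept_set_version_title", "project",
        "source_application", "source_application_version", "created_at", "atlas_json",
        "most_recent_version", "comments", "intention", "limitations", "issues",
        "update_message", "status", "has_review", "reviewed_by", "created_by",
        "provenance", "atlas_json_resource_url", "parent_version_id", "is_draft",
    ],
}


def missing_columns(file_key, csv_text):
    """List the expected columns that are absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = set(header)
    return [name for name in EXPECTED_COLUMNS[file_key] if name not in present]


# Deliberately truncated header, to show what a failed check reports.
sample = "codeset_id,concept_id,code,codeSystem\n900000000,23868,69771008,SNOMED\n"
print(missing_columns("concept_set_expression_items", sample))
```

An empty list means every listed column is present. The check ignores column order and extra columns, since neither is specified here.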
Created:
2022-10-27