A2H_Clinical_Data
Zenodo, updated 2026-03-31, indexed 2026-05-26
Download link:
https://zenodo.org/doi/10.5281/zenodo.19351890
Resource description:
General
The two files in this resource are used in the analysis of animal-to-human translation for the project Preclinical_DrugDisease_Translation_Pipeline.
Raw AACT Snapshot
raw_aact/mv_interventional_drug_studies_20260302.csv
Tabular snapshot of interventional drug-related studies derived from the AACT / ClinicalTrials.gov relational database. The file was generated from a materialized view built on a database snapshot dated 1 December 2025. It includes one row per nct_id for studies with study_type = 'INTERVENTIONAL' and at least one intervention of type DRUG, DIETARY_SUPPLEMENT, BIOLOGICAL, COMBINATION_PRODUCT, GENETIC, or OTHER. The table combines study-level metadata from ctgov.studies, brief summaries from ctgov.brief_summaries, aggregated intervention names and types from ctgov.interventions, and aggregated condition names from ctgov.conditions.
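The selection rule described above can be restated as a small Python predicate. This is only an illustrative sketch: the actual filtering was done in SQL when the materialized view was built, and the names ALLOWED_TYPES and is_selected are hypothetical, not part of the dataset or the pipeline.

```python
# Intervention types listed in the file description.
ALLOWED_TYPES = {
    "DRUG",
    "DIETARY_SUPPLEMENT",
    "BIOLOGICAL",
    "COMBINATION_PRODUCT",
    "GENETIC",
    "OTHER",
}


def is_selected(study_type, intervention_types):
    """Return True if a study matches the described inclusion rule.

    study_type: the study's type as a string.
    intervention_types: iterable of intervention type strings for that study.
    """
    if study_type != "INTERVENTIONAL":
        return False
    # At least one intervention must be of an allowed type.
    return any(t in ALLOWED_TYPES for t in intervention_types)


# Made-up inputs for illustration only.
print(is_selected("INTERVENTIONAL", ["DEVICE", "DRUG"]))
print(is_selected("OBSERVATIONAL", ["DRUG"]))
```

A study with several interventions qualifies as soon as one of them has an allowed type, which matches the "at least one intervention" wording.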
Included columns:
nct_id
brief_title
study_official_title
start_date
completion_date
study_first_submitted_date
phase
overall_status
brief_summary
intervention_names
intervention_types
condition_names
Notes:
intervention_names, intervention_types, and condition_names are aggregated as pipe-separated strings (" | ").
Only studies matching the SQL selection criteria are included.
This file is intended as the structured trial metadata input for downstream entity-linking and integration steps.
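Because three columns are stored as pipe-separated strings, a reader will usually want to expand them into lists on load. The following is a minimal sketch using only the Python standard library. The helper names split_pipe and read_trials are hypothetical, and the sample row is made up for illustration; it is not taken from the file.

```python
import csv
import io

# Columns documented as aggregated, pipe-separated strings.
MULTI_VALUE_COLUMNS = ("intervention_names", "intervention_types", "condition_names")


def split_pipe(value):
    """Split a pipe-separated string into a list of trimmed, non-empty items."""
    if value is None:
        return []
    return [item.strip() for item in value.split("|") if item.strip()]


def read_trials(handle):
    """Yield one dict per CSV row, with the aggregated columns expanded to lists."""
    for row in csv.DictReader(handle):
        for column in MULTI_VALUE_COLUMNS:
            row[column] = split_pipe(row.get(column))
        yield row


# Made-up sample data; a real run would open the CSV file instead.
SAMPLE = io.StringIO(
    "nct_id,brief_title,intervention_names,intervention_types,condition_names\n"
    "NCT00000000,Example trial,Drug A | Placebo,DRUG | OTHER,Condition X | Condition Y\n"
)
trials = list(read_trials(SAMPLE))
print(trials[0]["intervention_names"])
```

Splitting on the bare pipe character and trimming whitespace handles both the " | " form used in this file and a plain "|" separator. To read the real file, pass an open file handle (for example one opened with newline="" and encoding="utf-8", the encoding being an assumption) in place of SAMPLE.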
Linked NER Drug and Disease Entities
linked_to_ontologies/entities_drug_disease_clin.csv
Normalized drug and disease ontology annotations applied to the NER results. Disease concepts are mapped to MONDO, while drug concepts are mapped to UMLS CUIs. Multiple entities are represented as pipe-separated values (|).
Disease / condition mapping (MONDO)
merged_condition_names: Original condition names aggregated from the trial record
disease_mondo_termid: Assigned MONDO identifier
disease_mondo_term_norm: Normalized MONDO label
disease_term_mondo_clean: Cleaned disease string used for matching
disease_termid_mondo_clean: MONDO ID after cleaning step
nearest_dataset_parent_mondo: Closest parent MONDO concept in the reference dataset (-1 if none)
nearest_dataset_parent_label: Label of the nearest parent concept
merged_mondo_termid: Final merged MONDO identifier(s)
merged_mondo_label: Final merged MONDO label(s)
Drug / intervention mapping (UMLS)
ner_predicted_drugs: Drug names extracted via NER
linkbert_umls_drugs: Drug names after normalization / linking model
drug_umls_termid: UMLS concept identifiers (CUIs)
drug_umls_term_norm: Normalized UMLS labels
nearest_dataset_parent_umls: Closest parent UMLS concept (-1 if none)
nearest_dataset_parent_umls_label: Label of the parent concept
merged_umls_termid: Final merged UMLS identifier(s)
merged_umls_label: Final merged UMLS label(s)
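When reading the linked-entity columns, two conventions matter: multi-valued fields are pipe-separated, and the nearest-parent columns use -1 to mean "no parent". The sketch below shows one way to handle both with the standard library. It rests on two assumptions that the description does not confirm: that the sentinel appears in the CSV as the literal text "-1", and that the identifier and label columns list their entries in the same order. All helper names and the sample values are hypothetical placeholders.

```python
import csv
import io

NO_PARENT = "-1"  # assumption: the "-1 if none" sentinel is stored as this text


def split_pipe(value):
    """Split a pipe-separated string into a list of trimmed, non-empty items."""
    return [item.strip() for item in (value or "").split("|") if item.strip()]


def parent_or_none(value):
    """Map the -1 sentinel (or an empty cell) to None, else return the identifier."""
    text = (value or "").strip()
    return None if text in ("", NO_PARENT) else text


def pair_ids_with_labels(ids, labels):
    """Pair identifiers with labels by position (assumes both columns are aligned)."""
    id_list, label_list = split_pipe(ids), split_pipe(labels)
    if len(id_list) != len(label_list):
        raise ValueError("identifier and label counts differ")
    return list(zip(id_list, label_list))


# Placeholder row for illustration only; values are not taken from the file.
SAMPLE = io.StringIO(
    "merged_mondo_termid,merged_mondo_label,nearest_dataset_parent_mondo\n"
    "MONDO:0000001|MONDO:0000002,label one|label two,-1\n"
)
row = next(csv.DictReader(SAMPLE))
pairs = pair_ids_with_labels(row["merged_mondo_termid"], row["merged_mondo_label"])
parent = parent_or_none(row["nearest_dataset_parent_mondo"])
print(pairs, parent)
```

The same helpers apply to the UMLS columns (merged_umls_termid, merged_umls_label, nearest_dataset_parent_umls). Raising an error on mismatched counts is a deliberate choice here: it surfaces rows where the alignment assumption does not hold instead of silently pairing the wrong label with an identifier.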
Provider:
Zenodo
Created:
2026-03-31



