De-identified EHR for Obstetric and Maternal Care Dataset
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/cxjrb79y63
Description:
The dataset contains anonymized clinical information standardized using the OMOP Common Data Model version 5.4, corresponding to the clinical profiles of 23,879 women aged 18 to 47 years who received care in the Gynecology and Obstetrics Department of the Clínica Universitaria Bolivariana (CUB) between 2015 and 2017. Each patient has at least one visit related to pregnancy, childbirth, and/or the postpartum period, documented through both structured data and unstructured clinical notes.
In total, the database comprises:
200,070 clinical observations,
143,385 documented clinical conditions,
2,494,424 clinical measurements,
3,776,555 clinical notes in natural language,
all of which collectively form a comprehensive and cohesive set of obstetric clinical events linked through primary and foreign keys.
The dataset consists of seven files: six CSV tables and one .ipynb notebook containing the exploratory data analysis (EDA). Each CSV file adheres to the OMOP v5.4 Common Data Model, ensuring interoperability and enabling comparative analyses with other standardized healthcare databases. Formal descriptions of tables and attributes are available in the official OHDSI documentation, including the Concept table, which references international medical terminologies such as SNOMED CT, LOINC, and RxNorm.
The files included are:
01_Person: contains the person_id and demographic information.
02_Visit: documents each medical visit (89,893 records).
03_Observation: includes coded clinical observations (200,070 records).
04_Condition: records diagnoses and reasons for consultation (143,385 records).
05_Measurement: stores quantitative and qualitative clinical measurements (2,494,424 records).
06_Note: contains clinical notes in natural language (3,776,555 records).
07_Notebook: an .ipynb file with the exploratory data analysis.
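A downloaded copy can be checked against the record counts listed above. The following is a minimal sketch using only the Python standard library. The helper names (`count_records`, `check_table`) and the sample rows are hypothetical, and the column names in the sample are taken from the OMOP CDM specification rather than verified against the published files. Rows are counted with `csv.reader` rather than by counting lines, since free-text notes may contain line breaks inside quoted fields.

```python
import csv
import io

# Record counts as stated in the dataset description.
EXPECTED_ROWS = {
    "02_Visit": 89_893,
    "03_Observation": 200_070,
    "04_Condition": 143_385,
    "05_Measurement": 2_494_424,
    "06_Note": 3_776_555,
}


def count_records(handle):
    """Count data rows (header excluded) in an open CSV handle."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row, if any
    return sum(1 for _ in reader)


def check_table(name, handle):
    """Return True if the table's row count matches the documented count."""
    return count_records(handle) == EXPECTED_ROWS[name]


# Tiny in-memory sample (made-up rows): the first note spans two lines,
# yet it is still counted as a single record.
sample = io.StringIO(
    'note_id,person_id,note_text\n'
    '1,1,"line one\nline two"\n'
    '2,1,"short note"\n'
)
sample_count = count_records(sample)
```

For the real files, the handle would come from something like `open(path, newline="", encoding="utf-8")`; the exact file names and the encoding are assumptions to confirm against the download.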
A hierarchical relational structure underlies the dataset: each 01_Person is associated with at least one 02_Visit, which in turn links to 03_Observation, 04_Condition, 05_Measurement, and 06_Note. The person_id field functions as the central key enabling reconstruction of full clinical cases across all tables.
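The key relationships described above can be illustrated with a small sketch that rebuilds one person's case from the visit and condition tables. It assumes the CSV columns use the standard OMOP CDM v5.4 field names (`person_id`, `visit_occurrence_id`, `visit_start_date`, `condition_concept_id`), which has not been verified against the published files. The sample rows and concept IDs are invented placeholders, and `build_case` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Toy stand-ins for 02_Visit and 04_Condition. All values are made up.
VISIT_CSV = """visit_occurrence_id,person_id,visit_start_date
10,1,2015-03-02
11,1,2015-09-14
12,2,2016-01-20
"""

CONDITION_CSV = """condition_occurrence_id,person_id,visit_occurrence_id,condition_concept_id
100,1,10,111
101,1,11,222
102,2,12,333
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def build_case(person_id, visits, conditions):
    """Collect one person's visits, each with the conditions recorded at it."""
    by_visit = defaultdict(list)
    for row in conditions:
        if row["person_id"] == person_id:
            by_visit[row["visit_occurrence_id"]].append(row["condition_concept_id"])
    return [
        {
            "visit_occurrence_id": v["visit_occurrence_id"],
            "visit_start_date": v["visit_start_date"],
            "condition_concept_ids": by_visit.get(v["visit_occurrence_id"], []),
        }
        for v in visits
        if v["person_id"] == person_id
    ]


# csv.DictReader yields strings, so IDs are compared as strings here.
case = build_case("1", read_rows(VISIT_CSV), read_rows(CONDITION_CSV))
```

The same pattern would extend to 03_Observation, 05_Measurement, and 06_Note by grouping their rows on the same two keys. At the full dataset's size, a database or a dataframe library would likely be a better fit than in-memory lists.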
Because the dataset is fully standardized to OMOP v5.4, covering both structured data and unstructured clinical notes, it can serve as a source for real-world evidence generation in research, clinical surveillance, maternal health analytics, and outcomes-driven medicine.
Created:
2025-12-05



