
Antibiotic Resistance Microbiology Dataset (ARMD): A resource for antimicrobial resistance from EHRs

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.jq2bvq8kp
Description:
The Antibiotic Resistance Microbiology Dataset (ARMD) is a structured, de-identified resource developed from electronic health records (EHR) at Stanford Healthcare. It provides a comprehensive overview of microbiological cultures, including urine, respiratory, and blood cultures. The dataset includes 283,715 unique adult patients and features detailed information on culture results, identified organisms, antibiotic susceptibility, and associated demographic and clinical data. It was constructed through a multi-step process designed to enhance data quality and relevance. By enabling the study of antimicrobial resistance patterns and supporting antimicrobial stewardship efforts, ARMD offers a valuable resource for researchers and clinicians seeking to improve the management of infectious diseases and combat the growing threat of antimicrobial resistance.

Methods

Cohort Selection

ARMD was created using de-identified EHR data from Stanford Healthcare. It provides microbiological cultures from adult patients (≥18 years old) and includes key clinical data points relevant to studying antimicrobial resistance. The cohort construction involved the following features and processes:

Culture Types: Microbiological cultures were included, specifically urine, respiratory, and blood cultures.
Temporal Adjustment: The timing of culture orders was jittered for data privacy, protecting patient confidentiality while retaining meaningful temporal relationships.
Culture Positivity: Each culture is flagged as either positive or negative, indicating whether an organism was identified. Cultures flagged as negative are represented by a null value in the susceptibility field.
Organism Identification and Susceptibility: For positive cultures, the identified organism and its antibiotic susceptibility are recorded.
Susceptibility values were categorized using the following logic:
  NULL: The original susceptibility was NULL, indicating the culture was not positive (e.g., no growth).
  Susceptible: Includes values such as Susceptible, Negative, or Not Detected.
  Resistant: Includes values such as Resistant, Non Susceptible, Detected, or Positive.
  Intermediate: Includes values such as Intermediate or Susceptible - Dose Dependent.
  Inconclusive: Includes values such as No Interpretation, Not done, Inconclusive, or See Comment.
  Synergism: Includes values such as Synergy and No Synergy.
Antibiotic Standardization: Antibiotic names were cleaned and standardized to the generic form for consistency in analysis, allowing for accurate comparisons across records.
Antibiotic Susceptibility: Detailed susceptibility data are available for 55 different antibiotics, providing a robust framework for analyzing antimicrobial resistance patterns.

The cohort was generated through a systematic, multi-step process to ensure high-quality data:

Filtering for Clinical Relevance: Microbiological cultures associated with significant clinical outcomes were selected to focus on cases with actionable insights.
Adult Patient Restriction: The dataset was limited to adult patients (≥18 years old) using demographic data.
Exclusion Criteria: Patients with prior microbiological cultures within two weeks before the current culture were excluded to avoid overlapping data and ensure distinct clinical events.
Identification of Culture Positivity: Positivity was determined based on the presence of susceptibility results in the corresponding records.

This rigorous cohort selection process ensures that the ARMD dataset is well-suited for research on antimicrobial resistance, supporting clinical and epidemiological studies aimed at improving antimicrobial stewardship and treatment outcomes.
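
The susceptibility categorization described above can be sketched as a simple lookup table. This is an illustrative sketch, not the dataset authors' code: the function name `categorize_susceptibility` and the table name are hypothetical, and because the description only lists "values such as", the table is probably incomplete. Matching is done on the whole normalized string, so "Not Detected" is never confused with "Detected".

```python
from typing import Optional

# Raw value -> category, using only the example values listed in the
# description. The real data may contain values not covered here.
CATEGORY_BY_RAW_VALUE = {
    "susceptible": "Susceptible",
    "negative": "Susceptible",
    "not detected": "Susceptible",
    "resistant": "Resistant",
    "non susceptible": "Resistant",
    "detected": "Resistant",
    "positive": "Resistant",
    "intermediate": "Intermediate",
    "susceptible - dose dependent": "Intermediate",
    "no interpretation": "Inconclusive",
    "not done": "Inconclusive",
    "inconclusive": "Inconclusive",
    "see comment": "Inconclusive",
    "synergy": "Synergism",
    "no synergy": "Synergism",
}


def categorize_susceptibility(raw: Optional[str]) -> Optional[str]:
    """Map a raw susceptibility string to one of the described categories.

    None stays None (the culture was not positive). Values missing from the
    table raise ValueError so that gaps are noticed instead of being
    silently misclassified.
    """
    if raw is None:
        return None
    key = " ".join(raw.split()).lower()  # collapse whitespace, ignore case
    try:
        return CATEGORY_BY_RAW_VALUE[key]
    except KeyError:
        raise ValueError(f"unmapped susceptibility value: {raw!r}") from None
```

Raising on unmapped values is a design choice for this sketch; a pipeline that prefers to keep going could instead return a placeholder category and log the value.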

Implied Susceptibility

The Implied Susceptibility table is a derived dataset created to provide inferred insights into antibiotic susceptibility patterns based on predefined relationships between antibiotics. This table captures cases where susceptibility to one antibiotic can imply susceptibility or resistance to another, based on established microbiological and pharmacological principles. The table is designed to enhance the interpretability of susceptibility data by incorporating implied relationships between antibiotics, which can be critical for guiding clinical decision-making and understanding resistance patterns. Additionally, we share the rules applied to derive these implied relationships, providing transparency and enabling researchers to understand and reproduce the logic behind the inferred data.

De-Identification

To ensure patient privacy and comply with data-sharing policies, the ARMD employs the following de-identification measures:

Unique Identifiers: Each patient and culture order is assigned a unique, randomly generated identifier (anon_id and order_proc_id_coded). These identifiers are consistent across the dataset and allow linkage between associated data elements while preserving anonymity.
Temporal De-Identification: Dates and times are not included in their original format. Instead, all timestamps (e.g., order_time_jittered_utc) are jittered randomly to maintain temporal relationships without revealing exact times. The jittering process ensures the dataset retains analytical utility while removing direct identifiers.
Age Censoring: To further ensure anonymity, patient ages are categorized into predefined age bins (e.g., 18–24, 25–34, etc.), with all patients aged 89 or older grouped into a single category (90+). This approach prevents re-identification of individuals based on age outliers.
Gender Encoding: Gender is recorded as binary values (0 or 1) without defining which value corresponds to male or female, eliminating any interpretative bias and enhancing privacy.
Exclusion of Direct Identifiers: No direct patient identifiers (e.g., names, medical record numbers) are included in the dataset. All demographic and clinical details are provided in a de-identified format.

Ethical Approval and Patient Consent

This study was approved by the Stanford University Institutional Review Board (IRB) under eProtocol #70466. The IRB determined the study involves minimal risk, and patient consent was waived due to the use of de-identified retrospective data.
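
As an illustration of the temporal de-identification described above, the sketch below shows one way jittering can hide exact times while keeping temporal relationships intact. The description does not say how ARMD draws its jitter; this sketch assumes a single random offset per patient, a common date-shifting approach that preserves the intervals between one patient's events. The function names and the 30-day window are hypothetical.

```python
import random
from datetime import datetime, timedelta


def make_patient_offsets(anon_ids, max_days=30, seed=None):
    """Draw one random time shift per patient.

    The +/- max_days window is an assumption made for illustration; the
    dataset description does not state the jitter range.
    """
    rng = random.Random(seed)
    span_minutes = max_days * 24 * 60
    # Sort the unique ids so a given seed always yields the same offsets.
    return {
        anon_id: timedelta(minutes=rng.randint(-span_minutes, span_minutes))
        for anon_id in sorted(set(anon_ids))
    }


def jitter_time(order_time: datetime, anon_id, offsets) -> datetime:
    """Shift a timestamp by its patient's offset.

    Every event for one patient moves by the same amount, so the gaps
    between that patient's events are unchanged.
    """
    return order_time + offsets[anon_id]
```

With this scheme, two culture orders placed 14 days apart for the same anon_id remain 14 days apart after shifting, while comparisons of absolute dates across different patients are no longer meaningful.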
Created:
2025-10-22