Twelve-Year Epidemiological Trends, Toxin Characterization, and Bacterial Vaginosis Associations of Enteric Pathogens Detected by Multiplex PCR
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/z5gpthkmxm
Description:
This collection contains five de-identified Excel workbooks (Datasets 1–5) plus two CSV key‐files, documenting molecular diagnostic results, demographic metadata, and derived analytics for gastrointestinal (GIT) pathogen testing at Medical Diagnostic Laboratories LLC. The date range spans January 2014 to March 2025. All direct patient and sample identifiers (MDLNo and Patient-ID) have been replaced with anonymized codes.
Researchers may reproduce published analyses (χ² cross-tabulations, Poisson harmonic regression, STL decomposition, co-occurrence networks, and machine‐learning models) from Osei Sekyere et al. (2025, bioRxiv). This enables studies of prevalence, seasonality, and predictive modeling across an 11+ year period.
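As an illustration of the χ² cross-tabulations mentioned above, the following is a minimal sketch in plain Python that compares observed counts in a contingency table with the counts expected under independence. The function name chi_square_statistic is hypothetical and the counts are made up; neither comes from the datasets or from the published analysis code.

```python
def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Made-up 2 x 2 table: rows = two patient groups, columns = (positive, negative).
example_counts = [[30, 70], [45, 55]]
example_stat, example_dof = chi_square_statistic(example_counts)
print(example_stat, example_dof)
```

In practice a p-value would be read from the χ² distribution with the returned degrees of freedom, for example with a statistics library.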
2. File Contents
Dataset 1. GIT-No-toxins_deidentified.xlsx
Sheet: Dataset-No-toxins
Description: All multiplex GIT panel PCR tests excluding toxin genes. Each row is one test on a unique or repeat sample. The raw date‐collected is omitted (monthly aggregation available in Dataset 5).
Dataset 2. CD-Toxins_deidentified.xlsx
Sheet: C.difficile+toxins
Description: Toxin subtyping data for C. difficile. One molecular test per row; includes both species presence and toxin gene presence.
Dataset 3. Ecoli-Shigella-toxins_deidentified.xlsx
Sheet: E.coli+Shigella+toxins
Description: Subtyping assays for E. coli (O157/Shiga‐toxin) and Shigella spp. Each row is one assay result.
Dataset 4. Statistics & ML_deidentified.xlsx
Sheet(s): (preserve original sheet names, e.g., ML-Results, Feature-Importances, etc.)
Description: Consolidated analytic outputs for machine learning models (Logistic Regression, Random Forest, XGBoost). Contains per-sample predictions (anonymized), model performance metrics, and feature‐importance data. Use to reproduce Figures D1–D5 and associated Supplementary Figures.
Dataset 5. Seasonality-Temporal dynamics_deidentified.xlsx
Sheet: Monthly_Positive_Counts
Description: Derived monthly aggregation and summary statistics for Poisson harmonic regression and STL decomposition. Reproduce all seasonal plots and numeric tables.
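To make the Poisson harmonic regression concrete, here is a minimal sketch in plain Python. It assumes a single annual harmonic, log(mu_t) = b0 + b1*sin(2*pi*t/12) + b2*cos(2*pi*t/12), fitted by Newton's method to a list of monthly counts. The function names and the synthetic counts are hypothetical; the model specification used in the published analysis may differ, and no column names from Dataset 5 are assumed.

```python
import math


def harmonic_row(month_index, period=12):
    """Design row [1, sin, cos] for one monthly time point."""
    angle = 2.0 * math.pi * month_index / period
    return [1.0, math.sin(angle), math.cos(angle)]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_poisson_harmonic(counts, period=12, iterations=25):
    """Fit the one-harmonic Poisson model; returns [b0, b1, b2]."""
    rows = [harmonic_row(t, period) for t in range(len(counts))]
    # Start from an intercept-only model at the log of the mean count.
    beta = [math.log(sum(counts) / len(counts)), 0.0, 0.0]
    for _ in range(iterations):
        grad = [0.0] * 3
        hess = [[0.0] * 3 for _ in range(3)]
        for x, y in zip(rows, counts):
            mu = math.exp(sum(b * xi for b, xi in zip(beta, x)))
            for i in range(3):
                grad[i] += (y - mu) * x[i]          # score: X'(y - mu)
                for j in range(3):
                    hess[i][j] += mu * x[i] * x[j]  # information: X' diag(mu) X
        step = solve_linear(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Synthetic monthly counts over four years (made up, not taken from Dataset 5).
synthetic_counts = [
    round(math.exp(3.0 + 0.5 * math.sin(2 * math.pi * t / 12)
                   + 0.3 * math.cos(2 * math.pi * t / 12)))
    for t in range(48)
]
b0, b1, b2 = fit_poisson_harmonic(synthetic_counts)
seasonal_amplitude = math.hypot(b1, b2)  # size of the seasonal swing on the log scale
print(b0, b1, b2, seasonal_amplitude)
```

The amplitude sqrt(b1² + b2²) summarizes the strength of seasonality on the log scale; a dedicated statistics package would additionally provide standard errors and model diagnostics.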
3. Data Provenance & Methods
Laboratory: Medical Diagnostic Laboratories LLC, Hamilton Township, NJ, USA.
Time Period: January 2014–March 2025.
Assays:
Multiplex GIT panel (bacterial, viral, protozoal pathogens; no toxin genes).
C. difficile toxin PCR for toxin A/B genes.
E. coli/Shigella subtyping (rfbA, stx1, stx2 genes
License:
Data are CC0 (public domain), permitting unrestricted reuse for research and publication.
Contact: Dr. John Osei, jod14139@yahoo.com, for questions about data.
Keywords: Gastrointestinal pathogens; multiplex PCR; Clostridioides difficile; Escherichia coli; Shigella; seasonality; Poisson regression; STL decomposition; machine learning; de-identified clinical data
Created:
2025-06-05



