DRIAMS: Database of Resistance Information on Antimicrobials and MALDI-TOF Mass Spectra
Source: DataONE. Updated: 2025-08-19. Indexed: 2025-08-23.
Download link:
https://search.dataone.org/view/sha256:7d871c98194c818e1652e32deb4156965b2e25f9c5b6ad9f8242c680e9bdffc4
Description:
Early administration of effective antimicrobial treatments is critical for the outcome of infections and the prevention of treatment resistance. Antimicrobial resistance testing enables the selection of optimal antibiotic treatments, but current culture-based techniques can take up to 72 hours to generate results. We have developed a novel machine learning approach to predict antimicrobial resistance directly from MALDI-TOF mass spectra profiles of clinical samples. We trained calibrated classifiers on a newly created, publicly available database of mass spectra profiles from the most clinically relevant isolates, with linked antimicrobial susceptibility phenotypes. The dataset combines more than 300,000 mass spectra with more than 750,000 antimicrobial resistance phenotypes from four medical institutions. Validation against a panel of clinically important pathogens, including Staphylococcus aureus, Escherichia coli, and Klebsiella pneumoniae, resulted in AUROC values of 0.80, 0.74, and 0.74...

The DRIAMS dataset is a resource intended for antimicrobial resistance prediction from real-world, routine clinical MALDI-TOF mass spectra. It comprises four subdatasets collected at different medical institutions across Switzerland.
For each site, the data consist of MALDI-TOF mass spectra in the form of `.txt` files and a metadata file.
(i) The metadata, including the species and antimicrobial resistance corresponding to each spectrum, is stored in the `id` folder.
(ii) The remaining folders store the MALDI-TOF mass spectra at various stages of preprocessing: `raw` holds all spectra as extracted from the MALDI-TOF MS instrument; `preprocessed` holds all spectra after the application of an established preprocessing pipeline; and `binned_6000` holds all spectra after the same preprocessing pipeline followed by binning along the mass-to-charge-ratio axis with a bin size of 3 Da, resulting in 6000 feature bins.
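The binning step described in (ii) can be illustrated with a minimal sketch. It is not the pipeline used to produce `binned_6000`: the function name `bin_spectrum`, the choice to sum intensities within a bin, and the lower edge of 2000 Da are all assumptions made for illustration. Only the bin size (3 Da) and the number of bins (6000) come from the description above.

```python
BIN_SIZE_DA = 3.0      # bin width, from the dataset description
N_BINS = 6000          # number of feature bins, from the dataset description
MZ_START_DA = 2000.0   # ASSUMPTION: lower edge of the binned range, not stated above


def bin_spectrum(mz_values, intensities,
                 mz_start=MZ_START_DA, bin_size=BIN_SIZE_DA, n_bins=N_BINS):
    """Sum intensities into fixed-width bins along the m/z axis.

    Hypothetical helper: peaks outside [mz_start, mz_start + n_bins * bin_size)
    are dropped, and intensities that fall into the same bin are added together.
    """
    if len(mz_values) != len(intensities):
        raise ValueError("mz_values and intensities must have the same length")
    binned = [0.0] * n_bins
    for mz, intensity in zip(mz_values, intensities):
        index = int((mz - mz_start) // bin_size)  # floor division picks the bin
        if 0 <= index < n_bins:
            binned[index] += intensity
    return binned


# Toy spectrum: the first two peaks share the first 3 Da bin.
features = bin_spectrum([2000.5, 2002.9, 2003.1], [1.0, 2.0, 4.0])
```

With 6000 bins of 3 Da each, the feature vector spans 18,000 Da regardless of where the range starts, so every spectrum maps to a vector of the same length.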
For details on the dataset extraction, quality control, preprocessing an...

We recommend using maldi-learn (https://github.com/BorgwardtLab/maldi-learn), our Python package for MALDI-TOF preprocessing and machine learning analysis, to load and analyse DRIAMS data.
The GitHub package comes with a detailed README file that covers installation and usage examples. To use this package, the locations of the data files and the folder structure must be preserved. Note that all four downloaded data packages should be kept in one folder, which serves as the DRIAMS root folder and must be set as the DRIAMS_ROOT path in the .env file.
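Before loading data, it can help to confirm that the root folder is configured and that each site folder has the expected layout. The sketch below uses only the Python standard library and does not use the maldi-learn API. The helper names `read_driams_root` and `missing_folders` are hypothetical, and the `.env` parsing is a simplified assumption (plain `KEY=value` lines).

```python
import os
from pathlib import Path

# Subfolders named in the dataset description.
EXPECTED_SUBFOLDERS = ("binned_6000", "id", "preprocessed", "raw")


def read_driams_root(env_file=".env"):
    """Return DRIAMS_ROOT from the environment, else from a simple .env file."""
    value = os.environ.get("DRIAMS_ROOT")
    env_path = Path(env_file)
    if value is None and env_path.is_file():
        for line in env_path.read_text(encoding="utf-8").splitlines():
            key, sep, raw = line.partition("=")
            if sep and key.strip() == "DRIAMS_ROOT":
                value = raw.strip().strip("'\"")  # drop optional quotes
                break
    if not value:
        raise RuntimeError("DRIAMS_ROOT is not set in the environment or in the .env file")
    return Path(value).expanduser()


def missing_folders(root):
    """Map each DRIAMS-* site folder under root to the expected subfolders it lacks."""
    report = {}
    for site in sorted(Path(root).glob("DRIAMS-*")):
        if site.is_dir():
            report[site.name] = [
                name for name in EXPECTED_SUBFOLDERS if not (site / name).is_dir()
            ]
    return report


# Example usage (requires DRIAMS_ROOT to be configured):
#   for site, missing in missing_folders(read_driams_root()).items():
#       print(site, "OK" if not missing else f"missing: {missing}")
```

Site folders are discovered by the `DRIAMS-*` pattern rather than hard-coded, so the check makes no assumption about site names beyond the shared prefix.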
The folder structure obtained after download is the following:
DRIAMS
├── DRIAMS-A
│   ├── binned_6000
│   ├── id
│   ├── preprocessed
│   └── raw
├── DRIAMS-B
│   ├── binned_6000
│   ├── id
│   ├── preprocessed
│   └── raw
├── DRIAMS-C
│   ├── binned_6000
│   ├── id
│   ├── preprocessed
...

# DRIAMS
## Database of ResIstance against Antimicrobials with MALDI-TOF Mass Spectrometry
For details on the dataset extraction, quality control, preprocessing and properties, please refer to the Methods section in the preprint corresponding to the publication: https://doi.org/10.1101/2020.07.30.228411
Created: 2025-08-20
The above content was collected and summarized by 遇见数据集.



