ZeroCostDL4Mic / DeepBacs - Multi-label U-Net training dataset (Bacillus subtilis) and pretrained model
Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-28.
Download link:
https://zenodo.org/record/5639253
Resource description:
Training and test images of live B. subtilis cells expressing FtsZ-GFP for the task of segmentation. Additional information can be found on the associated GitHub wiki.
The example shows the fluorescence widefield image of live B. subtilis cells expressing FtsZ-GFP, the manually annotated instance segmentation mask, and the corresponding 2-label semantic segmentation mask used for model training.
### Training and test dataset
Data type: Paired fluorescence and segmented mask images
Microscopy data type: 2D widefield images (fluorescence)
Microscope: Custom-built 100x inverted microscope bearing a 100x TIRF objective (Nikon CFI Apochromat TIRF 100XC Oil); images were captured on a Prime BSI sCMOS camera (Teledyne Photometrics)
Cell type: B. subtilis strain SH130 grown under agarose pads
File format: .tiff (8-bit)
Image size: 1024 x 1024 px (pixel size: 65 nm)
Image preprocessing: Images were denoised using PureDenoise, and the resulting 32-bit images were converted to 8-bit images after normalizing to the 1st and 99.98th percentiles. Images were manually annotated using the Labkit Fiji plugin, and mask images with labeled cytosol and cell boundaries were created using a custom Fiji macro (see our GitHub repository).
### Multi-label U-Net model
The U-Net (2D) multilabel model was generated using the ZeroCostDL4Mic platform (Chamier & Laine et al., 2021). It was trained from scratch for 200 epochs on 733 paired image patches (image dimensions: 1024 x 1024 px; patch size: 256 x 256 px) with a batch size of 8 and a categorical_crossentropy loss function, using the U-Net (2D) multilabel ZeroCostDL4Mic notebook (v 1). Key Python packages used include tensorflow (v 0.1.12), Keras (v 2.3.1), numpy (v 1.19.5), and cuda (v 11.1.105). Training was accelerated using a Tesla P100 GPU.
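The 32-bit to 8-bit conversion described under image preprocessing can be illustrated with a short NumPy sketch. This is only an assumption about how such a percentile normalization might be written, not the authors' Fiji workflow: the function name `normalize_to_uint8` is hypothetical, the input is synthetic noise standing in for a PureDenoise output, and the exact rounding and clipping used for the published images are not stated in the record.

```python
import numpy as np


def normalize_to_uint8(image, low_pct=1.0, high_pct=99.98):
    """Rescale an image to 8-bit using percentile clipping.

    Pixels at or below the low percentile map to 0, pixels at or above
    the high percentile map to 255, and values in between scale linearly.
    (Hypothetical helper; percentiles default to the values in the record.)
    """
    image = np.asarray(image, dtype=np.float32)
    low, high = np.percentile(image, [low_pct, high_pct])
    if high <= low:
        # Flat image: avoid dividing by zero and return all zeros.
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = np.clip((image - low) / (high - low), 0.0, 1.0)
    return np.round(scaled * 255.0).astype(np.uint8)


# Synthetic stand-in for a denoised 32-bit frame (not real microscopy data).
rng = np.random.default_rng(0)
denoised = rng.normal(loc=500.0, scale=50.0, size=(1024, 1024)).astype(np.float32)
image_8bit = normalize_to_uint8(denoised)
```

Clipping at a high percentile rather than at the maximum keeps a few very bright pixels from compressing the rest of the intensity range, which is presumably why the record reports an asymmetric 1% / 99.98% pair.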
### Authors and contact
Author(s): Mia Conduit¹,², Séamus Holden¹,³
Contact email: Seamus.Holden@newcastle.ac.uk
Affiliation:
1) Centre for Bacterial Cell Biology, Biosciences Institute, Newcastle University, NE2 4AX UK
2) ORCID: 0000-0002-7169-907X
### Associated publications
Whitley et al., 2021, Nature Communications, https://doi.org/10.15252/embj.201696235
Created: 2023-06-28
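Dataset introduction

Background overview
This dataset contains training and test images of live B. subtilis cells expressing FtsZ-GFP for the segmentation task. The images were acquired by 2D widefield fluorescence microscopy, measure 1024 x 1024 px, and are stored as 8-bit .tiff files. The dataset also includes a multi-label U-Net model trained on the ZeroCostDL4Mic platform using 733 image patches, for 200 epochs with a batch size of 8, accelerated on a Tesla P100 GPU.
The summary above was collected and generated by 遇见数据集.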
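The record states that the model was trained on 256 x 256 px patches taken from 1024 x 1024 px images. As a rough illustration of that relationship, the sketch below tiles one image into non-overlapping patches with NumPy. The helper name `tile_patches` is hypothetical, the input is a synthetic array, and non-overlapping tiling is an assumption: the record does not say how the ZeroCostDL4Mic notebook selects its patches.

```python
import numpy as np


def tile_patches(image, patch_size=256):
    """Split a 2D image into non-overlapping square patches.

    Assumes both image dimensions are exact multiples of patch_size.
    Returns an array of shape (n_patches, patch_size, patch_size),
    ordered row by row across the image.
    """
    height, width = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be multiples of patch_size")
    rows, cols = height // patch_size, width // patch_size
    # View the image as a (rows, patch, cols, patch) grid, then move the two
    # grid axes to the front so each patch occupies the last two axes.
    grid = image.reshape(rows, patch_size, cols, patch_size)
    return grid.transpose(0, 2, 1, 3).reshape(-1, patch_size, patch_size)


# Synthetic 1024 x 1024 array with unique values so patch positions can be checked.
image = np.arange(1024 * 1024, dtype=np.uint32).reshape(1024, 1024)
patches = tile_patches(image)
print(patches.shape)
```

Under this assumption each image yields 16 patches, and the fluorescence image and its mask would need to be tiled identically to stay paired. Since 733 is not a multiple of 16, the published patch set cannot be an exhaustive non-overlapping tiling of whole images, so the actual patching scheme presumably differs from this sketch.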



