Deep learning software and revised 2D model to segment bone in micro-CT scans

NIAID Data Ecosystem, indexed 2026-05-10

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.4j0zpc8qq

Description:

Deep learning (DL) enables automated bone segmentation in micro-CT datasets but can struggle to generalize across developmental stages, anatomical regions, and imaging conditions. We present BP-2D-03, a revised 2D Bone-Pores segmentation model trained on a new dataset comprising 20 micro-CT scans spanning five mammalian species and 142,960 image patches. To handle the substantially larger and more varied dataset, we developed a new DL software interface with modules for training (“BONe DLFit”), prediction (“BONe DLPred”), and evaluation (“BONe IoU”). These tools address issues with prior pipelines, such as slice-level data leakage, high memory usage, and limited multi-GPU support. BONe’s performance was evaluated through three complementary analyses. First, 5-fold cross-validation of the baseline model (U-Net with ResNet-18 backbone and 256-px patches) assessed the effect of dataset composition on model robustness and stability, showing generally high mean Intersection-over-Union (IoU) across folds and replicates. Second, 30 benchmarking experiments tested how model architecture, encoder backbone, and patch size influence segmentation IoU and computational efficiency. U-Net and UNet++ architectures with simple convolutional backbones (e.g., ResNet-18) achieved the highest predictivity and best performance-efficiency tradeoffs, with top models reaching mean IoU values of ~0.97, whereas transformer-based and atrous-convolution models benefited from larger patches but still underperformed in mean IoU. Third, cross-platform experiments confirmed that BONe produces stable results across different hardware configurations, operating systems, and implementations (Avizo 3D and standalone). Together, these analyses demonstrate that BONe delivers robust baseline performance and reproducible results across platforms.

Methods

Dataset collection

The deep learning dataset was assembled from three sources (Table 1).
First, we included 11 micro-CT scans of long bones from the North American river otter (Lontra canadensis) that were previously analyzed by Lee et al. (2025). Second, we downloaded three scans of long bones from capybara (Hydrochoerus hydrochaeris; AMNH:Mammals:M-206440), leopard (Panthera pardus; AMNH:Mammals:M-89009), and sea otter (Enhydra lutris; ZMB:Mam:30740) from MorphoSource. Third, we collected six new micro-CT scans from a sample of laboratory mice (Mus musculus), described below.

Forty male C57BL/6 mice (4 weeks old) were purchased from Charles River Laboratory (Wilmington, MA, USA) and maintained for 25 weeks. After the mice were euthanized, the limb bones (humerus, radius, ulna, femur, tibia, and fibula) were dissected, fixed in 10% neutral buffered formalin for 24 hours, and stored in 70% ethanol. All animal care was conducted in accordance with established guidelines, and all protocols used were approved by Midwestern University’s Institutional Animal Care and Use Committee (IACUC #AZ-4205).

Imaging of the mouse sample

Micro-CT scanning was performed on a Nikon XT H 225 ST (Nikon Metrology Inc., Brighton, MI, USA) at 120–160 kV, 58–112 µA, and 9.1–11.3 µm isotropic voxel size. Each scan consisted of the left humerus and femur from two individuals. Out of the 20 scans that were collected, six were selected for the current deep learning dataset (Table 1).

Table 1. Micro-CT scans included in the deep learning dataset

Scan ID                    Bones   2D Tiles  Voxel size (µm)  Source
1R 1U                      HF      1,792     11.3             1
2R 2U                      HF      2,112     9.1              1
5R 5U                      HF      2,048     9.1              1
7R 7U                      HF      2,048     9.1              1
12R 12U                    HF      2,048     9.1              1
19R 19U                    HF      1,920     9.1              1
AMNH:Mammals:M-89009       H       4,250     66.8             2
AMNH:Mammals:M-206440      Mixed   1,672     120.7            3
OMNH:Mammals:44262         HRU     1,662     50.0             4
OMNH:Mammals:53994         FTFi    2,216     50.0             4
OMNH:Mammals:53994         HRU     1,809     50.0             4
UAM:Mamm:24789             FTFi    2,098     50.0             4
UAM:Mamm:67696             HF      1,623     50.0             4
UAM:Mamm:67696             TFiRU   2,321     50.0             4
UF:Mammals:23593 / 24550   HF      1,755     50.0             4
UF:Mammals:31151           HRU     1,660     50.0             4
UWBM:Mamm:78743            FTFi    2,150     50.0             4
UWBM:Mamm:81969            FTFi    2,195     50.0             4
UWBM:Mamm:81969            HRU     1,995     50.0             4
ZMB:Mam:30740              HRU     3,609     30.0             5

Bone abbreviations: F=femur; Fi=fibula; H=humerus; R=radius; T=tibia; U=ulna
Museum abbreviations: AMNH=American Museum of Natural History; OMNH=Sam Noble Oklahoma Museum of Natural History; UAM=University of Alaska Museum of the North; UF=Florida Museum of Natural History; UWBM=University of Washington, Burke Museum; ZMB=Museum für Naturkunde
Source abbreviations: 1=doi.org/10.5061/dryad.4j0zpc8qq; 2=ark:/87602/m4/430024; 3=ark:/87602/m4/598442; 4=doi.org/10.5061/dryad.b2rbnzsq4; 5=ark:/87602/m4/M70721

Preparing the reference masks

The scans were processed in Avizo 3D 2024.2 following an established segmentation protocol (Lee et al., 2025). Bone tissue and pores were identified using Otsu thresholding, filtering, and ambient occlusion, with manual corrections where the algorithms misclassified deep concavities.

New Deep Learning Modules for Avizo

We developed three Python-based deep learning modules for Avizo 3D 2024.2. “BONe DLFit” is a configurable model-fitting module that supports up to 20 scan–reference pairs, performs training/validation splits at the scan level, and enables single- or multi-GPU training through PyTorch’s DataParallel.
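
The scan-level split mentioned above is what prevents slice-level data leakage: all slices of a scan go to either training or validation, never both. The following is a minimal sketch of that idea, not the BONe DLFit implementation. The function name, the scan names, and the rounding rule are hypothetical and chosen only for illustration.

```python
import random


def scan_level_split(tiles_by_scan, val_fraction, seed=42):
    """Assign whole scans (never individual slices) to training or validation.

    tiles_by_scan maps a scan ID to an iterable of its slice indices.
    Illustrative only; the rounding rule is an assumption.
    """
    scan_ids = sorted(tiles_by_scan)      # fixed order before shuffling
    rng = random.Random(seed)             # seeded for reproducibility
    rng.shuffle(scan_ids)
    n_val = max(1, round(len(scan_ids) * val_fraction))
    val_ids = set(scan_ids[:n_val])
    train = [(s, t) for s in scan_ids if s not in val_ids
             for t in tiles_by_scan[s]]
    val = [(s, t) for s in scan_ids if s in val_ids
           for t in tiles_by_scan[s]]
    return train, val


# Toy data: four hypothetical scans with a few slices each.
tiles_by_scan = {
    "scan_A": range(4),
    "scan_B": range(3),
    "scan_C": range(5),
    "scan_D": range(2),
}
train, val = scan_level_split(tiles_by_scan, val_fraction=0.25)
train_scans = {s for s, _ in train}
val_scans = {s for s, _ in val}
```

Because the split is made over scan IDs, `train_scans` and `val_scans` cannot overlap. A validation fraction of 0.1875 corresponds to 3 of every 16 scans, which is the 81.25:18.75 ratio reported for the baseline model.
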
Users may choose among 2D, 2.5D, or 3D models, nine architectures, and 58 backbones (via segmentation_models_pytorch), with options for patch-based sampling, Z-score or min-max normalization, augmentation (flips, 90° rotations, brightness/contrast adjustments), and custom hyperparameters. The backend computes normalization statistics using an external Python interpreter, initializes the model with user-specified weights, and trains using the Adam optimizer, Jaccard loss, single-cycle cosine-annealing learning-rate scheduling, and automatic mixed precision. Training and validation proceed in an epoch-based loop with on-the-fly augmentation, GPU/VRAM monitoring, and automatic saving of improved model weights.

“BONe DLPred” performs inference on scans using PyTorch-formatted (PTH) models. The module loads the PTH model with its embedded metadata (architecture, backbone, normalization type, etc.), chunks the input volume in 2D, 2.5D, or 3D, normalizes the chunks, runs prediction in parallel batches (single- or multi-GPU), and reconstructs full-resolution probability maps using overlap-aware merging. Final voxel labels are assigned using user-specified confidence thresholds, and performance/benchmark statistics are reported.

“BONe IoU” replaces an earlier TCL-based IoU calculator with a faster Python module that automatically computes class-wise and mean IoU. The calculation is GPU-accelerated when CUDA is available.

Standalone versions of “BONe DLFit”, “BONe DLPred”, and “BONe IoU” were developed for users without Avizo. These retain the same interfaces and functionality but operate on folders of TIFF images and run in a packaged Python 3.12.11 environment. Model weights remain fully interchangeable between the Avizo and standalone versions.

Fitting the baseline model: BP-2D-03

The baseline model (BP-2D-03: U-Net with ResNet-18 backbone, 2D fitting mode, 256-px patch size, and random seed 42) was fitted on Training/Validation Pool 1 (Table 2).
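
The single-cycle cosine-annealing schedule used by BONe DLFit can be sketched with the standard cosine formula. The initial learning rate of 0.001 and the 25 epochs are the baseline settings reported in this section. The per-epoch stepping, the minimum learning rate of 0, and the function name are assumptions made for illustration. The module may step the schedule per batch or use a different floor.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """Learning rate at `step` within a single cosine cycle of `total_steps`.

    Standard formula: lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T)).
    lr_min = 0 and per-epoch stepping are assumptions, not reported settings.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)   # clamp to [0, 1]
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# One value per epoch boundary for a 25-epoch run (epochs 0..25).
schedule = [cosine_annealing_lr(epoch, 25) for epoch in range(26)]
```

The schedule starts at the initial learning rate and decays smoothly toward the floor with no restarts, which is what "single-cycle" refers to.
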
Model fitting was performed on a high-performance workstation (“Jarvis”: dual RTX PRO 6000 Blackwell Max-Q and 512 GB RAM). Four random patches were extracted from each 2D tile (slice), resulting in a dataset comprising 120,520 training patches and 22,440 validation patches (scan-level split of 81.25:18.75). Data augmentation was enabled and included random flips, rotations in 90° increments, crops, and domain-shift transformations. Z-score normalization was performed on the patches. The model was initialized with ImageNet-trained weights. Training proceeded for 25 epochs using a batch size of 64, an initial global learning rate of 0.001 with cosine-annealing scheduling, Adam optimizer, Jaccard loss as the optimization objective, and IoU as the evaluation metric. To reduce fitting time, dual-GPU mode was enabled.

Table 2. Overview of the 20 scans used for 5-fold cross-validation

Order  Scan ID                       Test Fold
1      UF_Mammals_31151_HRU          1
2      OMNH_Mammals_44262_HRU        1
3      2R_2U_HF                      1
4      OMNH_Mammals_53994_HRU        1
5      UWBM_Mamm_81969_HRU           2
6      UWBM_Mamm_78743_FTFi          2
7      12R_12U_HF                    2
8      AMNH_Mammals_M-206440_mixed   2
9      OMNH_Mammals_53994_FTFi       3
10     UWBM_Mamm_81969_FTFi          3
11     UF_Mammals_23593-24550_HF     3
12     UAM_Mamm_67696_HF             3
13     19R_19U_HF                    4
14     1R_1U_HF                      4
15     AMNH_Mammals_M-89009_F        4
16     7R_7U_HF                      4
17     UAM_Mamm_24789_FTFi           5
18     5R_5U_HF                      5
19     ZMB_Mam_30740_HRU             5
20     UAM_Mamm_67696_TFiRU          5

Model Evaluation

Model performance was assessed through 5-fold cross-validation, repeated across three random seeds to evaluate generalization and stability, producing 15 total models whose mIoU scores were averaged across folds. Additional experiments tested 30 combinations of architectures, backbones, and patch sizes using the most stable cross-validation split, with training conditions held constant except when VRAM limits or convergence issues required adjustments to batch size or learning rate.

Related paper

Lee, A. H., Moore, J. M., Vera Covarrubias, B., and Lynch, L. M. (2025). Segmentation of cortical bone, trabecular bone, and medullary pores from micro-CT images using 2D and 3D deep learning models. Anat Rec, 1–23. doi: 10.1002/ar.25633
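
The class-wise and mean IoU that “BONe IoU” reports, and that serves as the evaluation metric above, can be illustrated in plain Python. This is a minimal sketch of the standard definition (intersection divided by union, per class), not the module's GPU-accelerated code. The function names, the three integer labels, and the choice to skip classes absent from both inputs are assumptions.

```python
def classwise_iou(pred, ref, num_classes):
    """Per-class IoU for two equal-length sequences of integer labels.

    Returns None for a class that appears in neither sequence (assumed rule).
    """
    if len(pred) != len(ref):
        raise ValueError("pred and ref must have the same length")
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, r in zip(pred, ref):
        if p == r:
            inter[p] += 1      # voxel counted in both masks of class p
            union[p] += 1
        else:
            union[p] += 1      # predicted as p only
            union[r] += 1      # labelled as r only
    return [inter[c] / union[c] if union[c] else None
            for c in range(num_classes)]


def mean_iou(ious):
    """Average the IoU over classes that are present."""
    present = [v for v in ious if v is not None]
    return sum(present) / len(present)


# Toy example with three hypothetical labels (0, 1, 2) over six voxels.
ref = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = classwise_iou(pred, ref, num_classes=3)   # 1/3, 2/3, 1/2
miou = mean_iou(ious)                            # 0.5
```

In a real scan the two inputs would be the flattened predicted and reference label volumes. A Jaccard loss is commonly defined as one minus a differentiable version of this same ratio.
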

Created:

2026-02-04
