10 CLASS PREDICTION AND ALIGNMENT
Source: Zenodo. Updated 2026-05-16; indexed 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20241355
Description:
This code is preceded by DETECT_LEAVES found at: https://doi.org/10.5281/zenodo.20090333
Using LEAF_MASKS predicted by DETECT_LEAVES, this code predicts 10 classes of lobe and vein identity: the mid, distal, and proximal lobes; the mid, distal, and proximal primary veins; the mid, distal, and proximal secondary veins; and the petiolar junction. 10_class_prediction/ predicts these classes on unaligned leaf masks. The predicted petiolar junction and midvein are then used to align the leaves (pointing downwards, by ampelographic tradition). ALIGNED_10_class_predictions/ runs the 10-class model again on the aligned leaves, which improves performance. Code for the figures and tables is included.
10_class_prediction/                  # Prediction of 10 classes on unaligned leaf masks
├── 0A_training_masks.py              # Creates blade and vein masks to define features
├── 0B_training_features.py           # Using landmarks creates feature mask
├── 0C_training_ect.py                # Creates ECTs of leaf masks for training data
├── 0D_prepare_training_data.py       # Prepares train/val/test training data for YOLO model
├── 1_train_yolo.py                   # Trains yolo26n-seg.pt model
├── 2_bulk_ect_inference.py           # Creates ECTs from LEAF_MASKS to infer on
├── 3_bulk_inference.py               # Inference on LEAF_MASKS and their ECTs
├── 4_bulk_alignment.py               # Using petiolar junction and midveins, aligns all LEAF_MASKS and their ECTs
├── data/                             # necessary data inputs to run code
│   ├── current_model_training_data/  # Landmarks and leaves to discard for this model
│   └── previous_model_training_data/ # Previously inferred training data: https://zenodo.org/records/20090333
├── data.yaml                         # data.yaml for model training
├── outputs/                          # code outputs
│   ├── training_data/                # Preparation of training data for model
│   └── training_data_for_model/      # Training data in train/val/test form for YOLO
├── runs/                             # Training logs, metrics, performance of YOLO model, including best.pt
└── yolo26n-seg.pt                    # Initial starting model
LEAF_MASKS/               # Individually segmented leaf masks (unaligned)
├── ACADIA/               # https://zenodo.org/records/20091083
├── ALGERIA/              # https://zenodo.org/records/20091083
├── BALEARIC_ISLANDS/     # https://zenodo.org/records/20091083
├── CALIFORNIA/           # https://zenodo.org/records/20091083
├── CHAMBOURCIN/          # https://zenodo.org/records/20091083
├── COTTON/               # https://zenodo.org/records/20091083
├── GENEVA/               # https://zenodo.org/records/20091262
├── HORIZONxILLINOIS/     # https://zenodo.org/records/20091083
├── MISION/               # https://zenodo.org/records/20091083
├── TAMU/                 # https://zenodo.org/records/20091083
├── UCDAVIS/              # https://zenodo.org/records/20091083
├── VITIS_CROSSES/        # https://zenodo.org/records/20091083
└── WOLFSKILL/            # https://zenodo.org/records/20091262
MASTER_LEAF_METADATA.csv # Metadata associated with LEAF_MASKS
LEAF_ECT/
└── transformed_ect/      # ECT of LEAF_MASKS for inference
LEAF_LOBES/ # Inferred masks of LEAF_MASKS
LEAF_LOBES_OVERLAY/ # Inferred masks of LEAF_MASKS overlaid on original image
ALIGNED_LEAF_MASKS/ # aligned LEAF_MASKS: https://zenodo.org/records/20242260
ALIGNED_LEAF_ECT/ # ECTs aligned to match ALIGNED_LEAF_MASKS, for inference
ALIGNED_10_class_predictions/    # Prediction of 10 classes on aligned leaf masks
├── 0_prepare_training_data.py   # Prepares aligned training data
├── 1_train_aligned_yolo.py      # Trains model on aligned data
├── 2_bulk_aligned_inference.py  # Inference on ALIGNED_LEAF_MASKS
├── 3_figures_and_tables.py      # Prepares figures and tables
├── data/                        # Necessary inputs for training on aligned data
├── outputs/                     # Outputs of code, including figures and tables
├── runs/                        # Training logs, metrics, performance of YOLO model, including best.pt
└── yolo26n-seg.pt               # Initial starting model
ALIGNED_LEAF_LOBE_OVERLAYS/ # Aligned mask predictions overlaid on original image
ALIGNED_LEAF_LOBES/ # Aligned mask predictions
10_class_prediction/
Using leaf masks predicted by DETECT_LEAVES, this pipeline predicts 10 biologically interpretable classes corresponding to major lobes, primary veins, secondary veins, and the petiolar junction. Predictions are first performed on unaligned leaf masks, after which the inferred petiolar junction and midvein are used to rotate and align leaves into a standardized orientation. A second round of inference is then performed on aligned leaves, improving prediction consistency and overall model performance. The workflow integrates geometric topology, feature engineering, and instance segmentation into a fully reproducible pipeline for large-scale grapevine leaf morphometrics.
0A_training_masks.py — Generate lobe and vein features for training
Prepares training annotations and binary masks from manually traced leaf data and landmark annotations. Blade and vein polygon traces are loaded from text files and converted into rasterized masks using Matplotlib polygon filling to ensure robust handling of complex geometries. Landmark coordinates are linked to each leaf image and combined with blade and vein polygons into unified JSON annotation files. This script standardizes all source images, masks, and annotations into a consistent training dataset structure.
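As an illustration of turning a traced polygon into a raster mask, here is a minimal sketch. It is not the script's code: the text above mentions Matplotlib polygon filling, whereas this sketch swaps in a point-in-polygon test with matplotlib.path.Path.contains_points, which is simpler to show in a few lines. The function name polygon_to_mask and the toy square are hypothetical.

```python
import numpy as np
from matplotlib.path import Path


def polygon_to_mask(vertices, height, width):
    """Rasterize a polygon given as (x, y) pixel coordinates into a 0/1 mask.

    Hypothetical helper: Path treats the vertex list as a closed polygon.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    # Test pixel centres rather than pixel corners.
    centres = np.column_stack([xs.ravel() + 0.5, ys.ravel() + 0.5])
    inside = Path(np.asarray(vertices, dtype=float)).contains_points(centres)
    return inside.reshape(height, width).astype(np.uint8)


# Toy trace: an axis-aligned square from (2, 2) to (8, 8) on a 10 x 10 canvas.
square = [(2, 2), (8, 2), (8, 8), (2, 8)]
mask = polygon_to_mask(square, height=10, width=10)
```

A point-in-polygon test handles concave outlines such as deeply lobed blades, but it is slower than a renderer-based fill on large images.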
0B_training_features.py
Generates biologically meaningful feature masks from blade and vein annotations. Vein masks are skeletonized and converted into graph representations using NetworkX, allowing shortest-path tracing between landmarks and the petiolar junction. Primary and secondary veins are classified based on graph topology and assigned distinct color encodings. Blade regions corresponding to mid, distal, and proximal lobes are reconstructed by combining contour geometry with vein-derived boundaries. Final outputs include separate vein feature masks, blade feature masks, and tinted overlay visualizations for quality control.
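The graph step can be sketched as follows, assuming the skeleton is already available as a boolean array (the skeletonization itself is omitted). The helper names skeleton_to_graph and trace_vein, the 8-connectivity, and the Euclidean edge weights are assumptions made for this example; only nx.Graph and nx.shortest_path are NetworkX calls.

```python
import numpy as np
import networkx as nx


def skeleton_to_graph(skeleton):
    """Build an 8-connected pixel graph; nodes are (row, col) tuples."""
    rows, cols = np.nonzero(skeleton)
    pixels = set(zip(rows.tolist(), cols.tolist()))
    graph = nx.Graph()
    graph.add_nodes_from(pixels)
    for r, c in pixels:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                neighbour = (r + dr, c + dc)
                if (dr or dc) and neighbour in pixels:
                    # Diagonal steps cost sqrt(2), axis-aligned steps cost 1.
                    graph.add_edge((r, c), neighbour, weight=float(np.hypot(dr, dc)))
    return graph


def trace_vein(graph, landmark, junction):
    """Shortest weighted path from a landmark pixel to the petiolar junction."""
    return nx.shortest_path(graph, source=landmark, target=junction, weight="weight")


# Toy skeleton: an L-shaped vein.
skeleton = np.zeros((6, 6), dtype=bool)
skeleton[1, 1:5] = True   # horizontal run, (1, 1) to (1, 4)
skeleton[1:5, 4] = True   # vertical run, (1, 4) to (4, 4)
graph = skeleton_to_graph(skeleton)
path = trace_vein(graph, landmark=(1, 1), junction=(4, 4))
```

In practice a landmark rarely falls exactly on a skeleton pixel, so it would first need to be snapped to the nearest node; that step is left out here.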
0C_training_ect.py
Computes Euler Characteristic Transform (ECT) representations for training leaves. Binary leaf masks are converted into embedded graph representations, centered, normalized, and transformed into rotationally invariant topological descriptors across 360 angular directions. Polar ECT images are generated and subsequently projected back into the spatial coordinate system of the original leaf image using affine transformations. Intermediate quality-control visualizations verify geometric consistency between transformed ECT representations and original leaf outlines. Transformation metadata and affine matrices are saved to ensure complete reproducibility of all coordinate mappings.
0D_prepare_training_data.py
Constructs the final 10-class YOLO segmentation training dataset. Three-channel model inputs are assembled from grayscale leaf images, alpha masks, and transformed ECT representations. Color-coded feature masks are converted into polygon-based YOLO segmentation annotations corresponding to 10 biologically defined classes: mid, distal, and proximal lobes; mid, distal, and proximal primary veins; petiolar junction; and mid, distal, and proximal secondary veins. The dataset is automatically partitioned into train, validation, and test subsets, and visual montage panels are generated for quality assessment of model inputs and annotations.
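A minimal sketch of the two data products described above: the three-channel input and a polygon label line in the YOLO segmentation format (class index followed by x, y pairs normalized to the image size). The channel order, the helper names, and the class index 6 in the example are assumptions, not values taken from the script.

```python
import numpy as np


def compose_input(gray, alpha, ect):
    """Stack grayscale, alpha, and ECT planes (each H x W) into H x W x 3."""
    return np.dstack([gray, alpha, ect]).astype(np.uint8)


def yolo_seg_line(class_id, contour_xy, width, height):
    """Format one polygon as '<class> x1 y1 x2 y2 ...' with coordinates in [0, 1]."""
    pts = np.asarray(contour_xy, dtype=float) / [width, height]
    coords = " ".join(f"{value:.6f}" for value in pts.ravel())
    return f"{class_id} {coords}"


h = w = 4
image = compose_input(np.full((h, w), 10), np.full((h, w), 255), np.zeros((h, w)))
# Class index 6 is a placeholder; the real class order is set in data.yaml.
line = yolo_seg_line(6, [(0, 0), (2, 0), (2, 4)], width=w, height=h)
```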
1_train_yolo.py — Train model on unaligned leaves
Trains the 10-class segmentation model using the Ultralytics YOLO segmentation framework initialized from yolo26n-seg.pt. Training is performed at 1024 × 1024 resolution with extensive geometric augmentation including rotations, flips, and mosaic augmentation to improve robustness across leaf orientations and morphologies. Training supports checkpoint resumption, logging of metrics and visualizations, and exports best-performing model weights for downstream inference.
2_bulk_ect_inference.py — Euler Characteristic Transform Generation and Projection
Leaf masks were processed to generate Euler characteristic transform (ECT) representations that encoded global leaf topology and shape structure in a rotationally normalized space. For each input RGBA leaf image, the alpha channel was extracted and thresholded to isolate the leaf silhouette. Connected-component analysis was used to retain only the largest contiguous object, ensuring that artifacts or detached fragments were excluded from subsequent analysis.
External contours were extracted from the binary mask and converted into embedded graph representations using the ect library. Contour coordinates were centered at the origin, transformed into normalized graph coordinates, and isotropically scaled to a unit bounding radius. To maintain correspondence between the original image geometry and normalized ECT space, a robust affine transformation matrix was estimated using non-collinear point triplets sampled from the contour coordinates.
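To illustrate how an affine map can be recovered from point correspondences, the sketch below solves for the 2 x 3 matrix from a single non-collinear triplet. The robustness strategy of the actual script (how triplets are sampled and combined) is not described here, so this shows only the core linear solve; the function names and example coordinates are hypothetical.

```python
import numpy as np


def affine_from_triplet(src, dst):
    """Return the 2 x 3 matrix A with dst_i = A @ [x_i, y_i, 1] for three points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    homogeneous = np.column_stack([src, np.ones(3)])  # 3 x 3
    if abs(np.linalg.det(homogeneous)) < 1e-9:
        raise ValueError("source points are collinear")
    return np.linalg.solve(homogeneous, dst).T


def to_3x3(affine):
    """Append the homogeneous row so the matrix can be inverted."""
    return np.vstack([affine, [0.0, 0.0, 1.0]])


# Pixel coordinates and their normalized counterparts (centre (100, 50), scale 1/40).
src = [(100.0, 50.0), (140.0, 50.0), (100.0, 90.0)]
dst = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
forward = affine_from_triplet(src, dst)
inverse = np.linalg.inv(to_3x3(forward))
```

Storing both forward and inverse matrices, as the metadata files do, lets anything drawn in ECT space be mapped back to image pixels without refitting.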
ECTs were computed along 360 directions spaced uniformly around the full circle. Threshold values were sampled linearly across the normalized radial interval, producing dense radial topological summaries of each leaf outline. Resulting ECT matrices were rendered as polar-coordinate images, with grayscale outputs used for downstream inference and inferno-colored outputs generated for visualization and quality control.
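For readers unfamiliar with the transform, the following NumPy sketch computes an ECT for a closed contour treated as a cycle graph: for each direction and threshold, the Euler characteristic is the number of vertices at or below the threshold minus the number of edges whose endpoints are both at or below it. This is not the ect library's API; the function name, the centring on the vertex mean, the 64 thresholds, and the small tolerance eps are assumptions for illustration.

```python
import numpy as np


def contour_ect(points, num_dirs=360, num_thresh=64, radius=1.0, eps=1e-9):
    """ECT of a closed polygonal contour, returned as (num_dirs, num_thresh)."""
    pts = np.asarray(points, dtype=float)
    pts = pts - pts.mean(axis=0)                               # centre at the origin
    pts = pts * (radius / np.linalg.norm(pts, axis=1).max())   # unit bounding radius
    thetas = np.linspace(0.0, 2.0 * np.pi, num_dirs, endpoint=False)
    directions = np.column_stack([np.cos(thetas), np.sin(thetas)])
    thresholds = np.linspace(-radius, radius, num_thresh)
    vertex_h = directions @ pts.T                              # height of each vertex
    # An edge enters the filtration at the height of its higher endpoint.
    edge_h = np.maximum(vertex_h, np.roll(vertex_h, -1, axis=1))
    # eps keeps vertices lying exactly on the bounding circle in the last threshold.
    v_count = (vertex_h[:, :, None] <= thresholds + eps).sum(axis=1)
    e_count = (edge_h[:, :, None] <= thresholds + eps).sum(axis=1)
    return v_count - e_count


# Toy outline: 40 points on a circle of radius 3 centred at (10, 5).
angles = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
circle = np.column_stack([3 * np.cos(angles) + 10, 3 * np.sin(angles) + 5])
ect = contour_ect(circle)
```

For a convex outline every partial sublevel set is a single arc, so the values are 1 until the whole loop is included, at which point the Euler characteristic of the closed curve is 0. Lobed leaves produce values above 1 where several lobes enter the filtration separately.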
To validate geometric correspondence between original leaf outlines and transformed ECT space, transformed contour coordinates were projected into pixel space and overlaid onto the rendered ECT images. Inverse affine transformations were then applied to project ECT representations back into the coordinate system of the original leaf image. This enabled transformed ECTs, model predictions, and future anatomical annotations generated in ECT space to be accurately reprojected into the original full-resolution leaf image.
For each specimen, the pipeline generated original ECT images, transformed ECT images, projection validation overlays, and JSON metadata containing forward and inverse affine transformation matrices. These metadata files preserved the complete geometric mapping between original image coordinates and normalized ECT coordinates for downstream analyses.
3_bulk_inference.py — Unaligned 10-Class Semantic Segmentation Inference
A semantic segmentation pipeline based on the Ultralytics YOLO segmentation framework was used to predict anatomical leaf structures from unaligned leaf representations. Input images were constructed as three-channel composite tensors consisting of grayscale leaf intensity, alpha mask transparency, and transformed ECT representations. This combined local image texture, global shape geometry, and topological information within a unified input representation.
Inference was performed at a spatial resolution of 1024 × 1024 pixels using a low confidence threshold to preserve thin vascular structures and incomplete lobe boundaries. Predictions were generated for ten anatomical classes representing major lobes, primary veins, secondary veins, and the petiolar junction.
Predicted polygons were rasterized into color-coded segmentation masks using a fixed anatomical color palette. Raw segmentation outputs were saved as standalone masks and also blended with the original leaf imagery to produce overlay visualizations for qualitative inspection. The pipeline therefore generated both machine-readable anatomical predictions and interpretable visual outputs for validation.
4_bulk_alignment.py — Anatomically Guided Leaf Alignment
To reduce variation in orientation and positioning across leaves, anatomically guided alignment transformations were computed from predicted structural features. The alignment process used the petiolar junction as the rotational origin and the midvein orientation as the principal anatomical axis.
Predicted lobe masks were first converted into binary masks and cleaned using morphological erosion, connected-component analysis, and dilation operations to isolate the primary leaf structure. The centroid of the petiolar junction was identified from the predicted petiolar junction class mask, while midvein coordinates were extracted from the predicted mid-primary vein class.
Leaf orientation was estimated by computing the median angular direction between midvein pixels and the petiolar junction centroid. Rotation matrices were then generated such that the midvein aligned vertically across all leaves. After rotation, tight bounding boxes were computed from the aligned leaf masks, and leaves were cropped and isotropically scaled into standardized 1024 × 1024 canvases with uniform padding.
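The orientation estimate can be sketched as below, assuming image coordinates (x to the right, y downwards) and assuming the target orientation has the midvein pointing straight down from the petiolar junction. The function names are hypothetical, and taking the median relative to the circular mean is a choice made here to avoid the wrap at ±180 degrees; the script may handle that differently.

```python
import numpy as np


def midvein_angle(junction_xy, midvein_xy):
    """Median direction (radians) from the junction centroid to midvein pixels."""
    offsets = np.asarray(midvein_xy, dtype=float) - np.asarray(junction_xy, dtype=float)
    angles = np.arctan2(offsets[:, 1], offsets[:, 0])
    # Measure angles relative to their circular mean so the median is wrap-safe.
    ref = np.arctan2(np.sin(angles).mean(), np.cos(angles).mean())
    relative = np.angle(np.exp(1j * (angles - ref)))
    return ref + np.median(relative)


def rotation_to_downward(angle):
    """2 x 2 matrix rotating a vector at `angle` onto +y (down in image space)."""
    phi = np.pi / 2.0 - angle
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s], [s, c]])


junction = (50.0, 50.0)
# Toy midvein running up and to the right of the junction.
midvein = [(50.0 + t, 50.0 - t) for t in range(5, 40, 5)]
R = rotation_to_downward(midvein_angle(junction, midvein))
aligned = (np.asarray(midvein) - junction) @ R.T
```

After rotation the toy midvein offsets lie on the positive y axis, which is the condition the cropping and scaling steps would then build on.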
The same geometric transformation was simultaneously applied to RGBA leaf masks and transformed ECT images, ensuring that all representations remained spatially synchronized. The alignment procedure therefore produced anatomically standardized datasets with consistent orientation, scale, and positioning across specimens.
ALIGNED_10_class_predictions/
Using the aligned leaves and ECTs from the previous section, this code re-runs the 10-class prediction on aligned leaves. Removing rotational variance increases performance.
0_prepare_training_data.py — Preparation of Aligned 10-Class Training Data
Aligned training datasets were generated by integrating anatomically aligned leaf masks, ECT representations, blade feature masks, and vein feature masks into unified training samples suitable for semantic segmentation.
For each aligned leaf, grayscale blade intensity, alpha transparency, and aligned ECT representations were merged into three-channel composite inputs. Anatomical labels from blade and vein feature masks were converted into polygon-based YOLO segmentation annotations. Contours corresponding to each anatomical structure were extracted independently and normalized relative to image dimensions.
Alignment transformations were recalculated directly from the anatomical feature masks to ensure consistency between training inputs and anatomical annotations. The petiolar junction and midvein again served as the primary orientation landmarks, allowing all structures to be transformed into a common anatomical coordinate system.
Training labels were generated for all ten anatomical classes, including lobes, primary veins, secondary veins, and the petiolar junction. Validation images containing aligned overlays of anatomical structures were also produced to verify annotation quality and geometric consistency.
The complete dataset was randomly partitioned into training, validation, and testing subsets using fixed random seeds to ensure reproducibility across experiments.
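A seeded split of this kind might look like the sketch below. The 80/10/10 fractions, the seed value, and the function name are assumptions; the record does not state the values the script uses.

```python
import random


def split_dataset(names, fractions=(0.8, 0.1, 0.1), seed=42):
    """Deterministically partition sample names into train/val/test lists."""
    ordered = sorted(names)               # sort first so input order is irrelevant
    random.Random(seed).shuffle(ordered)  # private RNG, global state untouched
    n_train = int(len(ordered) * fractions[0])
    n_val = int(len(ordered) * fractions[1])
    return {
        "train": ordered[:n_train],
        "val": ordered[n_train:n_train + n_val],
        "test": ordered[n_train + n_val:],
    }


names = [f"leaf_{i:03d}" for i in range(100)]
splits = split_dataset(names)
```

Sorting before shuffling matters because directory listings are not guaranteed to come back in the same order on every machine.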
1_train_aligned_yolo.py — Training of the Aligned Semantic Segmentation Model
An aligned semantic segmentation model was trained using the Ultralytics YOLO segmentation architecture (yolo26n-seg.pt). Training was performed on anatomically standardized leaf representations generated from the alignment pipeline.
Training images consisted of three-channel composite representations containing grayscale blade information, alpha masks, and aligned ECT representations. Because leaves were already normalized into a consistent anatomical orientation, augmentation strategies were intentionally constrained to preserve biological realism.
Only mild rotational jitter (10 degrees) was applied during augmentation, while horizontal and vertical flipping were disabled to maintain directional anatomical consistency. Mosaic augmentation and low-level MixUp augmentation were retained to improve generalization without disrupting anatomical orientation.
Training proceeded for up to 500 epochs with early stopping based on validation performance. Model checkpoints, training statistics, and learning curves were automatically saved throughout the training process.
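The constrained augmentation described above maps onto standard Ultralytics training arguments roughly as follows. This is a sketch under stated assumptions, not the script's configuration: epochs, degrees, fliplr, and flipud follow the text, while the imgsz, patience, mosaic, and mixup values are guesses, as are the function names.

```python
def aligned_train_kwargs(data_yaml="data.yaml"):
    """Training arguments mirroring the augmentation choices described above."""
    return {
        "data": data_yaml,
        "imgsz": 1024,     # assumed: matches the 1024 x 1024 aligned canvases
        "epochs": 500,     # upper bound stated in the text
        "patience": 50,    # assumed early-stopping patience
        "degrees": 10.0,   # mild rotational jitter
        "fliplr": 0.0,     # horizontal flips disabled
        "flipud": 0.0,     # vertical flips disabled
        "mosaic": 1.0,     # assumed: mosaic retained
        "mixup": 0.1,      # assumed: "low-level" MixUp probability
    }


def train(weights="yolo26n-seg.pt"):
    """Launch training; requires the ultralytics package and the weights file."""
    from ultralytics import YOLO  # imported lazily so the dict can be inspected alone
    model = YOLO(weights)
    return model.train(**aligned_train_kwargs())


kwargs = aligned_train_kwargs()
```

Disabling flips is the key difference from the unaligned model: once every leaf points the same way, a flipped leaf would swap the anatomical meaning of its left and right sides.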
2_bulk_aligned_inference.py — Aligned Semantic Segmentation Inference
The aligned segmentation model was subsequently applied to anatomically standardized leaves and aligned ECT representations. Input tensors again consisted of grayscale blade intensity, alpha transparency, and aligned ECT information.
Inference outputs were generated as polygon-based semantic segmentations for all anatomical classes. To improve structural continuity and remove thin bridge artifacts between neighboring structures, predictions were processed independently on a per-class basis.
For each anatomical class, polygons were rasterized into temporary binary masks and cleaned using morphological opening operations with cross-shaped structuring elements. This approach selectively removed narrow spurious connections while preserving thin linear vascular structures such as secondary veins.
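The per-class cleaning step can be illustrated with a NumPy-only opening using a 3 x 3 cross. The usual route would be an image-processing library; this sketch avoids that dependency, pads with background at the image border, and assumes a label map in which 0 is background and 1 to 10 are the classes. All names are hypothetical.

```python
import numpy as np


def _neighbourhood(mask):
    """The mask plus its four axis-aligned shifts (zero padded): a 3 x 3 cross."""
    p = np.pad(mask, 1, constant_values=False)
    return [p[1:-1, 1:-1], p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]]


def open_cross(mask):
    """Morphological opening (erosion then dilation) with a 3 x 3 cross."""
    m = np.asarray(mask, dtype=bool)
    eroded = np.logical_and.reduce(_neighbourhood(m))
    return np.logical_or.reduce(_neighbourhood(eroded))


def clean_label_map(label_map, num_classes=10):
    """Open each class independently, then recombine into one label map."""
    cleaned = np.zeros_like(label_map)
    for class_id in range(1, num_classes + 1):
        cleaned[open_cross(label_map == class_id)] = class_id
    return cleaned


# Toy mask: two 5 x 5 blobs joined by a one-pixel-wide bridge.
mask = np.zeros((7, 15), dtype=bool)
mask[1:6, 1:6] = True
mask[1:6, 9:14] = True
mask[3, 6:9] = True
opened = open_cross(mask)
cleaned = clean_label_map(mask.astype(np.uint8) * 3)
```

Note that an opening with this element removes anything one pixel wide, so a vein survives only where it is thicker than the structuring element at the working resolution.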
Cleaned class masks were recombined into unified anatomical segmentation maps and saved as color-coded outputs. Overlay visualizations were additionally generated by blending predictions with aligned leaf imagery, allowing direct visual comparison between anatomical structures and original leaf morphology.
3_figures_and_tables.py — Automated Figure and Validation Metric Generation
Automated pipelines were developed to generate publication-quality figures and validation summaries directly from model outputs and training logs.
Composite figure panels were assembled from matched image sets representing original image features, ECT representations, unaligned segmentation predictions, and aligned segmentation predictions. Image subsets were reproducibly sampled using fixed random seeds to ensure consistency across figure generation runs. Landmark annotations were optionally overlaid onto image panels when corresponding landmark metadata were available.
Model training performance was summarized using line plots generated from YOLO training logs. Precision, recall, mAP50, and mAP50-95 metrics were extracted separately for bounding-box and segmentation-mask performance. Comparative plots were generated for blade-only detection models, unaligned 10-class segmentation models, and aligned 10-class segmentation models.
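Extracting those curves from a YOLO results.csv can be sketched with the standard library alone. The column names follow the Ultralytics convention of a "(B)" suffix for boxes and "(M)" for masks; the function name is hypothetical, and the numbers in the sample are synthetic placeholders, not results from this work.

```python
import csv
import io

METRICS = ("precision", "recall", "mAP50", "mAP50-95")


def read_metrics(csv_text, kind="M"):
    """Per-epoch metric lists for boxes (kind='B') or masks (kind='M')."""
    rows = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    history = {name: [] for name in METRICS}
    for row in rows:
        # Some versions pad header names with spaces, so strip the keys.
        clean = {key.strip(): value for key, value in row.items()}
        for name in METRICS:
            history[name].append(float(clean[f"metrics/{name}({kind})"]))
    return history


# Synthetic two-epoch log for demonstration only.
sample = """epoch,metrics/precision(B),metrics/recall(B),metrics/mAP50(B),metrics/mAP50-95(B),metrics/precision(M),metrics/recall(M),metrics/mAP50(M),metrics/mAP50-95(M)
1,0.50,0.40,0.45,0.20,0.48,0.38,0.42,0.18
2,0.60,0.55,0.58,0.30,0.57,0.52,0.55,0.27
"""
mask_history = read_metrics(sample, kind="M")
box_history = read_metrics(sample, kind="B")
```

The resulting lists can be passed straight to a plotting call, one line per model, to build the comparative plots.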
Validation summaries were additionally parsed from YOLO-generated text reports and converted into structured metric tables containing class-specific segmentation statistics. Anatomical classes were reordered into biologically interpretable structural groupings to facilitate comparison between aligned and unaligned models.
Provider:
Zenodo
Created:
2026-05-16



