
gyoeng/IDC-h5patches-radiomics

Source: Hugging Face · Updated 2026-04-15 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/gyoeng/IDC-h5patches-radiomics
Resource description:
# IDC H5 Patches + Radiomics + Cell Segmentation Dataset

This dataset provides multi-modal features extracted from the HEST benchmark (IDC subset), including:

* Patch-level H&E image data (H5 format)
* Radiomics features
* Cell segmentation outputs (CellViT)
* Visualization assets for demo and analysis

---

## 📂 Dataset Structure

```
.
├── h5patches/
├── radiomics/
└── cellsegments/
```

### 1. `h5patches/`

Whole-slide image patches stored in HDF5 format.

* `*.h5`
* Each file contains patch-level image data extracted from WSI

---

### 2. `radiomics/`

Radiomics features computed per patch.

For each sample:

```
radiomics/{sample}_features/
├── *_radiomics_features.parquet
├── *_radiomics_features_processed.parquet
├── *_radiomics_features_processed_stats.csv
├── patches_color.tar.gz
├── patches_gray.tar.gz
└── patches_mask.tar.gz
```

* `*.parquet` → main feature tables (recommended for use)
* `*_stats.csv` → feature statistics summary
* `patches_*.tar.gz` → visualization patches (color / grayscale / mask)

---

### 3. `cellsegments/`

Cell-level segmentation outputs generated using CellViT.

For each sample:

```
cellsegments/{sample}_seg/
├── h5_patch_cellvit_seg.parquet
├── h5_patch_cellvit_seg.geojson
├── h5_patch_cellvit_seg_meta.csv
├── overlay.tar.gz
└── summary.json
```

* `*.parquet` → structured segmentation data (recommended)
* `*.geojson` → polygon-based segmentation annotations
* `*_meta.csv` → metadata per patch
* `overlay.tar.gz` → visualization overlays (for demo/UI)
* `summary.json` → dataset summary

---

## 🚀 Usage

### Load radiomics features

```python
import pandas as pd

df = pd.read_parquet("radiomics/NCBI783_features/NCBI783_radiomics_features.parquet")
```

### Load segmentation data

```python
seg = pd.read_parquet("cellsegments/NCBI783_seg/h5_patch_cellvit_seg.parquet")
```

### Extract visualization patches

```bash
tar -xzf patches_color.tar.gz
tar -xzf overlay.tar.gz
```

---

## ⚠️ Notes

* Large files (`.h5`, `.parquet`, `.tar.gz`) are stored using **Git LFS**
* It is recommended to use `.parquet` instead of `.csv` for efficient loading
* Visualization assets are provided for demo purposes (e.g., Hugging Face Spaces)

---

## 📜 License

This dataset uses data derived from the HEST dataset.

Original dataset: https://huggingface.co/datasets/MahmoodLab/hest

License: CC BY-NC-SA 4.0

This work is also distributed under the same license.

---

## 🙏 Acknowledgements

* Mahmood Lab for the HEST dataset
* CellViT for cell segmentation
* PyRadiomics for feature extraction
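
---

## Appendix: locating per-sample files (sketch)

The directory layout and usage snippets above can be wrapped in a few small helpers. The following is a minimal, standard-library-only sketch and not part of the dataset itself. The function names (`sample_paths`, `extract_archive`, `load_json`) are hypothetical, and the code assumes that the `*` prefix in the radiomics file names equals the sample ID, as in the `NCBI783` example. The schema of `summary.json` is not documented here, so it is simply returned as parsed JSON.

```python
import json
import tarfile
from pathlib import Path


def sample_paths(root, sample):
    """Return the per-sample paths listed in the Dataset Structure section.

    Assumption: the `*` prefix of the radiomics files is the sample ID,
    as in the NCBI783 usage example.
    """
    root = Path(root)
    rad = root / "radiomics" / f"{sample}_features"
    seg = root / "cellsegments" / f"{sample}_seg"
    return {
        "radiomics": rad / f"{sample}_radiomics_features.parquet",
        "radiomics_processed": rad / f"{sample}_radiomics_features_processed.parquet",
        "radiomics_stats": rad / f"{sample}_radiomics_features_processed_stats.csv",
        "patches_color": rad / "patches_color.tar.gz",
        "patches_gray": rad / "patches_gray.tar.gz",
        "patches_mask": rad / "patches_mask.tar.gz",
        "seg_parquet": seg / "h5_patch_cellvit_seg.parquet",
        "seg_geojson": seg / "h5_patch_cellvit_seg.geojson",
        "seg_meta": seg / "h5_patch_cellvit_seg_meta.csv",
        "overlay": seg / "overlay.tar.gz",
        "summary": seg / "summary.json",
    }


def extract_archive(archive, dest):
    """Extract a .tar.gz archive into dest and return the extracted files."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter when this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest, **kwargs)
    return sorted(p for p in dest.rglob("*") if p.is_file())


def load_json(path):
    """Parse a JSON file such as summary.json; its schema is not assumed."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Report which expected files are present in a local copy of the dataset.
    for name, path in sample_paths(".", "NCBI783").items():
        print(f"{name}: {path} (exists: {path.exists()})")
```

The returned paths can be passed directly to `pd.read_parquet`, and `extract_archive` is a Python counterpart to the `tar -xzf` commands shown in the Usage section.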
Provided by:
gyoeng