Preliminary Mitosis Detection Results for TCGA-BRCA Dataset
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10245706
Description:
This dataset provides mitosis detection results obtained by running the "Mitosis Detection, Fast and Slow" (MDFS) algorithm (arXiv:2208.12587, "Mitosis Detection, Fast and Slow: Robust and Efficient Detection of Mitotic Figures") on the TCGA-BRCA dataset.
The MDFS algorithm is a robust and efficient two-stage approach to mitosis detection: candidate mitotic figures are first identified and then refined. The candidate-detection model, EUNet, is fast and accurate, largely owing to its architectural design. EUNet outlines candidate regions at a lower resolution, which significantly speeds up detection. In the second stage, the candidates are refined by a more complex classifier network, EfficientNet-B7. MDFS was originally developed for the MIDOG challenges.
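The two-stage flow described above can be illustrated with a minimal sketch. This is not the MDFS implementation: `coarse_map` is a hypothetical stand-in for a low-resolution candidate map such as EUNet might produce, `classify` is a hypothetical stand-in for the second-stage classifier, and `scale`, `candidate_cutoff`, and the use of `>=` at the threshold are assumptions made only for illustration.

```python
def detect_two_stage(coarse_map, scale, candidate_cutoff, classify,
                     mitosis_threshold=0.5):
    """Toy two-stage detection: coarse candidates first, then refinement.

    coarse_map: 2D list of scores on a downsampled grid (hypothetical input).
    scale: assumed downsampling factor between the grid and full resolution.
    classify: callable (x, y) -> score, standing in for the refinement network.
    """
    # Stage 1: keep coarse cells above the cutoff and map each cell centre
    # back to full-resolution pixel coordinates.
    candidates = []
    for row, line in enumerate(coarse_map):
        for col, value in enumerate(line):
            if value >= candidate_cutoff:
                x = col * scale + scale // 2
                y = row * scale + scale // 2
                candidates.append((x, y))

    # Stage 2: score every candidate and split at the threshold.
    mitoses, mimickers = [], []
    for x, y in candidates:
        score = classify(x, y)
        target = mitoses if score >= mitosis_threshold else mimickers
        target.append((x, y, score))
    return mitoses, mimickers
```

The point of the design is that the expensive classifier only runs on the few locations that survive the cheap low-resolution pass.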
Viewing in QuPath
The dataset comprises GeoJSON files in two categories: mitosis and proxy (mimicker: candidates that are unlikely to be mitoses according to our algorithm). Each category can be opened and visualized as an overlay on the whole slide image (WSI) in QuPath by dragging and dropping the annotation file onto the opened image. Additionally, users can employ the provided Python snippet to read the annotations into a Python dictionary, a NumPy array, and a pandas DataFrame.
Loading in Python
To load the GeoJSON files in Python, users can use the following code:
import json

import numpy as np
import pandas as pd


def load_geojson(filename):
    # Load the GeoJSON file
    with open(filename, 'r') as f:
        data = json.load(f)

    # Slide-level properties are stored at the top level of the file
    slide_properties = data["properties"]

    # Build one (x, y, score) row per detection; the reshape keeps the array
    # two-dimensional even when a file contains no features
    points_np = np.array(
        [(feat['geometry']['coordinates'][0],
          feat['geometry']['coordinates'][1],
          feat['properties']['score'])
         for feat in data['features']],
        dtype=float,
    ).reshape(-1, 3)

    # Convert the points to a pandas DataFrame
    points_df = pd.DataFrame(points_np, columns=['x', 'y', 'score'])
    return slide_properties, points_np, points_df


# Load the mitosis data
mitosis_properties, mitosis_points_np, mitosis_points_df = load_geojson('mitosis.geojson')
# Load the mimicker data
mimickers_properties, mimickers_points_np, mimickers_points_df = load_geojson('mimickers.geojson')
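To make the file layout that the loader relies on explicit, the following standard-library sketch writes a tiny file in that layout and reads it back. The layout is inferred from the loader above (a top-level "properties" object plus point features carrying a "score"). The slide ID, coordinates, and scores are made-up illustration values, and `write_example_geojson` and `read_points` are hypothetical helper names.

```python
import json
import os
import tempfile


def write_example_geojson(path):
    """Write a tiny file in the layout the loader expects (values are made up)."""
    doc = {
        "type": "FeatureCollection",
        "properties": {"slide_id": "EXAMPLE-SLIDE", "mitosis_threshold": 0.5},
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [x, y]},
                "properties": {"score": score},
            }
            for x, y, score in [(100, 200, 0.91), (350, 80, 0.62), (40, 500, 0.55)]
        ],
    }
    with open(path, "w") as f:
        json.dump(doc, f)


def read_points(path):
    """Return (slide properties, list of (x, y, score) tuples) using json only."""
    with open(path) as f:
        data = json.load(f)
    points = [
        (*feat["geometry"]["coordinates"][:2], feat["properties"]["score"])
        for feat in data["features"]
    ]
    return data["properties"], points
```

Writing such a file to a temporary directory and loading it is a quick way to check a reader before running it on the real annotation files.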
Properties
For each WSI, the dataset records every candidate's centroid and bounding box, together with the hotspot location, hotspot mitotic count, and hotspot mitotic score. The structures of the mitosis and mimicker property dictionaries are as follows:
Mitosis property dictionary structure:
mitosis_properties = {
    'slide_id': slide_id,
    'slide_height': img_h,
    'slide_width': img_w,
    'wsi_mitosis_count': num_mitosis,
    'mitosis_threshold': 0.5,
    'hotspot_rect': {'x1': hotspot[0], 'y1': hotspot[1], 'x2': hotspot[2], 'y2': hotspot[3]},
    'hotspot_mitosis_count': mitosis_count,
    'hotspot_mitosis_score': mitosis_score,
}
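As a usage sketch for the `hotspot_rect` entry, the helper below counts detections whose coordinates fall inside the rectangle. It assumes that `hotspot_rect` is expressed in the same pixel coordinate system as the point coordinates and that the edges are inclusive; neither is stated in the description, and `count_in_hotspot` is a hypothetical name. Whether the result reproduces `hotspot_mitosis_count` should be checked against the data rather than assumed.

```python
def count_in_hotspot(points, hotspot_rect):
    """Count (x, y, score) points that lie inside hotspot_rect, edges inclusive."""
    # Sort each pair so the test works regardless of corner ordering.
    x_lo, x_hi = sorted((hotspot_rect["x1"], hotspot_rect["x2"]))
    y_lo, y_hi = sorted((hotspot_rect["y1"], hotspot_rect["y2"]))
    return sum(
        1
        for x, y, *_ in points
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi
    )
```

The function accepts any iterable of rows whose first two entries are x and y, so it works on a plain list of tuples as well as on the rows of a NumPy array.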
Proxy figure (mimicker) property dictionary structure:
mimicker_properties = {
    'slide_id': slide_id,
    'slide_height': img_h,
    'slide_width': img_w,
    'wsi_mimicker_count': num_mimicker,
    'mitosis_threshold': 0.5,
}
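Because the two dictionaries share several keys, a simple sanity check before combining a mitosis file with a mimicker file is to confirm that both describe the same slide. The sketch below is one way to do this; `check_slide_pair` and `total_candidates` are hypothetical helper names, and the idea that the two counts sum to the total number of candidates on a slide is an assumption drawn from the two-category description, not a documented guarantee.

```python
SHARED_KEYS = ("slide_id", "slide_height", "slide_width", "mitosis_threshold")


def check_slide_pair(mitosis_props, mimicker_props):
    """Return the shared keys whose values differ between the two dictionaries."""
    return [
        key
        for key in SHARED_KEYS
        if mitosis_props.get(key) != mimicker_props.get(key)
    ]


def total_candidates(mitosis_props, mimicker_props):
    """Sum the per-slide counts from both files (assumed to cover all candidates)."""
    return mitosis_props["wsi_mitosis_count"] + mimicker_props["wsi_mimicker_count"]
```

An empty list from `check_slide_pair` means the shared fields agree; any returned key points to a file pairing that deserves a closer look.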
Disclaimer:
We did not comprehensively review all mitotic figures within each WSI, and we do not claim that the detections are free of errors. Nonetheless, a pathologist examined the resulting hotspot regions of interest from 757 WSIs within the TCGA-BRCA Mitosis Dataset, and we found a strong correlation between pathologist and MDFS mitotic counts (r = 0.8, p < 0.001). Furthermore, MDFS-derived mitosis scores have been shown to be as prognostic as pathologist-assigned mitosis scores [1]. The examination also served to verify the quality of the hotspot selections, confirming that they were not driven primarily by excessive false detections or artifacts and that they lay in plausible locations within the tumor landscape.
[1] Ibrahim, Asmaa, et al. "Artificial Intelligence-Based Mitosis Scoring in Breast Cancer: Clinical Application." Modern Pathology 37.3 (2024): 100416.
Created:
2024-02-21



