Preliminary Mitosis Detection Results for TCGA-BRCA Dataset

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10245706
Description:
This dataset provides mitosis detection results obtained by applying the "Mitosis Detection, Fast and Slow" (MDFS) algorithm (arXiv:2208.12587, "Mitosis Detection, Fast and Slow: Robust and Efficient Detection of Mitotic Figures") to the TCGA-BRCA dataset.

MDFS is a robust and efficient two-stage process for mitosis detection: candidate mitotic figures are first identified and then refined. The candidate-detection model, EUNet, is fast and accurate largely because of its structural design: it outlines candidate areas at a lower resolution, which significantly speeds up detection. In the second stage, the candidates are refined by a more complex classifier network, EfficientNet-B7. MDFS was originally developed for the MIDOG challenges.

Viewing in QuPath

The dataset comprises GeoJSON files in two categories: mitosis and proxy (mimicker, i.e. candidates that are unlikely to be mitoses according to our algorithm). Each category can be opened and visualized as an overlay on the Whole Slide Image (WSI) in QuPath: simply drag and drop the annotation file onto the opened image. Users can also employ the provided Python snippet to read the annotations into a Python dictionary or a NumPy array.
Loading in Python

To load the GeoJSON files in Python, users can use the following code:

    import json

    import numpy as np
    import pandas as pd


    def load_geojson(filename):
        """Return the slide properties plus the points as a NumPy array and a DataFrame."""
        # Load the GeoJSON file
        with open(filename, 'r') as f:
            data = json.load(f)

        # Extract the slide-level properties and store them in a dictionary
        slide_properties = data["properties"]

        # Convert the points to an (N, 3) NumPy array of x, y, score;
        # the reshape keeps the shape valid when a file contains no features
        points_np = np.array(
            [(feat['geometry']['coordinates'][0],
              feat['geometry']['coordinates'][1],
              feat['properties']['score'])
             for feat in data['features']],
            dtype=float,
        ).reshape(-1, 3)

        # Convert the points to a pandas DataFrame
        points_df = pd.DataFrame(points_np, columns=['x', 'y', 'score'])

        return slide_properties, points_np, points_df


    # Load the mitosis data
    mitosis_properties, mitosis_points_np, mitosis_points_df = load_geojson('mitosis.geojson')

    # Load the mimicker data
    mimickers_properties, mimickers_points_np, mimickers_points_df = load_geojson('mimickers.geojson')

Properties

For each WSI, the dataset includes each candidate's centroid and bounding box, together with the hotspot location, hotspot mitotic count, and hotspot mitotic score.
The structures of the mitosis and mimicker property dictionaries are as follows.

Mitosis property dictionary structure:

    mitosis_properties = {
        'slide_id': slide_id,
        'slide_height': img_h,
        'slide_width': img_w,
        'wsi_mitosis_count': num_mitosis,
        'mitosis_threshold': 0.5,
        'hotspot_rect': {'x1': hotspot[0], 'y1': hotspot[1], 'x2': hotspot[2], 'y2': hotspot[3]},
        'hotspot_mitosis_count': mitosis_count,
        'hotspot_mitosis_score': mitosis_score,
    }

Proxy figure (mimicker) property dictionary structure:

    mimicker_properties = {
        'slide_id': slide_id,
        'slide_height': img_h,
        'slide_width': img_w,
        'wsi_mimicker_count': num_mimicker,
        'mitosis_threshold': 0.5,
    }

Disclaimer: We did not conduct a comprehensive review of all mitotic figures within each WSI, and we do not claim that these results are free of errors. Nonetheless, a pathologist examined the resulting hotspot regions of interest from 757 WSIs within the TCGA-BRCA Mitosis Dataset, and we found a strong correlation between pathologist and MDFS mitotic counts (r = 0.8, p < 0.001). Furthermore, MDFS-derived mitosis scores have been shown to be as prognostic as pathologist-assigned mitosis scores [1]. The examination also served to verify the quality of the hotspot selections, ensuring that they were not primarily driven by excessive false detections or artifacts and that they lay in plausible locations within the tumor landscape.

[1] Ibrahim, Asmaa, et al. "Artificial Intelligence-Based Mitosis Scoring in Breast Cancer: Clinical Application." Modern Pathology 37.3 (2024): 100416.
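
As a sketch of how the point data and the 'hotspot_rect' property might be used together, the following counts the detections that fall inside the hotspot rectangle. The helper name count_in_hotspot and the example values are hypothetical and not part of the dataset. The sketch assumes that the point coordinates and 'hotspot_rect' share the same pixel coordinate frame, that x1 <= x2 and y1 <= y2, that the rectangle bounds are inclusive, and that a detection counts when its score is at least 'mitosis_threshold'. Whether such a recount reproduces the stored 'hotspot_mitosis_count' is not stated in the source.

```python
def count_in_hotspot(points, hotspot_rect, threshold=0.5):
    """Count (x, y, score) rows inside hotspot_rect with score >= threshold.

    `points` may be the NumPy array returned by a loader or any iterable
    of 3-element rows. Bounds are treated as inclusive (an assumption).
    """
    x1, y1 = hotspot_rect["x1"], hotspot_rect["y1"]
    x2, y2 = hotspot_rect["x2"], hotspot_rect["y2"]
    count = 0
    for x, y, score in points:
        if x1 <= x <= x2 and y1 <= y <= y2 and score >= threshold:
            count += 1
    return count


# Synthetic values for illustration only; these are not taken from the dataset.
example_properties = {
    "hotspot_rect": {"x1": 100, "y1": 100, "x2": 200, "y2": 200},
    "mitosis_threshold": 0.5,
}
example_points = [
    (150.0, 150.0, 0.90),  # inside the rectangle, above the threshold
    (120.0, 180.0, 0.60),  # inside the rectangle, above the threshold
    (250.0, 150.0, 0.95),  # outside the rectangle
    (110.0, 110.0, 0.30),  # inside the rectangle, below the threshold
]

hotspot_count = count_in_hotspot(
    example_points,
    example_properties["hotspot_rect"],
    example_properties["mitosis_threshold"],
)
print(hotspot_count)
```

With real data, the same call would take the points array and the properties dictionary loaded from a mitosis GeoJSON file in place of the synthetic values.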
Created:
2024-02-21