SegPC-2021: Segmentation of Multiple Myeloma Plasma Cells in Microscopic Images
Source: ieee-dataport.org (indexed 2025-03-25)
Download link:
https://ieee-dataport.org/open-access/segpc-2021-segmentation-multiple-myeloma-plasma-cells-microscopic-images
Resource description:
Recently, efforts have been underway to build computer-assisted diagnostic tools for cancer diagnosis via image processing. Such tools require image capture, stain color normalization, segmentation of the cells of interest, and classification to count malignant versus healthy cells. This dataset targets robust cell segmentation, which is the first stage in building such a tool for the plasma cell cancer Multiple Myeloma (MM), a type of blood cancer. The images are provided after stain color normalization. Plasma cell segmentation in MM is challenging for several reasons: 1) The amount of nucleus and cytoplasm varies from one cell to another. 2) The cells may appear in clusters or as isolated single cells. 3) Cells appearing in clusters fall into three cases: (a) the cytoplasm of two cells touch each other, (b) the cytoplasm of one cell and the nucleus of another touch each other, or (c) the nuclei of the cells touch each other. Since the cytoplasm and nucleus have different colors, each case may pose its own segmentation challenges. 4) Multiple cells may touch each other within a cluster. 5) There may be unstained cells, for example a red blood cell underneath the cell of interest, that change its color and shade. 6) The cytoplasm of a cell may be close in color to the background of the whole image, making it difficult to identify the cell boundary and segment the cell. Together, these factors make the problem both challenging and interesting. This dataset is an effort towards building an automated pipeline for cancer detection in Multiple Myeloma. The current dataset has 775 images, including the images mentioned above, captured from two cameras so that researchers can build methods that are invariant to the camera used. In the annotations, nucleus and cytoplasm are marked separately, unlike the previous dataset, in which only complete cells were marked.
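Because nucleus and cytoplasm are annotated separately, a whole-cell mask comparable to the earlier whole-cell annotation can be recovered as the union of the two regions. The following is a minimal sketch under stated assumptions, not the dataset's own tooling: the label values `BACKGROUND`, `CYTOPLASM`, and `NUCLEUS` and the helper names are hypothetical placeholders, and the actual mask encoding should be taken from the readme file.

```python
# Hypothetical label values for illustration only; check the readme for the
# encoding actually used in the ground-truth masks.
BACKGROUND, CYTOPLASM, NUCLEUS = 0, 1, 2


def whole_cell_mask(label_rows):
    """Return a binary mask that is 1 wherever a pixel is nucleus or cytoplasm."""
    return [[1 if px in (CYTOPLASM, NUCLEUS) else 0 for px in row]
            for row in label_rows]


def nucleus_fraction(label_rows):
    """Fraction of cell pixels labeled nucleus; 0.0 if the mask has no cell."""
    flat = [px for row in label_rows for px in row]
    cell = sum(px in (CYTOPLASM, NUCLEUS) for px in flat)
    return sum(px == NUCLEUS for px in flat) / cell if cell else 0.0
```

The nucleus fraction is one simple way to quantify the first challenge listed above, the varying amount of nucleus and cytoplasm from cell to cell.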
Interested researchers can propose deep learning based or advanced machine learning based solutions for plasma cell segmentation using this dataset.
The data were collected from subjects suffering from Multiple Myeloma (MM) who came with symptoms of cancer for diagnosis and/or who are under treatment at AIIMS, New Delhi, India. Microscopic images were captured from bone marrow aspirate slides of patients diagnosed with MM. MM is a cancer of white blood cells in which the plasma cells are involved. Slides were stained using Jenner-Giemsa stain, and the plasma cells are to be segmented. Images were captured in raw BMP format using two cameras: 1) at a size of 2040x1536 pixels using cellSens software Version 2.1 (Olympus) attached to the microscope, and 2) at a size of 1920x2560 pixels from a Nikon camera attached to the microscope. All 775 images were stain color normalized using our in-house methodology. They are divided into 1) a training set of 298 images, 2) a validation set of 200 images, and 3) a test set of 277 images. The dataset was used in the IEEE ISBI 2021 medical image challenge. The leaderboard of the challenge is active. The ground truth (GT) is provided for the training and validation sets, while the GT of the test set will not be shared. Researchers can check performance on the test set by uploading results to the leaderboard at https://segpc-2021.grand-challenge.org/evaluation/final-test-phase/leaderboard/. Full details are available in the readme file.
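Since the images come from two cameras with different resolutions, it can be useful to group files by source camera before training or evaluation. The sketch below is one possible way to do that using only the BMP header. It assumes the files use the common BITMAPINFOHEADER-style layout (32-bit width and height at byte offsets 18 and 22). The camera labels and function names are hypothetical. The description does not say which of the quoted numbers is width and which is height, so sizes are matched without regard to order.

```python
import struct

# Resolutions quoted in the description; orientation is not specified there,
# so each size is stored as an unordered pair. Labels are hypothetical names.
CAMERA_SIZES = {
    frozenset((2040, 1536)): "olympus_cellsens",
    frozenset((1920, 2560)): "nikon",
}

# Split sizes as stated in the description.
SPLITS = {"train": 298, "validation": 200, "test": 277}


def bmp_size(path):
    """Read (width, height) from a BMP file with a BITMAPINFOHEADER-style header."""
    with open(path, "rb") as f:
        header = f.read(26)
    if len(header) < 26 or header[:2] != b"BM":
        raise ValueError(f"{path} is not a BMP file")
    width, height = struct.unpack_from("<ii", header, 18)
    return width, abs(height)  # height is negative for top-down bitmaps


def camera_for(path):
    """Guess the source camera from image dimensions; 'unknown' if no match."""
    return CAMERA_SIZES.get(frozenset(bmp_size(path)), "unknown")


# The three splits account for all 775 images.
assert sum(SPLITS.values()) == 775
```

Calling `camera_for` on each file gives a per-camera grouping that can be used to check whether a method's performance holds across both cameras.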
Provider:
ieee-dataport.org



