VMXi Classification Dataset: Micrographs of Protein Crystallisation Experiments with Labels of Experimental Outcomes

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/11097395

Description:

The VMXi Classification Dataset consists of images of protein crystallisation experiments collected on a Rock Imager 1000 (Formulatrix, USA) automated microplate imager at the VMXi experimental facility at Diamond Light Source, UK. These images were used to train the CHiMP (Crystal Hits in My Plate) Classifier network, which classifies images into categories of experimental outcome.

The directory named "VMXi_Classification_Images" contains 18,782 JPEG images with a resolution of 3376 × 2704 pixels. Of these, 13,951 images are associated with a label describing the experimental outcome depicted in the image. The labels are Clear, Crystals, Precipitate or Other. The file "VMXi_Classification_Train.csv" contains filenames and labels for the 11,161 images in the training set used for the CHiMP Classifier network. The file "VMXi_Classification_Validation.csv" contains filenames and labels for the 2,790 images in the validation set used for the CHiMP Classifier network. Together, the training and validation sets account for all 13,951 labelled images.

In addition, an independent test set of images is included in the directory named "VMXI_Classification_Test_Dataset". Within this directory, the subdirectory named "VMXi_Classification_Test_Images" contains 1,000 JPEG images with a resolution of 3376 × 2704 pixels. Each image is associated with a label describing the experimental outcome depicted in the image. The labels are Clear, Crystals, Precipitate or Other. The file "unambiguous_test_dataset.csv" contains filenames and labels for the 632 images in the test set where three experts independently agreed on a label. The file "mostly_unambiguous_test_dataset.csv" contains filenames and labels for the 949 images in the test set where at least two experts independently agreed on a label. The file "original_expert_labels.csv" contains filenames for all 1,000 images and the labels given independently by three experts.

In "original_expert_labels.csv", the column headed "expert_1_1" contains the labels given by expert 1 at a first timepoint, 6 months before the same expert categorised the images again; the labels from that second pass are given in the column "expert_1_2". The columns "expert_2" and "expert_3" contain the labels given by experts 2 and 3 respectively. The unambiguous and mostly unambiguous test sets were created from the categories chosen in "expert_1_2", "expert_2" and "expert_3".
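The agreement rule described for the two test subsets can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the description, not the authors' own script. The expert column names are taken from the description, but the name of the filename column (`filename` here) and the small in-memory sample are assumptions made for the example.

```python
import csv
import io
from collections import Counter

# Columns named in the dataset description as the basis for both subsets.
EXPERT_COLUMNS = ("expert_1_2", "expert_2", "expert_3")


def consensus_label(row, min_votes):
    """Return the label chosen by at least `min_votes` experts, else None."""
    votes = Counter(row[col].strip() for col in EXPERT_COLUMNS)
    label, count = votes.most_common(1)[0]
    return label if count >= min_votes else None


def split_by_agreement(csv_file, filename_column="filename"):
    """Build the unanimous and majority subsets from an open CSV file.

    `filename_column` is an assumed header name; adjust it to match the
    actual header of original_expert_labels.csv.
    """
    unanimous, majority = {}, {}
    for row in csv.DictReader(csv_file):
        name = row[filename_column]
        label = consensus_label(row, min_votes=2)
        if label is None:
            continue  # all three experts disagreed
        majority[name] = label
        if consensus_label(row, min_votes=3) is not None:
            unanimous[name] = label
    return unanimous, majority


# Hypothetical sample rows, not taken from the real dataset.
SAMPLE_CSV = """filename,expert_1_1,expert_1_2,expert_2,expert_3
a.jpg,Clear,Clear,Clear,Clear
b.jpg,Crystals,Precipitate,Precipitate,Other
c.jpg,Other,Clear,Crystals,Precipitate
"""

unanimous, majority = split_by_agreement(io.StringIO(SAMPLE_CSV))
print(len(unanimous), len(majority))
```

To apply the same sketch to the real file, pass `open("original_expert_labels.csv", newline="")` in place of the in-memory sample. If the file follows the description, the two subsets should contain 632 and 949 images respectively, which gives a quick consistency check against the two published CSV files.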

Created:

2024-09-09
