Pixel-wise Gaussian AD visualization (ResNet18, MVTecAD)
Mendeley Data, updated 2024-05-10, indexed 2024-06-28
Download link:
https://zenodo.org/records/7937978
Description:
A multivariate Gaussian model is fitted to the pixel-wise feature vectors [1] (pooled over all pixel positions) extracted from a pre-trained convolutional neural network, using the maximum likelihood estimator (empirical mean and empirical covariance matrix). From the eigendecomposition of the empirical covariance matrix, a whitening transformation is applied to the feature vectors, which yields a new vector with the same number of "components" (cf. [2]).

This dataset contains visualizations of the component maps (cf. [2]) for all the categories in the MVTec-AD dataset. All the layer blocks [3] of a ResNet18 [4] are used as feature map extractors for the model described above. A separate model is fitted for each combination of category and layer.

Each visualization is composed of heatmaps of the squared component maps; their sum (the squared Mahalanobis distance of each pixel, which yields an anomaly score map); and, for images containing defects (a.k.a. "anomalies"), the ground truth annotation provided in the MVTec-AD dataset. For more details, refer to our paper.

---

[1] Pixel-wise feature vectors: given a stacked feature map of a 2D image (i.e. multiple 2D images, or "channels," containing information about the 2D input image), extract the values of all channels at a single pixel position.

[2] The axes of the whitened [feature] vectors are referred to as "components" instead of "features" as a reminder that they come from a projection of the feature vectors along the eigenvectors, or eigen-components, of the covariance matrix. As the whitened vectors lie on the same 2D grid as their original feature maps, we refer to them as "component maps."

[3] "layer1", "layer2", "layer3", "layer4".

[4] From torchvision version "0.15.2", model weights "ResNet18_Weights.IMAGENET1K_V1".

TODO ref to paper
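The procedure described above (maximum likelihood fit, whitening from the eigendecomposition, squared component maps, and their sum) can be sketched in NumPy as follows. This is only an illustrative sketch and not the authors' code: the function names are hypothetical, the feature maps are assumed to be already extracted as arrays (the ResNet18 extraction step is not shown), and the small `eps` clamp on the eigenvalues is an added assumption to avoid division by zero.

```python
import numpy as np


def fit_gaussian(features):
    """Fit the Gaussian on feature maps of shape (N, C, H, W).

    All pixel positions of all images are pooled, so every pixel
    contributes one C-dimensional feature vector.
    """
    c = features.shape[1]
    vectors = features.transpose(0, 2, 3, 1).reshape(-1, c)
    mean = vectors.mean(axis=0)
    centered = vectors - mean
    # Maximum likelihood estimate: divide by the sample count, not count - 1.
    cov = centered.T @ centered / vectors.shape[0]
    return mean, cov


def whitening_matrix(cov, eps=1e-12):
    """Return V @ diag(1 / sqrt(lambda)) from the eigendecomposition of cov."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eps is an assumption of this sketch, guarding against zero eigenvalues.
    return eigvecs / np.sqrt(np.maximum(eigvals, eps))


def component_maps(feature_map, mean, whitener):
    """Whiten one feature map (C, H, W) into component maps (C, H, W)."""
    c, h, w = feature_map.shape
    vectors = feature_map.reshape(c, -1).T           # (H*W, C)
    components = (vectors - mean) @ whitener         # (H*W, C)
    return components.T.reshape(c, h, w)


def anomaly_score_map(components):
    """Sum of squared components: squared Mahalanobis distance per pixel."""
    return (components ** 2).sum(axis=0)
```

With feature maps of normal training images stacked in an array `train_features`, one would call `fit_gaussian(train_features)`, build the whitening matrix from the returned covariance, and then pass each test feature map through `component_maps` and `anomaly_score_map`. Each squared component map corresponds to one heatmap, and the score map is their sum.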
Created:
2023-06-28



