Implications of Peak Selection in the Interpretation of Unsupervised Mass Spectrometry Imaging Data Analyses
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Implications_of_Peak_Selection_in_the_Interpretation_of_Unsupervised_Mass_Spectrometry_Imaging_Data_Analyses/13519101
Description:
Mass spectrometry imaging can produce large amounts of complex
spectral and spatial data. Such data sets are often analyzed with
unsupervised machine learning approaches, which aim at reducing their
complexity and facilitating their interpretation. However, choices
made during data processing can impact the overall interpretation
of these analyses. This work investigates the impact of the choices
made at the peak selection step, which often occurs early in the data
processing pipeline. The discussion is framed in terms of the visualization
and interpretation of the results of two commonly used unsupervised
approaches: t-distributed stochastic neighbor embedding
and k-means clustering, which differ in nature and
complexity. Criteria considered for peak selection include those based
on hypotheses (exemplified herein in the analysis of metabolic alterations
in genetically engineered mouse models of human colorectal cancer),
particular molecular classes, and ion intensity. The results suggest
that the choices made at the peak selection step have a significant
impact on the visual interpretation of the results of either dimensionality
reduction or clustering techniques and, consequently, on any downstream
analysis that relies on these. Of particular significance, the results
of this work show that while using the most abundant ions can result
in interesting structure-related segmentation patterns that correlate
well with histological features, using a smaller number of ions specifically
selected based on prior knowledge about the biochemistry of the tissues
under investigation can result in an easier-to-interpret, potentially
more valuable, hypothesis-confirming result. Findings presented will
help researchers understand and better utilize unsupervised machine
learning approaches to mine high-dimensionality data.
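
The effect described above, where the peaks kept before clustering determine which pattern the segmentation shows, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' pipeline or data: the eight toy "pixels", the four ion channels, the helper names (`select_top_abundant`, `subset`, `kmeans`), and the small standard-library k-means with a deterministic farthest-point initialization. t-SNE is omitted for brevity.

```python
from statistics import mean

def select_top_abundant(spectra, n):
    """Return the indices of the n channels with the highest mean intensity."""
    n_channels = len(spectra[0])
    means = [mean(s[c] for s in spectra) for c in range(n_channels)]
    ranked = sorted(range(n_channels), key=lambda c: means[c], reverse=True)
    return sorted(ranked[:n])

def subset(spectra, channels):
    """Keep only the selected channels (peaks) of every pixel spectrum."""
    return [[s[c] for c in channels] for s in spectra]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [mean(col) for col in zip(*members)]
    return labels

# Hypothetical toy data: 8 pixels x 4 ion channels.
# Channels 0-1 are abundant and follow a "structural" split (pixels 0-3 vs 4-7).
# Channels 2-3 are weak and follow a "metabolic" split (even vs odd pixels).
spectra = []
for i in range(8):
    jitter = 0.1 * (i % 3)
    structural = [100.0, 20.0] if i < 4 else [20.0, 100.0]
    metabolic = [5.0, 1.0] if i % 2 == 0 else [1.0, 5.0]
    spectra.append([v + jitter for v in structural + metabolic])

# Intensity-based selection keeps the two most abundant channels.
abundant_channels = select_top_abundant(spectra, 2)
by_abundance = kmeans(subset(spectra, abundant_channels), k=2)

# Hypothesis-driven selection keeps the channels assumed to be of biochemical interest.
hypothesis_channels = [2, 3]
by_hypothesis = kmeans(subset(spectra, hypothesis_channels), k=2)

print("abundant channels:  ", abundant_channels, "->", by_abundance)
print("hypothesis channels:", hypothesis_channels, "->", by_hypothesis)
```

In this toy setting the abundance-based selection recovers the structural split, while the hypothesis-driven selection recovers the split carried by the weaker ions. Real mass spectrometry imaging data would additionally need preprocessing such as normalization, which this sketch leaves out.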
Created:
2021-01-04



