Euclid Quick Data Release (Q1) A first look at a multimodal autoregressive foundation model for exploring galaxy properties

Source: DataCite Commons. Updated: 2025-09-29. Indexed: 2026-05-03.
Download link:
http://dataverse.jpl.nasa.gov/citation?persistentId=doi:10.48577/jpl.PIHARS
Description:
Modern astronomical surveys, such as the Euclid mission, produce high-dimensional, multi-modal data sets comprising imaging and spectroscopic information for millions of galaxies. These data sets represent an ideal benchmark for large, pre-trained, multi-modal models, offering vast amounts of unlabelled data. In this work, we present a first exploration of Euclid data with AstroPT, an autoregressive foundation model trained on ∼300,000 optical and infrared Euclid images from the Q1 data release. We evaluate the benefits of self-supervised pre-training compared to fully supervised training for tasks such as galaxy morphology classification, redshift estimation, similarity search, and outlier detection. Our findings indicate that: (a) AstroPT embeddings are highly informative, correlating with morphology and effectively isolating outliers, (b) the addition of infrared data helps isolate stars but diminishes the identification of edge-on galaxies, which are better captured with optical images, (c) a simple fine-tuning of the embeddings for photometric redshift and stellar mass estimation improves accuracy, outperforming a fully supervised approach with only ∼1% of the data, (d) the addition of SED data into AstroPT via a simple multimodal token-chaining method leads to improved results in photo-z prediction compared to an AstroPT model pre-trained only on imagery data, and (e) anomaly detection and similarity search algorithms successfully identify interesting objects, such as ring galaxies and interacting galaxies.
Key words. Methods: data analysis; Surveys; Galaxies: general
Provider:
Root
Created:
2025-09-28