
DeepAstroUDA: Semi-Supervised Universal Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/7473596
Description:
We present the data used in "DeepAstroUDA: Semi-Supervised Universal Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection". It was also used in the conference paper presented at the Machine Learning and the Physical Sciences workshop at NeurIPS 2022: "Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection".

AI methods have already shown great promise in increasing the quality and speed of work with astronomical datasets, but their high complexity leads to the extraction of dataset-specific, non-robust features, so the resulting models cannot work on multiple datasets at the same time. We develop DeepAstroUDA, a Universal Domain Adaptation method capable of performing semi-supervised domain adaptation, which can be applied to datasets with different data distributions and class overlap. Extra classes can be present in either of the two datasets, and the method can even be used in the presence of unknown classes. We apply our model to three examples of galaxy morphology classification tasks of different complexities (3-class and 10-class problems) with anomaly detection: in all our experiments there is one extra class in the unlabeled target dataset, which represents our anomaly class.

DATA:

1) Domain adaptation (DA) across two different data releases of the same survey (LSST 1 and 10 years of observation): We use data from Ciprijanovic et al. 2022, which can also be found on Zenodo: https://zenodo.org/record/5514180 . Data contains three classes: spiral (0), elliptical (1) and merging galaxies (3, anomaly class).

2) DA across two surveys (SDSS and DECaLS): We create datasets using data and labels from the Galaxy Zoo project. Datasets contain 10 classes (9 known classes present in both SDSS and DECaLS data, and one unknown anomaly class present only in DECaLS data): disturbed (0), merging (1), round smooth (2), cigar shaped smooth (3), barred spiral (4), unbarred tight spiral (5), unbarred loose spiral (6), edge-on without bulge (7), edge-on with bulge (8), lenses (9, unknown anomaly class).
SDSS (wide field): the dataset is split into two files, sdss_1.h5 and sdss_2.h5
DECaLS: decals.zip

3) DA between wide and deep observing fields of the same survey (SDSS): We create datasets using data and labels from the Galaxy Zoo project. Datasets contain the same 10 classes as in 2), with the final lens anomaly class present only in the SDSS deep field.
SDSS (wide field): the same data as in 2)
SDSS (Stripe 82 deep field): sdss_stripe82.zip

All SDSS and DECaLS files contain the full datasets (train, validation and test). The exact split that we performed (0.6 : 0.2 : 0.2) can be reproduced using the code that accompanies this publication: https://github.com/deepskies/DeepAstroUDA .
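As a rough illustration of the 0.6 : 0.2 : 0.2 train/validation/test split and the 10-class label scheme described above, the following minimal sketch partitions sample indices with a seeded shuffle. It is not the code from the DeepAstroUDA repository: the helper name `split_indices`, the seed, and the shuffling scheme are assumptions made for illustration, and the repository should be consulted for the exact split. The layout of the .h5 files is not stated here, so the sketch works on indices only and does not read the files.

```python
import random

# Class names for the 10-class SDSS/DECaLS tasks, copied from the description.
GALAXY_CLASSES = {
    0: "disturbed",
    1: "merging",
    2: "round smooth",
    3: "cigar shaped smooth",
    4: "barred spiral",
    5: "unbarred tight spiral",
    6: "unbarred loose spiral",
    7: "edge-on without bulge",
    8: "edge-on with bulge",
    9: "lenses",  # unknown anomaly class, present only in the target dataset
}
ANOMALY_LABEL = 9


def split_indices(n_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return shuffled (train, val, test) index lists with the given fractions.

    Hypothetical helper: the seed and shuffling scheme are assumptions,
    not the settings used in the DeepAstroUDA repository.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = round(fractions[0] * n_samples)
    n_val = round(fractions[1] * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder, so no sample is dropped
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The resulting index lists can then be used to select images and labels once the arrays have been loaded from the downloaded files. Assigning the remainder to the test set keeps every sample in exactly one subset even when the fractions do not divide the sample count evenly.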
Created:
2023-07-07