Dataset for Large Language Models Classification of Astronomical Transient: Survey Images, Labels, and Zooniverse Results
Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14561777
Description:
This dataset supports the study on the application of Large Language Models (LLMs) to astronomical transient classification. It contains data from three wide-field optical surveys: Pan-STARRS, MeerLICHT, and ATLAS, and includes the following components:
Survey Image Files
Image triplets (New, Reference, and Difference images) for each transient candidate in the three datasets.
Images are organized by survey (Pan-STARRS, MeerLICHT, and ATLAS).
Survey Label Files
Ground-truth classification labels for each candidate in the three surveys.
Labels include categories such as Real (e.g., transients and variable stars for MeerLICHT, and only explosive transients for Pan-STARRS and ATLAS) and Bogus (various types of artifacts), as determined by professional astronomers.
Zooniverse Classification Results
Results from the Zooniverse campaign evaluating the quality of Gemini’s outputs.
Includes responses from professional astronomers who rated the coherence of Gemini's classifications and explanations on a 0–5 scale.
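As a rough illustration of how the three components relate, the sketch below models one candidate as an image triplet plus a ground-truth label, and summarizes coherence ratings on the 0–5 scale. The archive's actual file layout and label encoding are not described here, so the class name, field names, file paths, and the two label strings are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean

# Survey names come from the description; the label strings are an assumption
# about how Real/Bogus might be encoded in the label files.
SURVEYS = ("Pan-STARRS", "MeerLICHT", "ATLAS")
LABELS = ("Real", "Bogus")


@dataclass(frozen=True)
class Candidate:
    """One transient candidate: an image triplet and its ground-truth label."""
    survey: str
    new_image: str         # hypothetical path to the New image
    reference_image: str   # hypothetical path to the Reference image
    difference_image: str  # hypothetical path to the Difference image
    label: str

    def __post_init__(self):
        # Reject values outside the surveys and labels named in the description.
        if self.survey not in SURVEYS:
            raise ValueError(f"unknown survey: {self.survey!r}")
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def label_counts(candidates):
    """Count candidates per (survey, label) pair."""
    return dict(Counter((c.survey, c.label) for c in candidates))


def mean_coherence(ratings):
    """Average a non-empty list of coherence ratings given on the 0-5 scale."""
    for rating in ratings:
        if not 0 <= rating <= 5:
            raise ValueError(f"rating outside the 0-5 scale: {rating!r}")
    return mean(ratings)
```

Keeping the triplet and its label in one record makes it straightforward to tally class balance per survey before comparing model outputs against the astronomers' ratings.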
Purpose:
The dataset is intended for researchers interested in exploring:
The use of LLMs for transient classification.
Ground-truth labels and their application in astronomical classification tasks.
Human evaluation of AI-generated outputs in astronomy.
Use and Citation:
Please cite this dataset using the DOI provided by Zenodo if used in your research. For details on the methodology, refer to the associated publication: "Large Language Models Enable Textual Interpretation of Image-Based Astronomical Transient Classifications", Under Review, 2025
Created:
2025-01-20



