
core-sdo

Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/nasa-ibm-ai4science/core-sdo
Description:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6488f1d3e22a0081a561ec8f/pmQLbUWrXhSGMBhyejhCn.png)

# ML-Ready Multi-Modal Image Dataset from SDO

## Overview

This dataset provides machine learning (ML)-ready solar data curated from NASA’s Solar Dynamics Observatory (SDO), covering observations from **May 13, 2010, to Dec 31, 2024**. It includes Level-1.5 processed data from two instruments: the **Atmospheric Imaging Assembly (AIA)** and the **Helioseismic and Magnetic Imager (HMI)**.

The dataset is designed to facilitate large-scale learning applications in heliophysics, such as space weather forecasting, unsupervised representation learning, and scientific foundation model development.

---

## Dataset Structure

**Data Variables:**

```text
- aia94    (y, x)  float32 : AIA 94 Å
- aia131   (y, x)  float32 : AIA 131 Å
- aia171   (y, x)  float32 : AIA 171 Å
- aia193   (y, x)  float32 : AIA 193 Å
- aia211   (y, x)  float32 : AIA 211 Å
- aia304   (y, x)  float32 : AIA 304 Å
- aia335   (y, x)  float32 : AIA 335 Å
- aia1600  (y, x)  float32 : AIA 1600 Å (UV continuum)
- hmi_m    (y, x)  float32 : HMI LOS Magnetogram
- hmi_bx   (y, x)  float32 : HMI Magnetic Field - x component
- hmi_by   (y, x)  float32 : HMI Magnetic Field - y component
- hmi_bz   (y, x)  float32 : HMI Magnetic Field - z component
- hmi_v    (y, x)  float32 : HMI Doppler Velocity
```

## Dataset Details

| Field                    | Description                                    |
|--------------------------|------------------------------------------------|
| **Temporal Coverage**    | May 13, 2010 – Dec 31, 2024                    |
| **Data Format**          | netCDF (`.nc`), float32                        |
| **Temporal Granularity** | 12 minutes                                     |
| **Data Shape**           | `[13, 4096, 4096]` per file                    |
| **Channels**             | 13 total (8 AIA EUV/UV channels + 5 HMI channels) |
| **Size per File**        | ~570 MB                                        |
| **Total Size**           | ~360 TB                                        |

---

## Notes

One month of training data is available on Hugging Face as Parquet files, in parallel to the main branch. The full dataset is located in AWS S3 buckets. Note that the full dataset is over 360 TB.
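As a rough sanity check on the figures quoted above, the sketch below estimates the number of 12-minute timesteps in the coverage window and the resulting total size. It rests on assumptions not stated in the dataset card: coverage is treated as gap-free and starting at midnight on May 13, 2010, one file is assumed per timestep, and decimal units (1 MB = 10^6 bytes) are used. Under those assumptions the estimate lands in the mid-360s of TB, in line with the "over 360 TB" figure, and the uncompressed payload of one `[13, 4096, 4096]` float32 array comes out larger than ~570 MB, which suggests (but does not confirm) that the files are compressed.

```python
from datetime import datetime, timedelta

# Figures quoted in the dataset card.
START = datetime(2010, 5, 13)       # assumed midnight start (not stated in the card)
END = datetime(2025, 1, 1)          # exclusive bound: end of Dec 31, 2024
CADENCE = timedelta(minutes=12)
CHANNELS, HEIGHT, WIDTH = 13, 4096, 4096
BYTES_PER_FLOAT32 = 4
FILE_SIZE_MB = 570                  # approximate on-disk size per file

# Upper bound on the number of timesteps, assuming no gaps in coverage
# and one file per timestep (both are assumptions).
max_files = (END - START) // CADENCE

# Uncompressed size of one [13, 4096, 4096] float32 array.
raw_bytes_per_file = CHANNELS * HEIGHT * WIDTH * BYTES_PER_FLOAT32
raw_mb_per_file = raw_bytes_per_file / 1e6

# Estimated total size in decimal terabytes.
total_tb = max_files * FILE_SIZE_MB / 1e6

print(f"timesteps (upper bound): {max_files:,}")
print(f"uncompressed size per file: {raw_mb_per_file:.1f} MB")
print(f"estimated total size: {total_tb:.1f} TB")
```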
Users can see the full list of files using the command below.

```bash
aws s3 ls s3://nasa-surya-bench --no-sign-request
```

To download individual files from the AWS S3 bucket, see the AWS tutorial at https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html. Several options are available for downloading and syncing data from S3.

## Authors

Sujit Roy, Dinesha Vasanta Hegde, Johannes Schmude, Amy Lin, Vishal Gaur, Talwinder Singh, Rohit Lal

Correspondence: sujit.roy@nasa.gov
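For readers who prefer Python to the AWS CLI, the following is a minimal sketch of a programmatic counterpart to the `aws s3 ls ... --no-sign-request` command given in the Notes. It is not part of the dataset card. It assumes `boto3` is installed and that the bucket accepts unsigned (anonymous) requests, as the `--no-sign-request` flag implies. The helper names `make_anonymous_s3_client` and `list_top_level` are hypothetical, and the layout of prefixes and keys inside the bucket is not documented here, so the sketch only lists whatever is at the top level. To use it, call `make_anonymous_s3_client()` and pass the result to `list_top_level(client, "nasa-surya-bench")`; this requires network access.

```python
def make_anonymous_s3_client():
    """Create an S3 client that sends unsigned requests (requires boto3)."""
    # Imported lazily so list_top_level can be used or tested without boto3.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    return boto3.client("s3", config=Config(signature_version=UNSIGNED))


def list_top_level(client, bucket):
    """Return (prefixes, keys) found at the top level of a bucket.

    `client` is any object exposing boto3's `list_objects_v2` method.
    Pagination is followed until the listing is no longer truncated.
    """
    prefixes, keys = [], []
    kwargs = {"Bucket": bucket, "Delimiter": "/"}
    while True:
        page = client.list_objects_v2(**kwargs)
        # "Folders" under the delimiter are reported as CommonPrefixes.
        prefixes += [p["Prefix"] for p in page.get("CommonPrefixes", [])]
        # Objects stored directly at the top level are reported as Contents.
        keys += [o["Key"] for o in page.get("Contents", [])]
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return prefixes, keys
```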

Provider:
maas
Created:
2025-10-22