ar_emergence

ModelScope community · Updated 2025-12-05 · Indexed 2025-09-13
Download link:
https://modelscope.cn/datasets/nasa-ibm-ai4science/ar_emergence
Resource description:
# Active Region Emergence Dataset

## Dataset Summary

The **Active Region Emergence Dataset** is designed to support research on the early detection of solar Active Regions (ARs) and the development of predictive models for space weather. By characterizing the evolution of ARs before, during, and after their emergence, the dataset enables studies of pre-emergence signatures and early warning methods.

This dataset is derived from NASA's **Solar Dynamics Observatory (SDO)** using measurements from the **Helioseismic and Magnetic Imager (HMI)**. It includes timeline data of:

- **Acoustic power** (from Doppler velocity maps)
- **Photospheric magnetic field**
- **Continuum intensity**

for **59 large ARs** that emerged on the visible solar disk between **2010 and 2023**. Each AR is tracked within a **30° × 30° patch** over multiple days. These data products have already been applied successfully in machine learning models for AR emergence forecasting in [1].

## Supported Tasks and Applications

- **Spatio-temporal Forecasting**: Forecasting the spatially resolved emergence characteristics of active regions.

## Data Structure

### Data Files

The repository contains CSV files that list file paths, together with compressed data files that hold the emergence characteristics. The dataset is split for consistent training, validation, and testing.

- `train.csv`: training split (37 ARs)
- `valid.csv`: validation split (9 ARs)
- `test.csv`: test split (13 ARs)
- `README.md`: dataset description
- `data.zip`: compressed folder containing all Active Region (AR) subfolders:
  - `AR{noaa_num}/`
    - `mean_int{noaa_num}_flat.npz`: continuum intensity timeline
    - `mean_mag{noaa_num}_flat.npz`: magnetic field timeline
    - `mean_pmdop{noaa_num}_flat.npz`: four acoustic power timelines (2–3, 3–4, 4–5, 5–6 mHz)
      - Index 0: 2–3 mHz
      - Index 1: 3–4 mHz
      - Index 2: 4–5 mHz
      - Index 3: 5–6 mHz

where `noaa_num` is the NOAA Active Region number of each active region in this dataset.
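The file layout described above can be sketched as a small path builder. This is only an illustration: the helper names `ar_file_paths` and `band_label` are hypothetical, the extraction root `data` is an assumption about where `data.zip` is unpacked, and `12345` is a placeholder number rather than an AR known to be in the dataset.

```python
from pathlib import Path

# Frequency band (in mHz) for each index of the acoustic power data,
# following the index list in the Data Files section.
ACOUSTIC_BANDS_MHZ = [(2, 3), (3, 4), (4, 5), (5, 6)]


def ar_file_paths(root, noaa_num):
    """Return the three expected .npz paths for one AR under `root`.

    Assumes `root` is the folder obtained by unzipping data.zip.
    """
    ar_dir = Path(root) / f"AR{noaa_num}"
    return {
        "mean_int": ar_dir / f"mean_int{noaa_num}_flat.npz",
        "mean_mag": ar_dir / f"mean_mag{noaa_num}_flat.npz",
        "mean_pmdop": ar_dir / f"mean_pmdop{noaa_num}_flat.npz",
    }


def band_label(index):
    """Return a readable label for an acoustic power band index (0 to 3)."""
    low, high = ACOUSTIC_BANDS_MHZ[index]
    return f"{low}-{high} mHz"


# 12345 is a placeholder, not necessarily an AR present in the dataset.
paths = ar_file_paths("data", 12345)
print(paths["mean_pmdop"])
print(band_label(2))
```

Checking `path.exists()` on each returned entry is a quick way to confirm that the local extraction matches this layout before loading anything.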
### Features

The CSV files include the following features:

- **`AR`**: NOAA Active Region number
- **`t_start`**: Start time of tracked patch
- **`t_end`**: End time of tracked patch
- **`mean_int_path`**: Path to continuum intensity `.npz` file
- **`mean_mag_path`**: Path to magnetic field `.npz` file
- **`mean_pmdop_path`**: Path to acoustic power `.npz` file

## Dataset Details

| Field                 | Description                        |
|-----------------------|------------------------------------|
| **Temporal Coverage** | 2010 – 2023                        |
| **Data Format**       | CSV for metadata / NPZ for rasters |
| **Data Size**         | Total 59 instances                 |
| **Total File Size**   | ~23.1 MB                           |
| **Data Shape**        | (6, 240, 9, 9) per instance        |
| **Cadence**           | 1 hour                             |

## Example Usage

```python
import numpy as np
import pandas as pd

# Load CSV metadata
df = pd.read_csv("train.csv")
print(df.head())

# Load the acoustic power (Dopplergram power map) data for one AR.
# For intensity and magnetic field, use the mean_int_path and
# mean_mag_path columns of df instead.
sample_path = df.iloc[0]["mean_pmdop_path"]

# Update to local path after unzipping data.zip
sample_path = sample_path.replace("/data", "data")

data = np.load(sample_path)
print("Keys in npz file:", data.files)
print("Data shape:", data[data.files[0]].shape)
```

## Contact

[1] Spyros Kasapis [skasapis@princeton.edu](mailto:skasapis@princeton.edu)

## References

[1] Kasapis, S., Kitiashvili, I. N., Kosovichev, A. G. & Stefan, J. T. Prediction of intensity variations associated with emerging active regions using helioseismic power maps and machine learning. The Astrophys. J. Suppl. Ser. 10.3847/1538-4365/adfbe2 (2025)
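As a companion to the Example Usage section, the path rewrite there can be sketched as a small helper that maps a path stored in the CSV onto a local download folder. This is a sketch under assumptions: the function name `resolve_local_path` is hypothetical, the example entry is made up, and the idea that CSV paths begin with `/data/` is inferred only from the `replace("/data", "data")` call in the usage example.

```python
from pathlib import Path


def resolve_local_path(csv_path, local_root="."):
    """Map a path stored in the CSV onto a folder on the local machine.

    Assumes the CSV stores paths with a leading slash (e.g. "/data/...")
    and that data.zip was unpacked inside `local_root`.
    """
    # Drop leading slashes so the path is treated as relative to local_root.
    relative = csv_path.lstrip("/")
    return Path(local_root) / relative


# Hypothetical CSV entry; 12345 is a placeholder AR number.
example = "/data/AR12345/mean_pmdop12345_flat.npz"
print(resolve_local_path(example))
print(resolve_local_path(example, "downloads"))
```

Unlike a plain string replacement, stripping only the leading slash leaves any later occurrence of `/data` in the path untouched.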

Provider:
maas
Created:
2025-08-21