
Grid2Text Dataset: Aligning Meteorological Grid Data with Expert Reasoning Chains

Indexed from NIAID Data Ecosystem on 2026-05-10
Download link:
https://doi.org/10.7910/DVN/XT1X89
Description:
Grid2Text is a comprehensive alignment dataset designed to bridge the gap between meteorological grid data and expert textual forecasts. It addresses the challenge of interpreting high-dimensional numerical weather prediction (NWP) data by providing one-to-one correspondences between structured grid features and expert-written weather reports, augmented with explicit Chain-of-Thought (CoT) reasoning paths.

Data Content & Structure:

Source Data: Derived from ERA5 Reanalysis Data (ECMWF), covering the Shanghai region from 2020 to 2022.
Core Variables: Includes 10 core meteorological variables such as 10m wind components, total precipitation, temperature, and relative humidity.
Reasoning Layer: Distinct from traditional datasets, Grid2Text includes a "Chain-of-Thought" (CoT) component that captures the intermediate reasoning steps of forecasters (e.g., wind vector analysis, temperature trend judgment).

File Structure: The dataset is organized into three main directories:

feature_data/: CSV files containing structured spatiotemporal aggregated features (e.g., max_temp_c, ifrain, wind direction).
chain_of_thought/: TXT files containing the step-by-step expert reasoning process used to derive the forecast.
forecasts/: TXT files containing the final, operational-standard natural language weather forecast.

Methodology: The dataset was constructed using a "Human-in-the-loop" workflow, employing a hybrid strategy of Large Language Model (LLM) generation followed by rigorous multi-round verification by senior meteorological forecasters to ensure physical accuracy and logical consistency.
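
The one-to-one correspondence between the three directories can be illustrated with a minimal loading sketch. It assumes that matching files share a base name across feature_data/, chain_of_thought/, and forecasts/ (for example a date stamp); the description does not state the naming scheme, so that pairing rule, the `Grid2TextSample` type, and the `load_samples` function are all hypothetical and should be adjusted to the actual files.

```python
import csv
from pathlib import Path
from typing import Dict, List, NamedTuple


class Grid2TextSample(NamedTuple):
    """One aligned record: grid features, reasoning chain, final forecast."""
    sample_id: str
    features: List[Dict[str, str]]  # CSV rows, values kept as strings
    reasoning: str
    forecast: str


def load_samples(root: Path) -> List[Grid2TextSample]:
    """Pair files across the three directories by shared base name.

    ASSUMPTION: a feature file such as feature_data/X.csv corresponds to
    chain_of_thought/X.txt and forecasts/X.txt. Feature files without both
    counterparts are skipped rather than guessed at.
    """
    samples: List[Grid2TextSample] = []
    for csv_path in sorted((root / "feature_data").glob("*.csv")):
        stem = csv_path.stem
        cot_path = root / "chain_of_thought" / f"{stem}.txt"
        forecast_path = root / "forecasts" / f"{stem}.txt"
        if not (cot_path.exists() and forecast_path.exists()):
            continue
        with csv_path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))
        samples.append(
            Grid2TextSample(
                sample_id=stem,
                features=rows,
                reasoning=cot_path.read_text(encoding="utf-8"),
                forecast=forecast_path.read_text(encoding="utf-8"),
            )
        )
    return samples
```

Calling `load_samples(Path("Grid2Text"))` on an extracted copy would return one record per complete triple. Feature values are left as strings because the column types (for example whether ifrain is a 0/1 flag) are not specified in the description.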
Created:
2025-12-17