MOSAIC-CONUS: A Multimodal, Multi-Temporally Paired Dataset for Earth Sciences

Source: DataCite Commons. Updated 2026-05-08; indexed 2026-05-10.
Download link:
https://www.osti.gov/servlets/purl/3005125
Description:
Earth embeddings (vector representations of geographic locations indexed in space and time) are emerging as a unifying interface for geospatial AI. However, their quality depends not only on model design but also on how multimodal Earth observation (EO) data are spatially indexed, temporally aligned, and cross-modally associated during pretraining. We introduce MOSAIC-CONUS (Multimodal Observations with Spatially Aligned Imagery, Urban Points of Interest, In-Situ Measurements and Text Captions), a large-scale EO dataset over the contiguous United States. It is organized around 250,000 stratified point indices that serve as stable spatial keys across seven modalities: active radar, passive optical imagery, lidar-derived elevation, land cover, functional context, hydrometeorological measurements, and textual summaries. MOSAIC-CONUS makes four contributions that prior EO datasets do not jointly address: (1) an open-source, large-scale multimodal EO corpus structured around point-indexed data and designed to support Earth embedding learning; (2) explicit radar-optical pairing tables spanning twelve temporal alignment regimes, which formalize cross-sensor alignment as a controllable variable for analyzing how temporal mismatch across modalities influences the quality of learned embeddings; (3) a benchmark suite spanning cross-modal retrieval, annual nightlights regression, and basin-held-out streamflow prediction, positioning MOSAIC-CONUS as a benchmark-ready resource for multimodal AI systems; and (4) a language-based embedding layer built from co-registered textual summaries, enabling Earth embeddings to function as a queryable interface for agentic AI systems. The dataset and pairing protocols are publicly released.
Provider:
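To make the idea of a temporal alignment regime concrete, the sketch below pairs radar and optical acquisitions that share a point index and keeps a pair only when the acquisition dates fall within a chosen gap. It is an illustration under stated assumptions, not the released pairing protocol: the record layout, the point IDs, the dates, the function name `pair_by_tolerance`, and the `max_gap_days` parameter are all hypothetical, and the actual twelve regimes are not specified in this description.

```python
from datetime import date

# Hypothetical records: (point_id, acquisition_date). NOT the released schema.
radar = [
    ("p001", date(2021, 6, 1)),
    ("p001", date(2021, 6, 13)),
    ("p002", date(2021, 7, 4)),
]
optical = [
    ("p001", date(2021, 6, 3)),
    ("p002", date(2021, 8, 20)),
]


def pair_by_tolerance(radar, optical, max_gap_days):
    """Pair each optical record with the nearest-in-time radar record at the
    same point, keeping the pair only if the gap is within max_gap_days."""
    # Group radar acquisition dates by their point index (the spatial key).
    radar_by_point = {}
    for pid, acquired in radar:
        radar_by_point.setdefault(pid, []).append(acquired)

    pairs = []
    for pid, opt_date in optical:
        candidates = radar_by_point.get(pid, [])
        if not candidates:
            continue  # no radar acquisition at this point
        best = min(candidates, key=lambda d: abs((d - opt_date).days))
        gap = abs((best - opt_date).days)
        if gap <= max_gap_days:
            pairs.append((pid, best, opt_date, gap))
    return pairs


# Two assumed regimes: a tight one and a loose one.
tight = pair_by_tolerance(radar, optical, max_gap_days=3)
loose = pair_by_tolerance(radar, optical, max_gap_days=60)
```

Varying `max_gap_days` changes which cross-sensor pairs exist at each point, which is the sense in which temporal alignment can be treated as a controllable variable.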
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Created:
2026-05-08