Replication Data for: Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://doi.org/10.7910/DVN/EPWVWX
Description:
Time-IMM is a curated collection of nine real-world, irregular, multimodal multivariate time-series datasets designed to reflect the diverse sampling mechanisms encountered in practice. Each sub-dataset exemplifies one of three cause-driven irregularity categories: trigger-based (e.g., event-driven logging in GDELT), constraint-based (e.g., market-hours sampling in FNSPID), or artifact-based (e.g., missing-data gaps in ILINet). The collection spans domains such as healthcare (MIMIC), climate monitoring (EPA-Air), network telemetry (CESNET), and social sensing (StudentLife). Every series is paired with asynchronous textual annotations (e.g., commit messages, clinical notes, news headlines) to support realistic multimodal fusion.

Alongside the data, we provide the **IMM-TSF** benchmark library, which implements modular timestamp-to-text and multimodality-fusion layers for forecasting under irregular sampling. Time-IMM enables systematic evaluation of models' robustness to real-world irregularities and their ability to leverage asynchronous text, making it an ideal foundation for advancing the state of the art in time-series forecasting, anomaly detection, and beyond.

Access note: All sub-datasets in Time-IMM are publicly accessible via direct download links, except MIMIC. Due to NIH and IRB restrictions, MIMIC remains a restricted-access resource. To reproduce our experiments on MIMIC you must:
1. Be a credentialed user
2. Complete the required training (e.g., CITI Data or Specimens Only Research)
3. Sign the official MIMIC data use agreement

We therefore include only the preprocessing code for MIMIC in our repository, which you can run once you obtain the raw files through the standard MIMIC access process.

License: CC BY 4.0
Created:
2025-05-16



