dublin_bus_delays_with_weather
Source: Zenodo. Updated: 2026-04-30. Indexed: 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19922474
Resource description:
Upload: The Early-Stage Parquet Dataset
Title: Early-Stage Dublin Bus Delay Dataset (Filtered)
Version: 1.0
Keywords: Public Transit, Time Series, Dublin Bus, Data Engineering, Intermediate Dataset
Description:
Overview
This dataset contains 2,190,678 rows of early-stage public transit data for the Dublin Bus network. The data runs from March 18, 2026 to April 3, 2026.
Data Schema (Features)
The dataset includes the following core operational and meteorological columns:
scrape_timestamp: The exact date and time the data was recorded.
trip_id: The unique identifier for the specific bus journey.
route_id: The identifier for the bus route.
stop_id: The identifier for the specific bus stop.
delay_seconds: The target variable representing the arrival delay in seconds.
rain_mm: Localized rainfall measurement in millimeters.
visibility_m: Localized visibility measurement in meters.
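As a minimal sketch of how the seven columns listed above might be checked when records are loaded, the following uses only the Python standard library. The Python type assigned to each column and the helper name `validate_row` are assumptions for illustration; the dataset description does not state the stored dtypes, and the sample values are made up.

```python
from datetime import datetime

# Column names come from the schema above; the Python types are assumed,
# since the dataset description does not state the stored dtypes.
EXPECTED_COLUMNS = {
    "scrape_timestamp": datetime,
    "trip_id": str,
    "route_id": str,
    "stop_id": str,
    "delay_seconds": (int, float),
    "rain_mm": (int, float),
    "visibility_m": (int, float),
}


def validate_row(row: dict) -> list[str]:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, expected_type in EXPECTED_COLUMNS.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif row[name] is not None and not isinstance(row[name], expected_type):
            problems.append(
                f"unexpected type for {name}: {type(row[name]).__name__}"
            )
    return problems


# Made-up record for illustration only; not taken from the dataset.
example = {
    "scrape_timestamp": datetime(2026, 3, 18, 8, 0, 0),
    "trip_id": "trip-001",
    "route_id": "route-A",
    "stop_id": "stop-42",
    "delay_seconds": 120,
    "rain_mm": 0.4,
    "visibility_m": 9000,
}
print(validate_row(example))  # prints []
```

Missing values are allowed through as `None` here; whether the real file contains nulls in the weather columns is not stated in the description.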
Data Lineage & Processing
This file is an intermediate stage in the data engineering pipeline for a Master's thesis. It was extracted from the raw national Transport for Ireland (TFI) dataset and filtered to keep only the 'Dublin Bus' operator. It has been converted to Parquet for faster cloud streaming, but it predates the final feature engineering and the 3D tensor reshaping required for deep learning architectures.
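The operator filter described above can be sketched as a streaming pass over the raw CSV, so that the national file never has to fit in memory. This is an illustration and not the thesis pipeline: the column name `operator` and the function name `filter_operator` are hypothetical, because the raw TFI schema is not given here, and the Parquet conversion itself (which would need a library such as pyarrow) is left out.

```python
import csv
import io
from typing import Iterable, Iterator


def filter_operator(
    lines: Iterable[str],
    operator_name: str = "Dublin Bus",
    operator_column: str = "operator",  # hypothetical column name
) -> Iterator[dict]:
    """Yield only the rows whose operator column equals operator_name.

    Rows are read one at a time, so memory use stays flat regardless of
    the size of the input.
    """
    reader = csv.DictReader(lines)
    for row in reader:
        if row.get(operator_column) == operator_name:
            yield row


# Made-up sample standing in for the raw national CSV.
raw = io.StringIO(
    "operator,trip_id,delay_seconds\n"
    "Dublin Bus,t1,120\n"
    "Other Operator,t2,30\n"
    "Dublin Bus,t3,-15\n"
)
kept = list(filter_operator(raw))
print(len(kept))  # prints 2
```

With a real file, the same function accepts an open file handle in place of the `io.StringIO` object. Note that `csv.DictReader` returns every value as a string, so numeric columns such as `delay_seconds` would still need converting before being written out.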
Usage in Thesis
This dataset demonstrates the intermediate data processing phase. It is intended to be streamed into the data engineering notebook to show how the pipeline moves from raw national data to the final machine learning features, without requiring the full 4.5 GB national CSV file to be loaded.
Provider:
Zenodo
Created:
2026-04-30



