
claritystorm/dot-airline-ontime

Hugging Face · Updated 2026-04-01 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/claritystorm/dot-airline-ontime
Description:
---
license: other
license_name: public-domain
task_categories:
- tabular-classification
- tabular-regression
tags:
- aviation
- airline
- on-time-performance
- flight-delays
- bts
- dot
- united-states
- machine-learning
pretty_name: DOT Airline On-Time Performance 2018–Present
size_categories:
- 10M<n<100M
---

# DOT Airline On-Time Performance 2018–Present

**35M+ domestic flights (2018–2024)**: BTS Reporting Carrier On-Time Performance (USDOT Form 41), unified from 84 monthly files into a single analysis-ready table. Includes departure/arrival delays, cancellation codes, delay cause breakdowns (carrier/weather/NAS/security/late aircraft), taxi times, and aircraft routing.

| 📊 Records | 📅 Coverage | 🏷️ License | 🔄 Updated |
|-----------|-------------|-----------|-----------|
| 35M+ flights | 2018–2024 (84 months) | Public Domain | Annual |

**This repo contains a free 1,000-row sample.** Full dataset (CSV + Parquet + year-partitioned Parquet) → **[claritystorm.com/datasets/dot-airline-ontime](https://claritystorm.com/datasets/dot-airline-ontime)**

---

## Quick Start

```python
from datasets import load_dataset
import pandas as pd

# Load the 1,000-row sample
ds = load_dataset("claritystorm/dot-airline-ontime")
df = ds["train"].to_pandas()

# Delay rate by carrier, excluding cancelled flights (lowest delay rate first)
carrier_perf = (
    df[df["cancelled"] == 0]
    .groupby("carrier")["is_delayed"]
    .agg(flights="count", delayed="sum")
    .assign(delay_rate=lambda x: (x["delayed"] / x["flights"] * 100).round(1))
    .sort_values("delay_rate")
)
print(carrier_perf)
```

## Use Cases

- **Flight delay prediction**: 35M+ labeled examples for ML models
- **Airline benchmarking**: on-time performance by carrier, route, airport, and season
- **COVID-19 aviation impact**: study the collapse and recovery, 2020–2024
- **Airport operations research**: NAS and weather delay propagation through hubs
- **Insurance & risk pricing**: delay distributions for flight delay insurance models
- **Travel product optimization**: connection time recommendations and flight-risk scoring

## Schema (selected fields)

| Field | Type | Description |
|-------|------|-------------|
| flight_date | date | Date of flight (YYYY-MM-DD) |
| carrier | string | IATA carrier code (AA, DL, WN, etc.) |
| origin | string | Origin airport IATA code |
| dest | string | Destination airport IATA code |
| route | string | Route key, e.g. JFK-LAX (computed) |
| dep_delay | float | Departure delay in minutes |
| arr_delay | float | Arrival delay in minutes |
| is_delayed | int | 1 if arr_delay_minutes ≥ 15 and not cancelled (computed) |
| cancelled | int | 1 if flight was cancelled |
| cancellation_reason | string | carrier / weather / national_air_system / security (computed) |
| carrier_delay | float | Delay minutes attributable to carrier |
| weather_delay | float | Delay minutes attributable to weather |
| nas_delay | float | Delay minutes attributable to NAS |
| late_aircraft_delay | float | Delay minutes from late incoming aircraft |

## ⬇️ Get the Full Dataset

| Tier | Price | Includes |
|------|-------|----------|
| Sample | Free | 1,000 rows, Public Domain (this repo) |
| Complete | $99 | Full 35M+ flights, CSV + Parquet + year-partitioned Parquet |
| Annual | $199/yr | Complete + annual updates as BTS releases new monthly data |

👉 **[Purchase at claritystorm.com/datasets/dot-airline-ontime](https://claritystorm.com/datasets/dot-airline-ontime)**

## Source

**Bureau of Transportation Statistics (BTS)**, US Department of Transportation: Reporting Carrier On-Time Performance (Form 41 Traffic). Under 14 CFR Part 234, US carriers with ≥1% of domestic scheduled service must report monthly. Source data is US federal government work in the **public domain** (17 U.S.C. 105). Unified, typed, and enriched by [ClarityStorm Data](https://claritystorm.com).
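
The delay-cause fields listed in the schema (`carrier_delay`, `weather_delay`, `nas_delay`, `late_aircraft_delay`) can be combined into a per-carrier breakdown of where delay minutes come from. The following is a minimal sketch, not part of the dataset's tooling: the helper name `delay_cause_share` is hypothetical, the example rows are made up for illustration, and treating missing cause values as 0 minutes is an assumption about how nulls appear in the data. Only the four cause columns shown in the selected schema are used.

```python
from collections import defaultdict

# Delay-cause columns taken from the "Schema (selected fields)" table.
CAUSE_FIELDS = ("carrier_delay", "weather_delay", "nas_delay", "late_aircraft_delay")


def delay_cause_share(rows):
    """Return {carrier: {cause_field: percent of attributed delay minutes}}.

    `rows` is any iterable of dict-like records keyed by the schema field names.
    Assumption: a missing or None cause value counts as 0 minutes.
    """
    totals = defaultdict(lambda: dict.fromkeys(CAUSE_FIELDS, 0.0))
    for row in rows:
        if row.get("cancelled") == 1:
            continue  # skip cancelled flights, mirroring the Quick Start filter
        for field in CAUSE_FIELDS:
            totals[row["carrier"]][field] += row.get(field) or 0.0

    shares = {}
    for carrier, minutes in totals.items():
        attributed = sum(minutes.values())
        if attributed == 0:
            continue  # carrier has no attributed delay minutes in these rows
        shares[carrier] = {
            field: round(100 * value / attributed, 1)
            for field, value in minutes.items()
        }
    return shares


# Made-up rows for illustration only; these are not values from the dataset.
example_rows = [
    {"carrier": "AA", "cancelled": 0, "carrier_delay": 30.0, "weather_delay": 0.0,
     "nas_delay": 10.0, "late_aircraft_delay": 0.0},
    {"carrier": "AA", "cancelled": 0, "carrier_delay": None, "weather_delay": None,
     "nas_delay": None, "late_aircraft_delay": None},
    {"carrier": "DL", "cancelled": 0, "carrier_delay": 0.0, "weather_delay": 20.0,
     "nas_delay": 0.0, "late_aircraft_delay": 20.0},
    {"carrier": "DL", "cancelled": 1, "carrier_delay": None, "weather_delay": None,
     "nas_delay": None, "late_aircraft_delay": None},
]

example_shares = delay_cause_share(example_rows)
print(example_shares)
```

Because the helper only needs dict-like rows, it should also accept the sample split directly, e.g. `delay_cause_share(ds["train"])`, assuming the column names match the schema table. If the rows come from a pandas DataFrame instead, missing values will be NaN rather than None and would need to be filled with 0 first.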
Provider:
claritystorm