
kylianmallet/openflights

Source: Hugging Face. Updated 2025-12-09; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/kylianmallet/openflights
Resource description:
---
language: en
license: odbl
pretty_name: OpenFlights - Complete Dataset
tags:
- aviation
- airports
- routes
- airlines
configs:
- config_name: airlines
  data_files:
  - split: default
    path: "airlines/data.parquet"
- config_name: airports
  data_files:
  - split: default
    path: "airports/data.parquet"
- config_name: airports_extended
  data_files:
  - split: default
    path: "airports_extended/data.parquet"
- config_name: countries
  data_files:
  - split: default
    path: "countries/data.parquet"
- config_name: planes
  data_files:
  - split: default
    path: "planes/data.parquet"
- config_name: routes
  data_files:
  - split: default
    path: "routes/data.parquet"
---

<a name="readme-top"></a>

<!-- PROJECT LOGO -->
<br />
<div align="center">
  <a href="https://openflights.org/data">
    <img src="https://openflights.org/img/icon_favicon.png" alt="OpenFlights Logo" width="90" height="90">
  </a>
  <h1 align="center">OpenFlights - Complete Dataset</h1>
  <p align="center">
    A fully structured and Parquet-optimized version of the OpenFlights open aviation database.
    <br />
    <i>This dataset is redistributed for convenience. I am not the owner of the data. All source data is originally provided by OpenFlights.</i>
    <br />
    <a href="https://openflights.org/data">OpenFlights (Website)</a>
    ⸱
    <a href="mailto:kylian.mallet@sklav.group">Contact Me</a>
  </p>
</div>

---

<details>
  <summary>Table of Contents</summary>
  <ol>
    <li><a href="#about-the-dataset">About the Dataset</a></li>
    <li><a href="#datasets-structure">Datasets Structure</a></li>
    <li><a href="#schema">Schema</a></li>
    <li><a href="#usage">Usage</a></li>
    <li><a href="#licensing--attribution">Licensing & Attribution</a></li>
    <li><a href="#source">Source</a></li>
  </ol>
</details>

## About the Dataset

This repository provides the OpenFlights database exported into **columnar Parquet format**, enabling efficient analysis, filtering, joins, and machine learning workflows.
It is designed as a multi-configuration dataset, where each table is stored as its own configuration with a single split named `default`.

Included data:
- Airports (basic + extended metadata)
- Airlines
- Routes between airports
- Aircraft models (planes)
- Countries with ISO codes

All OpenFlights `\N` null placeholders were converted to standard `NaN` values for clean schema typing.

<p align="right">(<a href="#readme-top">back to top</a>)</p>

## Datasets Structure

This dataset contains **6 subsets**:

| Configuration | Description |
|-----------|-------------|
| `airports` | All known airport infrastructure worldwide |
| `airports_extended` | Enhanced airport metadata (time zones, DST, categories, etc.) |
| `airlines` | Commercial & private airlines with active status |
| `routes` | Direct flight connections (airline + equipment info) |
| `planes` | Aircraft types and codes |
| `countries` | Country reference table (ISO 3166-1) |

Note: Each configuration has a single split: `default`.
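The `\N` placeholder conversion mentioned under About the Dataset can be illustrated with a small sketch. How the maintainer actually performed the conversion is not documented here, so the approach below (pandas `read_csv` with `na_values`) is an assumption, as are the header-less CSV layout and the two made-up sample rows. Only the column names come from the Airports schema in this README.

```python
import io

import pandas as pd

# Column names taken from the Airports schema in this README.
COLUMNS = [
    "AirportID", "Name", "City", "Country", "IATA", "ICAO",
    "Latitude", "Longitude", "Altitude", "Timezone", "DST", "TzDBTimezone",
]

# Raw string on purpose: a plain "\N" is a syntax error in Python source.
NULL_TOKEN = r"\N"

# Two made-up rows in a header-less CSV layout (hypothetical, not real OpenFlights records).
RAW_SAMPLE = r'''1,"Example Field","Sampleville","Nowhere","EXF","XEXF",10.5,20.25,100,1,"E","Europe/Paris"
2,"Tiny Strip","Smalltown","Nowhere",\N,"XTNY",-5.0,30.0,12,\N,\N,\N
'''

df = pd.read_csv(
    io.StringIO(RAW_SAMPLE),
    header=None,
    names=COLUMNS,
    na_values=[NULL_TOKEN],   # treat the OpenFlights placeholder as missing
    keep_default_na=False,    # do not treat strings such as "NA" as missing
)

# Columns that contained the placeholder now hold NaN instead of the literal text.
print(df[["Name", "IATA", "Timezone"]])
print(df.dtypes)
```

Setting `keep_default_na=False` is a deliberate choice in this sketch: pandas treats the string `NA` as missing by default, which would also blank out legitimate two-letter codes such as the ISO 3166-1 code for Namibia.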
The tables are designed for relational joins on these identifiers:
- `airlines.AirlineID` ↔ `routes.AirlineID`
- `airports.AirportID` ↔ `routes.SourceAirportID` / `routes.DestinationAirportID`

<p align="right">(<a href="#readme-top">back to top</a>)</p>

## Schema

### Airports

- AirportID (int)
- Name (string)
- City (string)
- Country (string)
- IATA (string, nullable)
- ICAO (string, nullable)
- Latitude (float)
- Longitude (float)
- Altitude (int)
- Timezone (float)
- DST (string)
- TzDBTimezone (string)

### Airports Extended

> Same columns as `airports`, but typically higher data completeness and accuracy

- AirportID (int)
- Name (string)
- City (string)
- Country (string)
- IATA (string, nullable)
- ICAO (string, nullable)
- Latitude (float)
- Longitude (float)
- Altitude (int)
- Timezone (float)
- DST (string)
- TzDBTimezone (string)

### Airlines

- AirlineID (int)
- Name (string)
- Alias (string, nullable)
- IATA (string, nullable)
- ICAO (string, nullable)
- Callsign (string, nullable)
- Country (string, nullable)
- Active (string: "Y" / "N")

### Routes

- Airline (string)
- AirlineID (int, nullable)
- SourceAirport (string)
- SourceAirportID (int, nullable)
- DestinationAirport (string)
- DestinationAirportID (int, nullable)
- Codeshare (string, nullable)
- Stops (int)
- Equipment (string)

### Planes

> Referenced in routes using Equipment codes.
- PlaneID (int)
- IATA (string, nullable)
- ICAO (string, nullable)
- Name (string)

Notes:
- Some aircraft may only have IATA or ICAO codes
- PlaneID is an OpenFlights internal identifier

### Countries

> Used as reference for airports and airlines

- Name (string)
- CountryCode (string, nullable, ISO 3166-1 format)

Notes:
- Not all countries have official codes available
- Used for international standard alignment

<p align="right">(<a href="#readme-top">back to top</a>)</p>

## Usage

Load directly with 🤗 Datasets:

```python
from datasets import load_dataset

# Each configuration is returned as a DatasetDict with a single "default" split
airports = load_dataset("kylianmallet/openflights", name="airports").get("default")
routes = load_dataset("kylianmallet/openflights", name="routes").get("default")

# A `Dataset` has no `.head()` method; convert to pandas to preview the first rows
print(airports.to_pandas().head())
```

Efficient joins using pandas / Polars / Apache Arrow:

```python
import polars as pl
from datasets import load_dataset

d_airlines = load_dataset("kylianmallet/openflights", name="airlines").get("default")
d_routes = load_dataset("kylianmallet/openflights", name="routes").get("default")

# Datasets are Arrow-backed; `.data.table` exposes the underlying pyarrow.Table
airlines = pl.from_arrow(d_airlines.data.table)
routes = pl.from_arrow(d_routes.data.table)

# A left join keeps every route, including routes with no matching airline
joined = routes.join(
    airlines,
    left_on="AirlineID",
    right_on="AirlineID",
    how="left"
)

print(joined.head())
```

## Licensing & Attribution

This dataset is distributed under the **Open Database License (ODbL 1.0)**.

> I do not own or claim ownership of this data.
> Full copyright and database rights remain with **OpenFlights**.

If you use this dataset in research or products, attribution to OpenFlights is required.

See: [https://opendatacommons.org/licenses/odbl/1-0/](https://opendatacommons.org/licenses/odbl/1-0/)

## Source

Original raw data collected and maintained by:
**OpenFlights**
[https://openflights.org/data](https://openflights.org/data)

Many thanks to the contributors who keep the dataset updated.
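As a supplement to the join keys listed under Datasets Structure, the sketch below shows one possible way to resolve both endpoints of a route, which takes two joins against `airports` (one per endpoint). It uses pandas with small made-up frames instead of the real tables so that it runs offline. The helper `attach_endpoint` and every row value are hypothetical; only the column names come from the Schema section.

```python
import pandas as pd

# Hypothetical toy rows that follow the documented column names (not real OpenFlights data).
airports = pd.DataFrame({
    "AirportID": pd.array([1, 2, 3], dtype="Int64"),
    "Name": ["Alpha Field", "Bravo Intl", "Charlie Strip"],
    "IATA": ["AAA", "BBB", None],
})
routes = pd.DataFrame({
    "Airline": ["ZZ", "ZZ", "YY"],
    # The schema marks both airport IDs as nullable, so use a nullable integer dtype.
    "SourceAirportID": pd.array([1, 2, None], dtype="Int64"),
    "DestinationAirportID": pd.array([2, 3, 1], dtype="Int64"),
    "Stops": [0, 0, 1],
})


def attach_endpoint(routes_df: pd.DataFrame, airports_df: pd.DataFrame, side: str) -> pd.DataFrame:
    """Left-join airport columns for one endpoint; `side` is "Source" or "Destination"."""
    # Prefixing turns AirportID into e.g. SourceAirportID, which matches the routes key.
    lookup = airports_df.add_prefix(side)
    return routes_df.merge(lookup, on=f"{side}AirportID", how="left")


resolved = attach_endpoint(routes, airports, "Source")
resolved = attach_endpoint(resolved, airports, "Destination")

print(resolved[["Airline", "SourceName", "DestinationName", "Stops"]])
```

A left join is used so that routes with a missing airport ID are kept with empty airport columns instead of being dropped; an inner join would be the alternative if only fully resolved routes are wanted.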
Provider:
kylianmallet