juliensimon/spacex-launches
Hugging Face · Updated 2026-04-11 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/juliensimon/spacex-launches
Resource description:
---
license: cc-by-4.0
pretty_name: "SpaceX Launch History"
language:
- en
description: >-
  Complete SpaceX launch history from spacex.com: all 661 Falcon 9, Falcon Heavy,
  Starship, and Falcon 1 missions with vehicle, site, status, landing, mission descriptions,
  pre/post-launch timelines, and carousel photos.
size_categories:
- n<1K
task_categories:
- tabular-classification
tags:
- space
- spacex
- falcon-9
- falcon-heavy
- starship
- rocket-launch
- orbital-mechanics
- launch-history
- open-data
- tabular-data
- parquet
configs:
  - config_name: launches
    data_files:
      - split: train
        path: data/launches.parquet
    default: true
  - config_name: timelines
    data_files:
      - split: train
        path: data/timelines.parquet
  - config_name: carousel
    data_files:
      - split: train
        path: data/carousel.parquet
---
# SpaceX Launch History
<div align="center">
<img src="banner.jpg" alt="An orbital sunrise illuminates the Earth's atmosphere, seen from the ISS" width="400">
<p><em>Credit: NASA</em></p>
</div>
*Part of the [Satellites & Launches Datasets](https://huggingface.co/collections/juliensimon/satellites-launches-datasets-67b4e0f9418e9f467c5e0e67) collection on Hugging Face.*
Complete record of every SpaceX launch from spacex.com — **661** missions
(2006–2026), including mission descriptions, pre/post-launch timelines,
and photo galleries.
## Dataset description
This dataset captures the full content of every launch page on [spacex.com/launches](https://www.spacex.com/launches),
spanning from the first Falcon 1 test flights through the latest Starship missions.
It includes structured timeline data for each launch phase (countdown, ascent,
stage separation, landing, payload deployment), mission descriptions, webcast references,
and carousel imagery.
The data is organized into three tables that can be joined on the `slug` field:
- **launches** — one row per mission with all metadata, descriptions, and derived fields
- **timelines** — one row per countdown/deployment event across all missions
- **carousel** — one row per photo with captions and image paths
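
As a minimal sketch of the join described above, the snippet below merges small hand-made stand-ins for the three tables on `slug` with pandas. The rows, slugs, and captions are made up for illustration and are not actual dataset records; only the column names come from the schemas in this card.

```python
import pandas as pd

# Illustrative stand-ins for the three tables (made-up rows, real column names).
launches = pd.DataFrame({
    "slug": ["mission-a", "mission-b"],
    "vehicle": ["Falcon 9", "Falcon Heavy"],
})
timelines = pd.DataFrame({
    "slug": ["mission-a", "mission-a", "mission-b"],
    "phase": ["pre_launch", "post_launch", "post_launch"],
    "event_time": ["00:38:00", "00:01:12", "00:01:12"],
    "description": ["Propellant load begins", "Max Q", "Max Q"],
})
carousel = pd.DataFrame({
    "slug": ["mission-a"],
    "caption": ["Liftoff"],
})

# One row per timeline event, enriched with launch metadata.
# validate= raises if slug turns out not to be unique in launches.
events = timelines.merge(launches, on="slug", how="left", validate="many_to_one")

# Photo count per mission; missions without photos get 0.
photo_counts = carousel.groupby("slug").size().rename("n_photos").reset_index()
summary = launches.merge(photo_counts, on="slug", how="left")
summary["n_photos"] = summary["n_photos"].fillna(0).astype(int)

print(events[["slug", "vehicle", "phase", "description"]])
print(summary)
```

A left merge keeps every row of the left-hand table, which matters here because not every mission necessarily has timeline events or photos.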
SpaceX reports **636** total launches, **596** landings,
and **558** reflights as of the latest update.
## Schema — launches (661 rows)
| Column | Type | Description |
|--------|------|-------------|
| `id` | string | Internal SpaceX CMS (Strapi) identifier; opaque string, stable within the CMS |
| `document_id` | string | CMS document reference used for CMS versioning; opaque, not meaningful for analysis |
| `title` | string | Human-readable mission name from spacex.com (e.g. "Starlink Mission", "CRS-25", "Intuitive Machines-1") |
| `slug` | string | URL slug used as the primary key and join field across all three tables (e.g. "sl-10-22"); unique per mission |
| `mission_status` | string | Mission lifecycle state: "final" = completed and results confirmed, "upcoming" = not yet launched, "in-progress" = currently executing |
| `mission_type` | string | Mission category: "starlink" (Starlink constellation replenishment), "commercialSatellite" (third-party GEO/LEO satellite), "resupply" (ISS cargo), "nssl" (National Security Space Launch), "hsf" (human spaceflight), "rideshare" (SmallSat Rideshare), "science" (NASA/research payload), "starship" (Starship test flight) |
| `vehicle` | string | Launch vehicle variant: "Falcon 9", "Falcon Heavy", "Starship", or "Falcon 1" (retired) |
| `launch_site` | string | Launch complex and geographic location (e.g. "SLC-40, Florida", "LC-39A, Florida", "SLC-4E, California") |
| `launch_date` | date | UTC calendar date of launch (YYYY-MM-DD); null for upcoming missions without a confirmed date |
| `launch_time` | string | UTC launch time in HH:MM:SS format; null for unconfirmed upcoming launches |
| `return_site` | string | First-stage landing site (e.g. "LZ-1", "LZ-2", "JRTI" droneship, "OCISLY" droneship); null if no landing attempted or vehicle expended |
| `return_date_time` | string | Timestamp of first-stage return/landing (if available); null for expendable flights or when not recorded |
| `end_date` | date | Mission completion date (e.g. Dragon splashdown, satellite handoff); null if ongoing or not recorded |
| `end_time` | string | Mission completion time (UTC HH:MM:SS); null if not recorded |
| `direct_to_cell` | bool | True for Starlink Direct-to-Cell missions (satellites with cellular connectivity capability) |
| `is_live` | bool | True if the mission is currently streaming live; intended for real-time use; typically False in archived data |
| `description` | string | Full plaintext mission description sourced from spacex.com (HTML stripped); null for missions without a published description |
| `astronauts` | string | JSON array of crew member data (name, title, image) for crewed missions; null for uncrewed flights |
| `webcast_id` | string | Video identifier for the official launch webcast; null if no webcast was published |
| `webcast_platform` | string | Streaming platform hosting the webcast (e.g. "x.com", "youtube"); null if no webcast |
| `follow_dragon_enabled` | bool | True if real-time Dragon capsule tracking was available for this mission; null for non-Dragon missions |
| `launch_datetime` | datetime | Combined UTC launch datetime (date + time); null if launch_time is not available |
| `launch_year` | int | Calendar year of launch derived from launch_date; useful for time-series grouping; null if launch_date is null |
| `success` | bool | True if mission_status == "final" (mission completed); False for upcoming or in-progress; proxy for mission success |
| `has_landing` | bool | True if return_site is non-null, indicating a first-stage landing was recorded; does not distinguish success from failure |
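
The last four columns are derived from the base columns. The sketch below recomputes them from the descriptions in the table above on a small made-up frame; the dataset's own pipeline may differ in details, so treat it as an illustration rather than the actual implementation.

```python
import pandas as pd

# Made-up rows that follow the launches schema; not actual dataset records.
df = pd.DataFrame({
    "slug": ["mission-a", "mission-b", "mission-c"],
    "mission_status": ["final", "final", "upcoming"],
    "launch_date": ["2024-01-03", "2024-02-15", None],
    "launch_time": ["03:44:00", None, None],
    "return_site": ["JRTI", None, None],
})

date = pd.to_datetime(df["launch_date"])  # missing dates become NaT

# launch_datetime: date + time; null whenever the time is missing
df["launch_datetime"] = date + pd.to_timedelta(df["launch_time"])

# launch_year: nullable integer so that missing dates stay null
df["launch_year"] = date.dt.year.astype("Int64")

# success: only checks the lifecycle state, per the column description
df["success"] = df["mission_status"].eq("final")

# has_landing: a return site was recorded (says nothing about the outcome)
df["has_landing"] = df["return_site"].notna()

print(df[["slug", "launch_datetime", "launch_year", "success", "has_landing"]])
```

Because `success` is defined from `mission_status` alone, it marks a mission as completed rather than confirming its outcome, which is why the table calls it a proxy.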
## Schema — timelines (3,228 rows)
| Column | Type | Description |
|--------|------|-------------|
| `slug` | string | FK to launches.slug |
| `phase` | string | `pre_launch` or `post_launch` |
| `event_time` | string | Relative time (e.g. "00:01:12") |
| `description` | string | Event description (e.g. "Max Q") |
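
Since `event_time` is stored as a string, ordering events within a mission needs a numeric offset. The sketch below assumes that every value is an `HH:MM:SS` string and that `pre_launch` times count down to liftoff (so they are negated); neither assumption is confirmed by this card, and the rows are made up.

```python
import pandas as pd

# Made-up timeline rows; only the column names come from the schema.
tl = pd.DataFrame({
    "slug": ["mission-a"] * 3,
    "phase": ["post_launch", "pre_launch", "post_launch"],
    "event_time": ["00:02:27", "00:38:00", "00:01:12"],
    "description": ["Main engine cutoff", "Propellant load begins", "Max Q"],
})

offset = pd.to_timedelta(tl["event_time"])

# Assumption: pre_launch offsets are time remaining before liftoff.
sign = tl["phase"].map({"pre_launch": -1, "post_launch": 1})
tl["t_seconds"] = offset.dt.total_seconds() * sign

ordered = tl.sort_values(["slug", "t_seconds"]).reset_index(drop=True)
print(ordered[["phase", "event_time", "t_seconds", "description"]])
```

If some rows use a different time format, passing `errors="coerce"` to `pd.to_timedelta` turns them into nulls instead of raising.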
## Schema — carousel (292 rows)
| Column | Type | Description |
|--------|------|-------------|
| `slug` | string | FK to launches.slug |
| `caption` | string | Photo caption |
| `image_url` | string | Original CDN URL |
| `image_path` | string | Local path in dataset (e.g. `images/sl-10-22_0.jpg`) |
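
The example path above suggests a naming pattern of `images/<slug>_<index>.<ext>`. The sketch below assumes that pattern (it is inferred from the single example, not documented) and uses it to recover a per-mission photo index and to flag rows whose path does not match their `slug`. The rows are made up.

```python
import pandas as pd

# Made-up carousel rows; the first two reuse the slug from the example path.
photos = pd.DataFrame({
    "slug": ["sl-10-22", "sl-10-22", "mission-b"],
    "caption": ["Liftoff", "Landing", None],
    "image_path": [
        "images/sl-10-22_0.jpg",
        "images/sl-10-22_1.jpg",
        "images/mission-b_0.jpg",
    ],
})

# Assumed pattern: images/<slug>_<index>.<ext>
parts = photos["image_path"].str.extract(
    r"^images/(?P<path_slug>.+)_(?P<photo_index>\d+)\.\w+$"
)
photos["photo_index"] = parts["photo_index"].astype(int)

# Rows where the slug embedded in the path disagrees with the slug column.
mismatched = photos[parts["path_slug"] != photos["slug"]]

# Ordered list of image paths per mission.
per_mission = (
    photos.sort_values(["slug", "photo_index"])
    .groupby("slug")["image_path"]
    .apply(list)
)
print(per_mission)
```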
## Quick stats
- **661** total missions (655 completed, 5 upcoming)
- **Vehicles**: Falcon 9 (628), Starship (17), Falcon Heavy (11), Falcon 1 (5)
- **Top mission types**: starlink (381), commercialSatellite (116), resupply (39), nssl (37), hsf (22)
- **661** missions with landing data
- **640** missions with descriptions
- **3,228** timeline events across all missions
- **292** carousel photos
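
The figures above can be recomputed from the `launches` table. The sketch below shows the calls on a small made-up frame; running the same code on the loaded table should give the counts for whichever snapshot you have, which may differ from the numbers listed here because the dataset is updated daily.

```python
import pandas as pd

# Made-up rows with the relevant launches columns; not actual dataset records.
df = pd.DataFrame({
    "vehicle": ["Falcon 9", "Falcon 9", "Starship", "Falcon Heavy"],
    "mission_status": ["final", "final", "final", "upcoming"],
    "mission_type": ["starlink", "starlink", "starship", "nssl"],
    "description": ["text", None, "text", "text"],
    "has_landing": [True, True, False, False],
})

stats = {
    "total": len(df),
    "by_status": df["mission_status"].value_counts().to_dict(),
    "by_vehicle": df["vehicle"].value_counts().to_dict(),
    "top_types": df["mission_type"].value_counts().head(5).to_dict(),
    "with_description": int(df["description"].notna().sum()),
    "with_landing": int(df["has_landing"].sum()),
}

for name, value in stats.items():
    print(f"{name}: {value}")
```

The `by_status` breakdown is a quick way to check how the total splits across "final", "upcoming", and "in-progress".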
## Usage
```python
from datasets import load_dataset
# Load main launches table
launches = load_dataset("juliensimon/spacex-launches", "launches", split="train")
df = launches.to_pandas()
# Falcon 9 missions
f9 = df[df["vehicle"] == "Falcon 9"]
print(f"{len(f9):,} Falcon 9 launches")
# Launches by year
print(df.groupby("launch_year").size())
# Starlink missions
starlink = df[df["mission_type"] == "starlink"]
print(f"{len(starlink):,} Starlink missions")
# Load timelines and join
timelines = load_dataset("juliensimon/spacex-launches", "timelines", split="train")
tl = timelines.to_pandas()
# Get post-launch events for a specific mission
events = tl[(tl["slug"] == "sl-10-22") & (tl["phase"] == "post_launch")]
print(events[["event_time", "description"]])
# Load carousel
carousel = load_dataset("juliensimon/spacex-launches", "carousel", split="train")
photos = carousel.to_pandas()
print(f"{len(photos):,} photos across all missions")
```
## Data source
[spacex.com/launches](https://www.spacex.com/launches) — official SpaceX website.
Data sourced from the SpaceX content API (Strapi CMS).
## Update schedule
Daily incremental updates via GitHub Actions.
## Related datasets
- [launch-log](https://huggingface.co/datasets/juliensimon/space-launch-log) — McDowell launch log (all providers)
- [launch-cost](https://huggingface.co/datasets/juliensimon/launch-cost-to-leo) — Historical launch costs
- [launch-vehicles](https://huggingface.co/datasets/juliensimon/launch-vehicles) — Rocket specifications
- [starlink](https://huggingface.co/datasets/juliensimon/starlink-fleet-data) — Starlink constellation snapshots
## Pipeline
Source code: [juliensimon/space-datasets](https://github.com/juliensimon/space-datasets)
## Support
If you find this dataset useful, please give it a ❤️ on the [dataset page](https://huggingface.co/datasets/juliensimon/spacex-launches) and share feedback in the Community tab! Also consider giving a ⭐️ to the [space-datasets](https://github.com/juliensimon/space-datasets) repo.
## Citation
```bibtex
@dataset{spacex_launches,
author = {Simon, Julien},
title = {SpaceX Launch History},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/juliensimon/spacex-launches},
note = {Sourced from spacex.com}
}
```
## License
[CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) — Data sourced from spacex.com.
Provider:
juliensimon
Collected summary
Dataset introduction

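Construction
In commercial spaceflight, datasets are often built from official releases by authoritative organizations. This dataset is retrieved from the SpaceX website's API through an automated pipeline and covers every mission record from the first launch to the present. After cleaning and structuring, the data is organized into three linked tables: a core launches table, a mission timeline events table, and a mission photo carousel table. The tables can be joined and merged on the unique `slug` identifier. This approach keeps the data complete, current, and traceable, and gives launch analysis a solid basis.
Features
The dataset is notable for its broad coverage and finely structured design, and it records the full launch history of every SpaceX launch vehicle model. Beyond basic metadata such as launch date, vehicle, and launch site, it integrates mission timelines, photo galleries, mission status, and landing information. The data is stored in Parquet format, which offers fast reads and good compression and suits tabular classification tasks. The dataset is also updated daily by an automated pipeline, which keeps it current and makes it a useful resource for studying launch cadence, the evolution of mission types, and the development of reusable-rocket technology.
Usage
To analyze the dataset, first load it with the Hugging Face `datasets` library. Users can load only the core launches table, or also load the timelines and carousel tables for joined analysis. The data can be converted to a pandas DataFrame for filtering, aggregation, and visualization. For example, a researcher can filter launches for a particular vehicle, or count launches per year to examine growth trends. For deeper analysis, the timeline data can be used to study the sequence of pre-launch and post-launch events for a given mission and so follow the detailed flow of mission execution.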
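Background and challenges
Background overview
The SpaceX Launch History dataset was compiled and published by Julien Simon in 2026 as part of the Satellites & Launches Datasets collection. It aims to record every SpaceX launch systematically, from the first Falcon 1 test flights onward. The dataset is sourced from the official SpaceX content API and covers the launch history of Falcon 1, Falcon 9, Falcon Heavy, and Starship, with mission descriptions, timeline events, and imagery. Its central research question is how structured data can reveal the way reusable rocket technology reshapes the economics of spaceflight, and it offers an empirical basis for analyzing the reliability of launch systems, launch cadence, and mission classification. As an open-science resource, it supports quantitative research in aerospace engineering, operations research, and commercial spaceflight, and can serve as a reference point for assessing the performance and evolution of modern launch systems.
Current challenges
The dataset targets the domain challenges of mission classification and performance prediction: automatically identifying mission patterns in high-dimensional, heterogeneous launch records, and using historical data to assess how rocket reuse affects launch cost and success rate. The construction challenges lie mainly in data integration and quality assurance. The raw data comes from an official API that changes continuously, so the pipeline has to handle inconsistent states across mission stages (in progress, completed, planned) and cope with high missing rates in some fields, such as return time and mission end time. It also has to convert unstructured mission descriptions, timeline events, and multimedia content into linkable tables, and keep cross-table joins (for example on the `slug` field) complete and accurate so that they support robust statistical analysis.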
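Common scenarios
Typical use cases
In aerospace engineering and commercial spaceflight, the SpaceX launch history dataset gives researchers a detailed launch record that covers every mission from Falcon 1 to Starship. Its most typical use is classification on tabular data, for example automatically classifying launches by mission type, launch vehicle, or launch site. By combining launch timelines, photo galleries, and other information, researchers can analyze regularities in launch sequences and identify the characteristics of different mission patterns, which provides data support for mission planning and risk assessment.
Academic problems addressed
The dataset helps with key questions in space economics and reusable launch technology research. It records detailed metadata for each launch, including first-stage return sites and mission status. It does not contain cost figures, but combined with separate launch-cost data it lets scholars quantify the cost reductions SpaceX achieves through rocket reuse. This provides an empirical basis for studying business-model innovation in commercial spaceflight and the balance between launch cadence and reliability, in support of research in space systems engineering and operations management.
Derived related work
Work derived from this dataset is concentrated in space data science. Examples include combining it with the launch-cost dataset for economic modeling, to analyze how reusable rockets lower the unit price of launches to low Earth orbit, and integrating it with the Starlink fleet data to study the relationship between large-constellation deployment and launch cadence. Scholars have also used the timeline event data to build knowledge graphs of the launch process and to develop predictive models that optimize mission scheduling, adding to the methods available in space informatics and operations research.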
The content above was collected and summarized by 遇见数据集.



