juliensimon/comet-catalog
Source: Hugging Face. Updated 2026-04-01. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/juliensimon/comet-catalog
---
license: cc0-1.0
pretty_name: "Comet Catalog"
language:
- en
description: >-
  Catalog of comets sourced from Wikidata, including orbital parameters,
  discovery dates, discoverers, and named-after information.
  1,278 comets with orbital mechanics data.
size_categories:
- 1K<n<10K
task_categories:
- tabular-classification
tags:
- space
- comets
- orbital-mechanics
- wikidata
- open-data
- tabular-data
- parquet
configs:
- config_name: default
  default: true
  data_files:
  - split: train
    path: data/comets.parquet
---
# Comet Catalog
*Part of the [Orbital Mechanics Datasets](https://huggingface.co/collections/juliensimon/orbital-mechanics-datasets-69c24caca4ab3934c9856994) collection on Hugging Face.*
Catalog of **1,278** comets sourced from [Wikidata](https://www.wikidata.org/), covering
orbital parameters, discovery history, and naming origins.
## Dataset description
Comets are small icy bodies that develop a coma and tails when approaching the Sun.
They originate in the Kuiper Belt and Oort Cloud and follow highly eccentric orbits,
ranging from short-period orbits (< 200 years) to long-period and hyperbolic trajectories.

This dataset aggregates structured comet data from Wikidata's SPARQL endpoint, capturing
orbital mechanics (period, perihelion distance, eccentricity, inclination), discovery
metadata (date, discoverer), and cultural information (named-after entities). Its coverage
runs from historically significant comets such as Halley's Comet and Hale-Bopp to recently
discovered objects.

The data supports studies of comet population statistics, orbital dynamics, discovery
rate trends over time, and the history of comet observation and naming conventions.
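
The orbital columns are related to each other through standard two-body formulas, which can be handy for sanity-checking rows. The sketch below is not part of the dataset pipeline. It assumes a heliocentric orbit with negligible comet mass, so that the semi-major axis is `a = q / (1 - e)` and Kepler's third law reduces to `P = a ** 1.5` (years, with `a` in AU). The function names are hypothetical, and the Halley values are approximate and used only for illustration.

```python
def semi_major_axis_au(perihelion_au: float, eccentricity: float) -> float:
    """Semi-major axis a = q / (1 - e), defined here only for closed orbits."""
    if not 0 <= eccentricity < 1:
        raise ValueError("a finite period requires 0 <= e < 1")
    return perihelion_au / (1.0 - eccentricity)


def period_years(perihelion_au: float, eccentricity: float) -> float:
    """Kepler's third law for a body orbiting the Sun: P [yr] = a [AU] ** 1.5."""
    return semi_major_axis_au(perihelion_au, eccentricity) ** 1.5


def orbit_class(eccentricity: float) -> str:
    """Classify the conic section from the eccentricity alone."""
    if eccentricity < 1:
        return "elliptical"
    if eccentricity == 1:
        return "parabolic"
    return "hyperbolic"


# Approximate elements for Halley's Comet (illustrative, not read from the dataset)
halley_period = period_years(perihelion_au=0.586, eccentricity=0.967)
print(f"Estimated period: {halley_period:.1f} years ({orbit_class(0.967)})")
```

Comparing such an estimate against `orbital_period_yr` is one way to spot rows whose elements disagree with each other. Rows with `eccentricity >= 1` have no finite period, which may explain part of the gap between the period and eccentricity counts.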
## Schema
| Column | Type | Description |
|--------|------|-------------|
| `wikidata_id` | string | Wikidata entity ID (e.g. Q1390) |
| `name` | string | Comet name or designation |
| `discovery_date` | string | Date of discovery (YYYY-MM-DD) |
| `discoverer` | string | Name of discoverer(s) |
| `orbital_period_yr` | float | Orbital period (years) |
| `perihelion_au` | float | Perihelion distance (AU) |
| `eccentricity` | float | Orbital eccentricity |
| `inclination_deg` | float | Orbital inclination (degrees) |
| `named_after` | string | Entity the comet is named after |
| `epoch` | string | Orbital element epoch (YYYY-MM-DD) |
| `discovery_year` | int | Year of discovery (derived) |
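
Because `discovery_date` and `epoch` are stored as strings, date arithmetic needs an explicit conversion. The following is a minimal sketch on a hypothetical three-row frame that mimics the schema (the rows are made up, not taken from the dataset). It assumes the `YYYY-MM-DD` format stated in the table.

```python
import pandas as pd

# Hypothetical rows shaped like the schema; values are illustrative only
sample = pd.DataFrame(
    {
        "name": ["Example A", "Example B", "Example C"],
        "discovery_date": ["1995-07-23", None, "not-a-date"],
    }
)

# errors="coerce" turns missing or unparseable strings into NaT instead of raising
sample["discovery_dt"] = pd.to_datetime(
    sample["discovery_date"], format="%Y-%m-%d", errors="coerce"
)

print(sample[["name", "discovery_dt"]])
print(f"{sample['discovery_dt'].notna().sum()} of {len(sample)} dates parsed")
```

Depending on the pandas version, dates outside the range a timestamp can represent may also come back as `NaT`, so `discovery_year` is likely the safer field for ancient comets.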
## Quick stats
- **1,278** comets in catalog
- **813** with orbital period data
- **1,225** with perihelion distance
- **1,227** with eccentricity
- **728** with named discoverer
- Discovery years: -42 to 2026
- Top discoverers: Lincoln Near-Earth Asteroid Research (55), Pan-STARRS (28), Jean-Louis Pons (25), Andrea Boattini (22), Near-Earth Asteroid Tracking (20)
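
Figures like the ones above can be recomputed after each refresh with a few pandas calls. This is a sketch of one way to do it, not necessarily how the published numbers were produced. The `coverage` helper and the toy frame are hypothetical; only the column names come from the schema.

```python
import pandas as pd


def coverage(df: pd.DataFrame, columns: list) -> pd.Series:
    """Count non-null values per column."""
    return df[columns].notna().sum()


# Toy frame with the same column names as the dataset; values are made up
toy = pd.DataFrame(
    {
        "orbital_period_yr": [76.0, None, 6.4, None],
        "perihelion_au": [0.59, 1.2, None, 0.9],
        "discoverer": ["Survey X", None, "Survey X", "Observer Y"],
        "discovery_year": [1900, 2001, 2010, None],
    }
)

counts = coverage(toy, ["orbital_period_yr", "perihelion_au", "discoverer"])
top = toy["discoverer"].value_counts().head(5)
year_range = (int(toy["discovery_year"].min()), int(toy["discovery_year"].max()))

print(counts)
print(top)
print(f"Discovery years: {year_range[0]} to {year_range[1]}")
```

On the real data, pass the `df` from the Usage section in place of `toy`.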
## Usage
```python
from datasets import load_dataset

ds = load_dataset("juliensimon/comet-catalog", split="train")
df = ds.to_pandas()

# Short-period comets (period < 200 years); rows with a missing period fail the comparison
short_period = df[df["orbital_period_yr"] < 200]
print(f"{len(short_period):,} short-period comets")

# Most recently discovered comets
recent = df.dropna(subset=["discovery_year"]).nlargest(10, "discovery_year")
print(recent[["name", "discovery_year", "discoverer"]])

# Highly eccentric comets (near-parabolic or hyperbolic)
high_ecc = df[df["eccentricity"] >= 0.99]
print(high_ecc[["name", "eccentricity", "orbital_period_yr"]])

# Comets with a small perihelion distance (true sungrazers pass far closer than 0.3 AU)
inner = df[df["perihelion_au"] < 0.3]
print(f"{len(inner):,} comets with perihelion < 0.3 AU")
```
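
Discovery-rate trends, mentioned in the dataset description, can be explored by binning `discovery_year` into decades. The helper below is a hypothetical sketch shown on a toy frame so it runs without downloading anything.

```python
import pandas as pd


def discoveries_per_decade(df: pd.DataFrame) -> pd.Series:
    """Count comets per decade of discovery, skipping rows without a year."""
    years = df["discovery_year"].dropna().astype(int)
    # Floor division also handles negative years: -42 falls in the -50 bin
    decades = (years // 10) * 10
    return decades.value_counts().sort_index()


# Toy input; on the real data, call discoveries_per_decade(df) instead
toy = pd.DataFrame({"discovery_year": [1995, 1996, 2004, None, -42]})
per_decade = discoveries_per_decade(toy)
print(per_decade)
```

Counts for recent decades are likely to be dominated by automated surveys, so a rising curve reflects observing capability as much as the comet population itself.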
## Data source
[Wikidata](https://www.wikidata.org/) SPARQL endpoint. Comets identified via
P31 (instance of) / P279* (subclass of) traversal from Q3559 (comet).
Data is community-curated by [WikiProject Astronomy](https://www.wikidata.org/wiki/Wikidata:WikiProject_Astronomy).
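
The traversal described above can be sketched as a SPARQL query. The pipeline's actual query is not reproduced here, so treat this as an assumption-laden illustration: only the `P31`/`P279*` path and `Q3559` come from this card, while the optional properties (`P575` for time of discovery, `P61` for discoverer) and the `build_comet_query` helper are assumptions.

```python
def build_comet_query(limit: int = 10) -> str:
    """Return a SPARQL query selecting comets plus two optional properties."""
    return f"""
SELECT ?comet ?cometLabel ?discovered ?discovererLabel WHERE {{
  ?comet wdt:P31/wdt:P279* wd:Q3559 .           # instance of / subclass of comet
  OPTIONAL {{ ?comet wdt:P575 ?discovered . }}   # time of discovery (assumed property)
  OPTIONAL {{ ?comet wdt:P61 ?discoverer . }}    # discoverer (assumed property)
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


query = build_comet_query(limit=5)
print(query)
```

The printed query can be pasted into the Wikidata Query Service to inspect a handful of rows before running a full refresh.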
## Update schedule
Quarterly (January, April, July, October). Run `python scripts/update-comets.py` manually to refresh.
## Related datasets
- [mpc-comet-elements](https://huggingface.co/datasets/juliensimon/mpc-comet-elements) -- MPC orbital elements for comets
- [jpl-small-body-database](https://huggingface.co/datasets/juliensimon/jpl-small-body-database) -- JPL small body orbital data
- [fireball-bolide-events](https://huggingface.co/datasets/juliensimon/fireball-bolide-events) -- Fireball and bolide events
## Pipeline
Source code: [juliensimon/space-datasets](https://github.com/juliensimon/space-datasets)
## Support
If you find this dataset useful, please give it a ❤️ on the [dataset page](https://huggingface.co/datasets/juliensimon/comet-catalog) and share feedback in the Community tab! Also consider giving a ⭐️ to the [space-datasets](https://github.com/juliensimon/space-datasets) repo.
## Citation
```bibtex
@dataset{comet_catalog,
author = {Simon, Julien},
title = {Comet Catalog},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/juliensimon/comet-catalog},
note = {Sourced from Wikidata (CC0)}
}
```
## License
[CC0-1.0](https://creativecommons.org/publicdomain/zero/1.0/) (Wikidata content is public domain)
Provider:
juliensimon



