juliensimon/comet-catalog

Source: Hugging Face. Updated 2026-04-01; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/juliensimon/comet-catalog
Resource description:
---
license: cc0-1.0
pretty_name: "Comet Catalog"
language:
  - en
description: >-
  Catalog of comets sourced from Wikidata, including orbital parameters,
  discovery dates, discoverers, and named-after information. 1,278 comets
  with orbital mechanics data.
size_categories:
  - 1K<n<10K
task_categories:
  - tabular-classification
tags:
  - space
  - comets
  - orbital-mechanics
  - wikidata
  - open-data
  - tabular-data
  - parquet
configs:
  - config_name: default
    default: true
    data_files:
      - split: train
        path: data/comets.parquet
---

# Comet Catalog

*Part of the [Orbital Mechanics Datasets](https://huggingface.co/collections/juliensimon/orbital-mechanics-datasets-69c24caca4ab3934c9856994) collection on Hugging Face.*

Catalog of **1,278** comets sourced from [Wikidata](https://www.wikidata.org/), covering orbital parameters, discovery history, and naming origins.

## Dataset description

Comets are small icy bodies that develop a coma and tails when approaching the Sun. They originate in the Kuiper Belt and Oort Cloud and follow highly eccentric orbits; the population ranges from short-period comets (period < 200 years) to long-period and hyperbolic visitors.

This dataset aggregates structured comet data from Wikidata's SPARQL endpoint, capturing orbital mechanics (period, perihelion distance, eccentricity, inclination), discovery metadata (date, discoverer), and cultural information (named-after entities). It spans historically significant comets such as Halley's Comet and Hale-Bopp as well as recently discovered objects.

The data enables studies of comet population statistics, orbital dynamics, discovery rate trends over time, and the history of comet observation and naming conventions.

## Schema

| Column | Type | Description |
|--------|------|-------------|
| `wikidata_id` | string | Wikidata entity ID (e.g. Q1390) |
| `name` | string | Comet name or designation |
| `discovery_date` | string | Date of discovery (YYYY-MM-DD) |
| `discoverer` | string | Name of discoverer(s) |
| `orbital_period_yr` | float | Orbital period (years) |
| `perihelion_au` | float | Perihelion distance (AU) |
| `eccentricity` | float | Orbital eccentricity |
| `inclination_deg` | float | Orbital inclination (degrees) |
| `named_after` | string | Entity the comet is named after |
| `epoch` | string | Orbital element epoch (YYYY-MM-DD) |
| `discovery_year` | int | Year of discovery (derived) |

## Quick stats

- **1,278** comets in catalog
- **813** with orbital period data
- **1,225** with perihelion distance
- **1,227** with eccentricity
- **728** with named discoverer
- Discovery years: -42 to 2026
- Top discoverers: Lincoln Near-Earth Asteroid Research (55), Pan-STARRS (28), Jean-Louis Pons (25), Andrea Boattini (22), Near-Earth Asteroid Tracking (20)

## Usage

```python
from datasets import load_dataset

ds = load_dataset("juliensimon/comet-catalog", split="train")
df = ds.to_pandas()

# Note: comparisons against missing values evaluate to False, so each
# filter below already excludes rows where the tested column is empty.

# Short-period comets (period < 200 years)
short_period = df[df["orbital_period_yr"] < 200]
print(f"{len(short_period):,} short-period comets")

# Most recently discovered comets
recent = df.dropna(subset=["discovery_year"]).nlargest(10, "discovery_year")
print(recent[["name", "discovery_year", "discoverer"]])

# Highly eccentric comets (near-parabolic or hyperbolic)
high_ecc = df[df["eccentricity"] >= 0.99]
print(high_ecc[["name", "eccentricity", "orbital_period_yr"]])

# Comets that pass close to the Sun (perihelion < 0.3 AU)
inner = df[df["perihelion_au"] < 0.3]
print(f"{len(inner):,} comets with perihelion < 0.3 AU")
```

## Data source

[Wikidata](https://www.wikidata.org/) SPARQL endpoint. Comets are identified via P31 (instance of) / P279* (subclass of) traversal from Q3559 (comet).
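
The traversal described above can be sketched as a SPARQL query sent to the public Wikidata Query Service. This is a minimal illustration, not the pipeline's actual query: the function names (`build_comet_query`, `parse_bindings`, `fetch_comets`), the selected variables, and the User-Agent string are assumptions made for this example. Only the P31/P279* path and Q3559 come from the description above.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_comet_query(limit=None):
    """Build a SPARQL query for items that are instances of comet (Q3559)
    or of any of its subclasses. Hypothetical helper for illustration."""
    lines = [
        "SELECT ?comet ?cometLabel WHERE {",
        "  ?comet wdt:P31/wdt:P279* wd:Q3559 .",
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }',
        "}",
    ]
    if limit is not None:
        lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


def parse_bindings(payload):
    """Turn a SPARQL JSON result document into rows keyed like the schema."""
    rows = []
    for binding in payload["results"]["bindings"]:
        entity_uri = binding["comet"]["value"]
        rows.append(
            {
                # The entity ID is the last path segment of the entity URI.
                "wikidata_id": entity_uri.rsplit("/", 1)[-1],
                "name": binding.get("cometLabel", {}).get("value"),
            }
        )
    return rows


def fetch_comets(limit=10):
    """Run the query over HTTP (requires network access)."""
    params = urllib.parse.urlencode(
        {"query": build_comet_query(limit), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={
            "User-Agent": "comet-catalog-example/0.1",  # assumed value
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_comets(limit=10)` would return a list of dictionaries with `wikidata_id` and `name` keys. Orbital and discovery properties are left out of this sketch because the property IDs used by the pipeline are not stated here.
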
Data is community-curated by [WikiProject Astronomy](https://www.wikidata.org/wiki/Wikidata:WikiProject_Astronomy).

## Update schedule

Quarterly (January, April, July, October). Run `python scripts/update-comets.py` manually to refresh.

## Related datasets

- [mpc-comet-elements](https://huggingface.co/datasets/juliensimon/mpc-comet-elements) -- MPC orbital elements for comets
- [jpl-small-body-database](https://huggingface.co/datasets/juliensimon/jpl-small-body-database) -- JPL small body orbital data
- [fireball-bolide-events](https://huggingface.co/datasets/juliensimon/fireball-bolide-events) -- Fireball and bolide events

## Pipeline

Source code: [juliensimon/space-datasets](https://github.com/juliensimon/space-datasets)

## Support

If you find this dataset useful, please give it a ❤️ on the [dataset page](https://huggingface.co/datasets/juliensimon/comet-catalog) and share feedback in the Community tab! Also consider giving a ⭐️ to the [space-datasets](https://github.com/juliensimon/space-datasets) repo.

## Citation

```bibtex
@dataset{comet_catalog,
  author = {Simon, Julien},
  title = {Comet Catalog},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/datasets/juliensimon/comet-catalog},
  note = {Sourced from Wikidata (CC0)}
}
```

## License

[CC0-1.0](https://creativecommons.org/publicdomain/zero/1.0/) (Wikidata content is public domain)
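
## Worked example: period from perihelion and eccentricity

The quick stats list more comets with perihelion distance and eccentricity than with an orbital period, so it can be useful to estimate a missing period from the other two columns. For a bound heliocentric orbit (eccentricity below 1), the semi-major axis is a = q / (1 - e), and Kepler's third law gives a period of roughly a^1.5 years when a is in AU. The sketch below assumes a pure two-body orbit and ignores planetary perturbations. The function name `estimate_period_yr` is hypothetical and not part of the dataset or its pipeline, and the input values are illustrative, not taken from the catalog.

```python
import math


def estimate_period_yr(perihelion_au, eccentricity):
    """Two-body estimate of the orbital period in years.

    Returns None when an input is missing or when the orbit is not a
    closed ellipse (eccentricity >= 1 has no period).
    """
    if perihelion_au is None or eccentricity is None:
        return None
    if math.isnan(perihelion_au) or math.isnan(eccentricity):
        return None
    if perihelion_au <= 0 or not (0 <= eccentricity < 1):
        return None
    # Semi-major axis from perihelion distance: a = q / (1 - e)
    semi_major_axis_au = perihelion_au / (1.0 - eccentricity)
    # Kepler's third law for a body orbiting the Sun: P[yr] = a[AU] ** 1.5
    return semi_major_axis_au ** 1.5


# Illustrative inputs: q = 1.0 AU, e = 0.75 gives a = 4 AU
print(estimate_period_yr(1.0, 0.75))
# A hyperbolic orbit (e > 1) has no period
print(estimate_period_yr(0.5, 1.2))
```

With the DataFrame from the Usage section, the same function could be applied row by row, for example `[estimate_period_yr(q, e) for q, e in zip(df["perihelion_au"], df["eccentricity"])]`, and compared against `orbital_period_yr` where both are available.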
Provider:
juliensimon