
juliensimon/kepler-observations

Source: Hugging Face · Updated 2026-04-18 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/juliensimon/kepler-observations
Description:
---
license: cc-by-4.0
pretty_name: "Kepler Observation Catalog"
language:
- en
description: "The Kepler Observation Catalog indexes every target observed by NASA's Kepler space telescope during its prime mission (2009–2013), drawn from the Mikulski Archive for Space Telescopes (MAST). Kepler "
task_categories:
- tabular-classification
tags:
- space
- kepler
- k2
- nasa
- exoplanets
- astronomy
- telescope
- photometry
- open-data
- tabular-data
- parquet
size_categories:
- 100K<n<1M
configs:
- config_name: default
  data_files:
  - split: train
    path: data/kepler_observations.parquet
  default: true
---

# Kepler Observation Catalog

<div align="center">
<img src="banner.jpg" alt="Artist concept of NASA's Kepler space telescope in orbit, surrounded by a starfield" width="400">
<p><em>Credit: NASA/Ames/JPL-Caltech</em></p>
</div>

*Part of a [dataset collection](https://huggingface.co/collections/juliensimon/astronomy-datasets-69c24caf2f17e36128946743) on Hugging Face.*

## Dataset description

The Kepler Observation Catalog indexes every target observed by NASA's Kepler space telescope during its prime mission (2009–2013), drawn from the Mikulski Archive for Space Telescopes (MAST). Kepler is the most successful exoplanet-hunting mission in history: by continuously monitoring ~200,000 stars in a single 100-square-degree field of view in Cygnus–Lyra, it discovered the majority of confirmed exoplanets through high-precision photometric transit detection, including the first Earth-sized planets in habitable zones.
Each row in this catalog is one observation of a Kepler target, identified by its Kepler Input Catalog (KIC) ID (zero-padded to 9 digits in `obs_id`). A row records the cadence at which the target was observed, the pointing (RA/Dec), and a 17-character bitmask indicating which of Kepler's 17 quarterly observing periods contain data for that target. Long cadence is a 29.4-minute integration, capable of catching transits on weeks-to-months orbital periods; short cadence is a 58.9-second integration, used for asteroseismology and short-period transits. The `quarters_observed` column summarises the mask as an integer count: a target observed in all 17 quarters has the longest, most exoplanet-favourable light curve in the archive.

This dataset is designed for cross-matching with other exoplanet catalogs (Kepler confirmed planets, TESS TOI, Gaia DR3), for selecting targets with long baselines for long-period planet searches, and for understanding the Kepler field's completeness. It complements the Kepler eclipsing binary and transit timing variation catalogs already in this collection by providing the full target list. Each target's raw and de-trended light curves can be retrieved from MAST using the `obs_id`.

The catalog is derived from MAST's CAOM table `dbo.caomobservation` (collection = 'KEPLER'). The K2 extended mission (2014–2018) uses a different observation-id schema and is published as a separate dataset (planned). The Kepler prime-mission archive is essentially static; the dataset is nevertheless refreshed quarterly to pick up any late-stage reprocessing.

This dataset is suitable for **tabular classification** tasks.

## Schema

| Column | Type | Description | Sample | Null % |
|--------|------|-------------|--------|--------|
| `obs_id` | string | MAST observation identifier (e.g., 'kplr000757076_lc_Q111111111111111111'); encodes KIC ID, cadence (lc=long, sc=short), and Q-flags for each of the 17 Kepler quarters (1=observed, 0=not). Primary key. | kplr000757076_lc_Q111111111111111111 | 0.0% |
| `intent` | string | Observation intent: 'science' (target star monitoring) or 'calibration' | science | 0.0% |
| `target_ra` | float64 | Target right ascension in decimal degrees (ICRS). Kepler observed a fixed ~100 sq. deg. field near RA 290°, Dec 45° in Cygnus-Lyra. | 291.03872 | 0.0% |
| `target_dec` | float64 | Target declination in decimal degrees (ICRS) | 36.59813 | 0.0% |
| `kic_id` | Int64 | Kepler Input Catalog identifier for the target star (integer; zero-padded to 9 digits in `obs_id`); shared with the NASA Exoplanet Archive | 757076 | 0.8% |
| `cadence` | string | Cadence type: 'lc' (long cadence, 29.4-minute integration) or 'sc' (short cadence, 58.9-second integration) | lc | 0.8% |
| `quarters_mask` | string | 17-character string of '1'/'0' flags marking which Kepler quarters contain data for this target (Q1–Q17, in order) | 111111111111111111 | 0.8% |
| `quarters_observed` | Int64 | Number of Kepler quarters (out of 17) in which the target was observed; higher = longer light-curve baseline | 18 | 0.0% |

The sample `obs_id` and `quarters_mask` values above appear to carry 18 flags, and the sample `quarters_observed` is 18, while the column descriptions refer to 17 quarters. Check the mask length in the data before hard-coding either number.

## Quick stats

- **212,993** Kepler prime-mission observations (2009–2013)
- **207,656** long cadence (29.4 min), **3,724** short cadence (58.9 s)
- **82,915** targets observed in all 17 Kepler quarters (maximum baseline)
- **207,656** distinct Kepler Input Catalog (KIC) targets

## Usage

```python
from datasets import load_dataset

ds = load_dataset("juliensimon/kepler-observations", split="train")
df = ds.to_pandas()
```

```python
from datasets import load_dataset
import matplotlib.pyplot as plt
import numpy as np

ds = load_dataset("juliensimon/kepler-observations", split="train")
df = ds.to_pandas()

# Long-cadence targets with the longest baseline (observed in every quarter).
# The maximum is read from the data rather than hard-coded as 17.
max_quarters = df["quarters_observed"].max()
full_baseline = df[(df["quarters_observed"] == max_quarters) & (df["cadence"] == "lc")]
print(f"Targets observed across the full Kepler prime mission: {len(full_baseline):,}")

# Map of the Kepler field (random sample of at most 50,000 rows)
sample = df.sample(min(50000, len(df)))
plt.figure(figsize=(10, 8))
plt.scatter(sample["target_ra"], sample["target_dec"], s=0.2, alpha=0.3)
plt.xlabel("RA (deg)")
plt.ylabel("Dec (deg)")
plt.title("Kepler prime-mission field of view (50K sample)")
plt.gca().invert_xaxis()  # astronomical convention: RA increases to the left
plt.show()

# Target count per quarter. Rows without a parsed mask are skipped, and the
# number of bars follows the mask length found in the data.
masks = df["quarters_mask"].dropna()
mask_chars = np.array([list(m) for m in masks])
per_quarter = (mask_chars == "1").sum(axis=0)
plt.bar(range(1, len(per_quarter) + 1), per_quarter)
plt.xlabel("Quarter flag position in mask")
plt.ylabel("Targets observed")
plt.title("Kepler target count per quarter")
plt.show()
```

## Data source

https://archive.stsci.edu/missions-and-data/kepler

## Update schedule

Quarterly (1st of Jan/Apr/Jul/Oct at 14:00 UTC) via [GitHub Actions](https://github.com/juliensimon/space-datasets).

## Related datasets

- [juliensimon/kepler-eclipsing-binaries](https://huggingface.co/datasets/juliensimon/kepler-eclipsing-binaries)
- [juliensimon/kepler-transit-timing](https://huggingface.co/datasets/juliensimon/kepler-transit-timing)
- [juliensimon/nasa-exoplanets](https://huggingface.co/datasets/juliensimon/nasa-exoplanets)
- [juliensimon/tess-toi-candidates](https://huggingface.co/datasets/juliensimon/tess-toi-candidates)
- [juliensimon/hst-observations](https://huggingface.co/datasets/juliensimon/hst-observations)
- [juliensimon/jwst-observations](https://huggingface.co/datasets/juliensimon/jwst-observations)

> If you find this dataset useful, please consider [giving it a like](https://huggingface.co/datasets/juliensimon/kepler-observations) on Hugging Face. It helps others discover it.

## About the author

Created by [Julien Simon](https://julien.org), AI Operating Partner at Fortino Capital. Part of the [Space Datasets](https://julien.org/datasets) collection.

## Citation

```bibtex
@dataset{kepler_observations,
  title = {Kepler Observation Catalog},
  author = {juliensimon},
  year = {2026},
  url = {https://huggingface.co/datasets/juliensimon/kepler-observations},
  publisher = {Hugging Face}
}
```

## License

[CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/)
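
## Example: decoding `obs_id`

The schema says `obs_id` encodes the KIC ID, the cadence, and the per-quarter flags. The sketch below shows one way those pieces might be recovered. The regular expression is inferred only from the single example in the schema (`kplr` + 9 digits, `_lc_` or `_sc_`, then `Q` followed by flags), so it is an assumption rather than an official MAST grammar. `parse_obs_id` and `ParsedObsId` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred from the schema's example identifier only.
# The flag run is left variable-length rather than fixed at 17 characters.
OBS_ID_RE = re.compile(r"^kplr(?P<kic>\d{9})_(?P<cadence>lc|sc)_Q(?P<mask>[01]+)$")


class ParsedObsId(NamedTuple):
    kic_id: int             # KIC identifier with zero padding removed
    cadence: str            # "lc" (long cadence) or "sc" (short cadence)
    quarters_mask: str      # raw string of '1'/'0' flags
    quarters_observed: int  # number of '1' flags in the mask


def parse_obs_id(obs_id: str) -> Optional[ParsedObsId]:
    """Split an obs_id into its parts; return None if it does not match."""
    match = OBS_ID_RE.match(obs_id)
    if match is None:
        return None
    mask = match.group("mask")
    return ParsedObsId(
        kic_id=int(match.group("kic")),
        cadence=match.group("cadence"),
        quarters_mask=mask,
        quarters_observed=mask.count("1"),
    )


if __name__ == "__main__":
    print(parse_obs_id("kplr000757076_lc_Q111111111111111111"))
```

Returning `None` for identifiers that do not match keeps the helper usable with `df["obs_id"].map(parse_obs_id)`, leaving unmatched rows empty instead of raising.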
Provider:
juliensimon