juliensimon/henry-draper-catalog
Source: Hugging Face | Updated: 2026-05-26 | Indexed: 2026-05-31
Download link: https://hf-mirror.com/datasets/juliensimon/henry-draper-catalog
Resource description:
---
license: other
license_name: vizier-scientific-use
license_link: https://cds.unistra.fr/vizier-org/licences_vizier.html
pretty_name: "Henry Draper Catalogue"
language:
- en
description: "The Henry Draper Catalogue (HD) is the foundational reference for stellar spectral classification, containing 272,150 stars with spectral types assigned by Annie Jump Cannon at the Harvard College Obs"
task_categories:
- tabular-classification
tags:
- space
- stellar
- spectral-classification
- astronomy
- henry-draper
- stellar-catalog
- open-data
- tabular-data
- parquet
size_categories:
- 100K<n<1M
configs:
- config_name: default
  data_files:
  - split: train
    path: data/henry_draper.parquet
  default: true
---
# Henry Draper Catalogue
<div align="center">
<img src="banner.jpg" alt="Dense stellar field — the stars classified in the Henry Draper Catalogue" width="400">
<p><em>Credit: NASA/ESA/Hubble Heritage</em></p>
</div>
*Part of a [dataset collection](https://huggingface.co/collections/juliensimon/stellar-catalogs-69c792b1a52ab2757b0eaa57) on Hugging Face.*
## Dataset description
The Henry Draper Catalogue (HD) is the foundational reference for stellar spectral classification. Together with the Henry Draper Extension, it contains 272,150 stars with spectral types assigned by Annie Jump Cannon at the Harvard College Observatory. The main catalogue (HD 1–225300) was published in the Harvard Annals between 1918 and 1924, and the Extension followed between 1925 and 1936. The catalogue made the OBAFGKM spectral sequence, which remains in use today, the standard.
Annie Jump Cannon classified each star visually by examining the pattern of absorption lines on photographic plates, working at a rate of about 300 stars per hour. Her one-dimensional spectral sequence (O, B, A, F, G, K, M, plus R, N, S for carbon and zirconium-rich stars) encodes stellar surface temperature: O stars are the hottest (>30,000 K, ionised helium) and M stars the coolest (<3,500 K, molecular bands). The HD number (HD 1–272150) became the universal stellar identifier used in astrophysical literature throughout the 20th century and remains widely cited today.
Spectral types follow the one-dimensional Harvard system: a class letter followed by a subdivision (e.g., B8, A2, K0). Luminosity class suffixes such as Ia (supergiant), II (bright giant), III (giant), IV (subgiant), and V (main-sequence/dwarf), as in G2V for the Sun, belong to the later MK system and are not part of the original Harvard classifications. The colour index B−V, approximated here as `photo_mag` − `photo_visual_mag`, provides an independent temperature indicator that correlates with spectral type, enabling photometric classification for very large samples.
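The colour index is a plain difference of two columns. A minimal sketch in plain Python, assuming each magnitude arrives as a float or as `None` when missing; the helper name `colour_index` is illustrative and not part of the dataset:

```python
from typing import Optional


def colour_index(photo_mag: Optional[float],
                 photo_visual_mag: Optional[float]) -> Optional[float]:
    """Return the B-V proxy (photo_mag - photo_visual_mag), or None if either is missing."""
    if photo_mag is None or photo_visual_mag is None:
        return None
    return photo_mag - photo_visual_mag


# Hypothetical magnitudes, not taken from the catalogue.
print(colour_index(9.2, 8.4))   # a positive value: redder, so cooler
print(colour_index(None, 8.4))  # None: photographic magnitude missing
```

On the pandas DataFrame the same quantity is `df["photo_mag"] - df["photo_visual_mag"]`, with missing values propagating as NaN.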
This catalogue supplies the stellar spectral types that the collection otherwise lacks: no other dataset in it provides canonical spectral classifications for a large, all-sky stellar sample. It is the natural complement to the Hipparcos parallax catalogue (astrometry), the Gaia DR3 datasets (photometry and radial velocities), and the GCVS variable star catalogue (variability). It enables HR-diagram studies, population synthesis, and training spectral-type classifiers.
This dataset is suitable for **tabular classification** tasks.
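For classification experiments, the broad class in `spectral_class` can serve as the label. One possible encoding, offered as a sketch and not as a recommended pipeline, maps the seven letters to integers in temperature order, hottest first. `CLASS_ORDER` and `encode_class` are names introduced here for illustration:

```python
from typing import Optional

CLASS_ORDER = "OBAFGKM"  # hottest (O) to coolest (M)


def encode_class(spectral_class: Optional[str]) -> Optional[int]:
    """Map a broad class letter to 0..6; return None for missing or other classes."""
    if not spectral_class or len(spectral_class) != 1:
        return None
    index = CLASS_ORDER.find(spectral_class)
    return index if index >= 0 else None


labels = [encode_class(c) for c in ["O", "A", "K", "M", None, "R"]]
print(labels)  # [0, 2, 5, 6, None, None]
```

Rows that encode to `None` would need to be dropped or given their own label before training.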
## Schema
| Column | Type | Description | Sample | Null % |
|--------|------|-------------|--------|--------|
| `hd_number` | int64 | Henry Draper Catalogue number (HD 1–272150) — the primary stellar identifier in this catalog | 18667 | 0.0% |
| `dm_designation` | string | Durchmusterung identifier (BD, CD, or CP prefix) giving the Bonner/Córdoba/Cape catalog cross-match | BD-00 471 | 6.9% |
| `photo_mag_quality` | float64 | Quality code for photo_mag: 0=normal, other codes indicate uncertain or combined measurements | 0.0 | 14.4% |
| `photo_mag` | float64 | Photographic magnitude (blue-sensitive plates, ~B band equivalent); brighter = lower number | 8.7 | 14.3% |
| `photo_visual_mag_quality` | float64 | Quality code for photo_visual_mag: 0=normal | 1.0 | 3.9% |
| `photo_visual_mag` | float64 | Photovisual magnitude (orthochromatic plates, ~V band equivalent) | 9.7 | 0.5% |
| `spectral_type` | string | Spectral type as listed in the catalogue, on the Harvard system (e.g. K0); the primary classification product | K0 | 0.0% |
| `intensity_code` | string | Relative intensity indicator (1–6) used during classification; relates to plate exposure | 4 | 19.4% |
| `remarks` | string | Remarks field: note on unusual classification, binary, or catalog flags | M | 90.9% |
| `ra_deg` | float64 | Right ascension (J2000, degrees) — computed by VizieR from B1900 coordinates | 45.03110305555555 | 0.0% |
| `dec_deg` | float64 | Declination (J2000, degrees) — computed by VizieR from B1900 coordinates | 0.2479338888888888 | 0.0% |
| `spectral_class` | string | Broad spectral class (O/B/A/F/G/K/M) extracted from spectral_type; null for unusual types (W, R, N, S, C) | K | 0.2% |
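The `spectral_class` column is described as the broad class extracted from `spectral_type`, but this card does not state the exact rule used. One plausible reading, shown purely as an assumption, takes the first character when it is one of O, B, A, F, G, K, M:

```python
from typing import Optional

BROAD_CLASSES = set("OBAFGKM")


def broad_class(spectral_type: Optional[str]) -> Optional[str]:
    """Return the leading O/B/A/F/G/K/M letter of a spectral type, else None.

    Assumed rule for illustration; the dataset's own extraction may differ.
    """
    if not spectral_type:
        return None
    first = spectral_type.strip()[:1]
    return first if first in BROAD_CLASSES else None


for spt in ["K0", "G5", "R5", "", None]:
    print(repr(spt), "->", broad_class(spt))
```

Comparing `df["spectral_type"].map(broad_class)` with the shipped `spectral_class` column would show whether this assumed rule matches the dataset.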
## Quick stats
- **272,150** stars with spectral type classifications (HD 1–272,150)
- **271,588** with OBAFGKM broad class; most common: A stars (72,155, 26.5%)
- Photographic magnitudes (`photo_mag`) range from -1.6 to 50.0
- **272,150** stars with J2000 coordinates
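The percentages quoted in this card can be cross-checked against the counts. A short arithmetic sketch, using only the figures above and assuming the A-star share is taken relative to all 272,150 rows:

```python
TOTAL_STARS = 272_150       # rows in the catalogue
WITH_BROAD_CLASS = 271_588  # rows with an O/B/A/F/G/K/M class
A_STARS = 72_155            # most common broad class

missing_class = TOTAL_STARS - WITH_BROAD_CLASS
missing_pct = round(100 * missing_class / TOTAL_STARS, 1)
a_star_pct = round(100 * A_STARS / TOTAL_STARS, 1)

print(missing_class)  # 562
print(missing_pct)    # 0.2
print(a_star_pct)     # 26.5
```

The 0.2% result matches the null rate listed for `spectral_class` in the schema table.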
## Usage
```python
from datasets import load_dataset
ds = load_dataset("juliensimon/henry-draper-catalog", split="train")
df = ds.to_pandas()
```
```python
import matplotlib.pyplot as plt
from datasets import load_dataset

ds = load_dataset("juliensimon/henry-draper-catalog", split="train")
df = ds.to_pandas()

# Spectral class distribution, ordered from hottest (O) to coolest (M)
order = list("OBAFGKM")
counts = df["spectral_class"].value_counts().reindex(order, fill_value=0)
counts.plot(kind="bar", color="steelblue")
plt.xlabel("Spectral class")
plt.ylabel("Count")
plt.title("Henry Draper Catalogue: spectral class distribution")
plt.show()

# Colour-magnitude diagram: photographic magnitude against a B-V proxy
bv = df["photo_mag"] - df["photo_visual_mag"]
bright = (df["photo_mag"] < 9) & bv.notna()  # brighter stars with both magnitudes
plt.figure(figsize=(7, 8))
plt.scatter(bv[bright], df.loc[bright, "photo_mag"], s=0.3, alpha=0.2)
plt.gca().invert_yaxis()  # smaller magnitudes (brighter stars) at the top
plt.xlabel("B−V (photo_mag − photo_visual_mag)")
plt.ylabel("Photographic magnitude")
plt.title("HD Catalogue: colour-magnitude diagram")
plt.show()
```
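Because `ra_deg` and `dec_deg` are J2000 coordinates in degrees, a positional match against another catalogue can be sketched with the haversine formula. The function below is a generic great-circle separation, not something shipped with the dataset, and it ignores proper motion and epoch differences:

```python
import math


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees between two (RA, Dec) positions in degrees."""
    ra1_r, dec1_r, ra2_r, dec2_r = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2_r - ra1_r
    d_dec = dec2_r - dec1_r
    # Haversine form, numerically stable for small separations.
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1_r) * math.cos(dec2_r) * math.sin(d_ra / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


# Rounded sample position from the schema table versus a hypothetical point 0.01 deg north.
sep = angular_separation_deg(45.0311, 0.2479, 45.0311, 0.2579)
print(f"{sep * 3600:.1f} arcsec")  # 36.0 arcsec
```

A match radius suited to coordinates converted from B1900 plate positions is a judgment call that this card does not specify.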
## Data source
https://vizier.cds.unistra.fr/viz-bin/VizieR-3?-source=III/135A/catalog
## Related datasets
- [juliensimon/hipparcos-catalog](https://huggingface.co/datasets/juliensimon/hipparcos-catalog)
- [juliensimon/gaia-dr3-spectroscopic-binaries](https://huggingface.co/datasets/juliensimon/gaia-dr3-spectroscopic-binaries)
- [juliensimon/bright-stars](https://huggingface.co/datasets/juliensimon/bright-stars)
> If you find this dataset useful, please consider [giving it a like](https://huggingface.co/datasets/juliensimon/henry-draper-catalog) on Hugging Face. It helps others discover it.
## About the author
Created by [Julien Simon](https://julien.org) — AI Operating Partner at Fortino Capital. Part of the [Space Datasets](https://julien.org/datasets) collection.
## Citation
```bibtex
@dataset{henry_draper_catalog,
title = {Henry Draper Catalogue},
author = {juliensimon},
year = {2026},
url = {https://huggingface.co/datasets/juliensimon/henry-draper-catalog},
publisher = {Hugging Face}
}
```
## License
[VizieR Scientific-Use Terms](https://cds.unistra.fr/vizier-org/licences_vizier.html)
Provider: juliensimon



