surfe-diem/wave-archive-USA-southwest
Source: Hugging Face. Updated 2026-03-31; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/surfe-diem/wave-archive-USA-southwest
Resource summary:
---
license: cc-by-4.0
category: time-series
task_categories:
- time-series-forecasting
tags:
- oceanography
- climate
- buoy-data
- california
pretty_name: Surfe-Diem Wave Archive (USA Southwest)
size_categories:
- 1M<n<10M
---
# Dataset Card for wave-archive-USA-southwest
<!-- Provide a quick summary of the dataset. -->
A dataset of NDBC/NOAA stdmet and spectral readings since 1991 for the US-Southwest Pacific coast.
## Dataset Details
### Dataset Description
<!-- Provide a longer summary of what this dataset is. -->
This is historical data from the NOAA/NDBC consisting of the following types:
- Standard Meteorological: "stdmet"
- Continuous Wind: "cwind"
- Spectral Wave Density: "swden"
- Spectral Wave Direction (α₁): "swdir"
- Directional Spreading (R₁): "swr1"
The spectral data is aligned by timestamp to the stdmet data.
- **Curated by:** [crubio](https://github.com/crubio)
## Uses
<!-- Address questions around how the dataset is intended to be used. -->
Time-series input for forecasting models, in particular significant wave height (WVHT).
### Direct Use
<!-- This section describes suitable use cases for the dataset. -->
[More Information Needed]
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the dataset will not work well for. -->
[More Information Needed]
## Dataset Structure
**NDBC File Types**
| Type | Code | Description |
|------|------|-------------|
| stdmet | h | Standard Meteorological |
| swden | w | Spectral Wave Density (Energy) |
| swdir | d | Spectral Direction (α₁) |
| swr1 | j | Directional Spread (R₁) |
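As a rough illustration of how the type codes in the table above map onto source files, the sketch below builds a download URL for one station and year. The URL pattern (`<type>/<station><code><year>.txt.gz` under the historical index) is an assumption about the NDBC archive layout and should be checked against the index page; `historical_url` and the station ID `46042` are hypothetical names chosen for the example, not part of this dataset's tooling.

```python
# Assumed layout of the NDBC historical archive; verify against the index page.
BASE_URL = "https://www.ndbc.noaa.gov/data/historical"

# File-type codes taken from the NDBC File Types table.
TYPE_CODES = {"stdmet": "h", "swden": "w", "swdir": "d", "swr1": "j"}


def historical_url(station: str, file_type: str, year: int) -> str:
    """Build the URL of one station/year file for the given NDBC file type."""
    code = TYPE_CODES[file_type]  # raises KeyError for unknown types
    return f"{BASE_URL}/{file_type}/{station.lower()}{code}{year}.txt.gz"


# Station 46042 is used purely as an illustrative ID.
print(historical_url("46042", "swden", 2020))
# -> https://www.ndbc.noaa.gov/data/historical/swden/46042w2020.txt.gz
```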
**Standard Met Columns**
| Column | Description | Units |
|--------|-------------|-------|
| `WDIR` | Wind Direction | °T |
| `WSPD` | Wind Speed | m/s |
| `GST` | Peak Gust | m/s |
| `WVHT` | Significant Wave Height | m |
| `DPD` | Dominant Wave Period | s |
| `APD` | Average Wave Period | s |
| `MWD` | Mean Wave Direction | °T |
| `PRES` | Sea Level Pressure | hPa |
| `ATMP` | Air Temperature | °C |
| `WTMP` | Water Temperature | °C |
| `DEWP` | Dewpoint Temperature | °C |
**Spectral Columns (aligned profile only)**
| Prefix | Source | Description | Units |
|--------|--------|-------------|-------|
| `energy_` | swden | Spectral power at each frequency | m²/Hz |
| `alpha1_` | swdir | Mean wave direction at each frequency | degrees |
| `r1_` | swr1 | Directional spreading coefficient | normalized 0–1 |
**47 frequency bins** from `0.020 Hz` to `0.485 Hz` per NDBC standard.
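To show how the `energy_` columns relate to `WVHT`, here is a minimal sketch of the standard spectral estimate Hs = 4·√m₀, where m₀ is the integral of the energy density over frequency. It uses trapezoidal integration over whatever bin centre frequencies are passed in, which is an approximation; NDBC's own procedure sums energy times bandwidth per bin. The function name and argument names are hypothetical.

```python
import math
from typing import Sequence


def significant_wave_height(
    freqs_hz: Sequence[float], energy_m2_per_hz: Sequence[float]
) -> float:
    """Estimate Hs = 4 * sqrt(m0), with m0 from trapezoidal integration of E(f)."""
    if len(freqs_hz) != len(energy_m2_per_hz) or len(freqs_hz) < 2:
        raise ValueError("need matching frequency and energy sequences of length >= 2")
    m0 = 0.0
    for i in range(1, len(freqs_hz)):
        df = freqs_hz[i] - freqs_hz[i - 1]  # bin spacing may be non-uniform
        m0 += 0.5 * (energy_m2_per_hz[i] + energy_m2_per_hz[i - 1]) * df
    return 4.0 * math.sqrt(m0)


# Flat spectrum of 1 m²/Hz over a 1 Hz band gives m0 = 1 m², so Hs = 4 m.
print(significant_wave_height([0.0, 0.5, 1.0], [1.0, 1.0, 1.0]))
```

Comparing such an estimate with the reported `WVHT` is one example of a physical cross-check between the spectral and stdmet columns.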
## Dataset Creation
Data for one year for each type (where available) is downloaded, joined, and aligned by hourly timestamp. Rows that do not meet a minimum threshold of data are dropped.
Sentinel values (999, 99, NaN) are cleaned.
Typed parquet files are created for each station/year, then validated against schema and physical cross checks.
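The steps above can be sketched in plain Python as follows. This is not the project's ETL code: the per-column sentinel map, the 50% validity threshold, the choice to floor timestamps to the hour, and the column name `energy_0` are all assumptions made for illustration. Sentinels are matched per column here because a blanket rule would also discard legitimate values such as a 99° wind direction.

```python
import math
from datetime import datetime

# Assumed per-column fill values; the card only lists 999, 99 and NaN.
SENTINELS = {"WDIR": 999.0, "WVHT": 99.0, "DPD": 99.0, "WTMP": 999.0}
MIN_VALID_FRACTION = 0.5  # hypothetical threshold, not taken from the dataset


def clean(value, column):
    """Return None for NaN or the column's sentinel, else the value unchanged."""
    if value is None or math.isnan(value) or value == SENTINELS.get(column):
        return None
    return value


def hour_key(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def align(stdmet_rows, spectral_rows):
    """Left-join spectral fields onto stdmet rows by hourly timestamp."""
    spectral_by_hour = {hour_key(r["time"]): r for r in spectral_rows}
    out = []
    for row in stdmet_rows:
        key = hour_key(row["time"])
        merged = {"time": key}
        for source in (row, spectral_by_hour.get(key, {})):
            for col, val in source.items():
                if col != "time":
                    merged[col] = clean(val, col)
        fields = [v for k, v in merged.items() if k != "time"]
        valid = sum(v is not None for v in fields)
        # Drop rows that fall below the minimum share of usable values.
        if fields and valid / len(fields) >= MIN_VALID_FRACTION:
            out.append(merged)
    return out


stdmet = [
    {"time": datetime(2020, 1, 1, 0, 50), "WDIR": 99.0, "WVHT": 1.8, "DPD": 99.0},
    {"time": datetime(2020, 1, 1, 1, 50), "WDIR": 999.0, "WVHT": 99.0, "DPD": 99.0},
]
spectral = [{"time": datetime(2020, 1, 1, 0, 40), "energy_0": 0.12}]
rows = align(stdmet, spectral)
print(len(rows))  # the second stdmet row is all sentinels and is dropped
```

The same logic maps onto pandas (for example, flooring a datetime column and merging on it), which is closer to the libraries the pipeline lists.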
### Curation Rationale
<!-- Motivation for the creation of this dataset. -->
This dataset was created to support forecasting of significant wave height (WVHT) from historical observations.
### Source Data
<!-- This section describes the source data (e.g. news text and headlines, social media posts, translated sentences, ...). -->
The source data is indexed here in downloadable format: https://www.ndbc.noaa.gov/data/historical/
#### Data Collection and Processing
<!-- This section describes the data collection and processing process such as data selection criteria, filtering and normalization methods, tools and libraries used, etc. -->
An initial availability scan produces a list of exactly which files are available, so no unnecessary network requests or scraping are made.
Datasets are either stdmet or aligned; the latter include one or more of the spectral files in addition to stdmet.
Datasets are created (see Dataset Creation), and the parquet files are bulk validated (schema and physical cross-checks), each with a matching validation sidecar JSON file.
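A validation pass with a JSON sidecar could look roughly like the sketch below. The report layout, the `validate` and `sidecar_text` helpers, and the numeric ranges are hypothetical; in particular, the ranges are placeholder plausibility bounds and are not the limits from the NDBC quality control handbook. The gust check illustrates one kind of physical cross-check between fields.

```python
import json

# Illustrative plausibility ranges only; not the NDBC QC limits.
RANGES = {"WVHT": (0.0, 30.0), "DPD": (0.0, 40.0), "WTMP": (-5.0, 40.0)}


def validate(rows, required=("WVHT",)):
    """Run schema and range checks and return a JSON-serialisable report."""
    errors = []
    for i, row in enumerate(rows):
        # Schema check: every required column must be present.
        for col in required:
            if col not in row:
                errors.append({"row": i, "column": col, "error": "missing column"})
        # Range check: non-null values must fall inside the configured bounds.
        for col, (lo, hi) in RANGES.items():
            val = row.get(col)
            if val is not None and not (lo <= val <= hi):
                errors.append({"row": i, "column": col, "error": "out of range"})
        # Physical cross-check: a peak gust should not be below the mean wind speed.
        wspd, gst = row.get("WSPD"), row.get("GST")
        if wspd is not None and gst is not None and gst < wspd:
            errors.append({"row": i, "column": "GST", "error": "gust below wind speed"})
    return {"rows": len(rows), "passed": not errors, "errors": errors}


def sidecar_text(report) -> str:
    """Serialise the report as it could be written next to a parquet file."""
    return json.dumps(report, indent=2, sort_keys=True)


report = validate(
    [
        {"WVHT": 1.5, "WSPD": 5.0, "GST": 7.0},
        {"WVHT": 45.0, "WSPD": 8.0, "GST": 6.0},
    ]
)
print(sidecar_text(report))
```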
**Tools/libraries used**
- Python: pandas, pyarrow, requests, etc.
- pytest test suite.
- pathlib, argparse, logging, json.
#### Who are the source data producers?
<!-- This section describes the people or systems who originally created the data. It should also include self-reported demographic or identity information for the source data creators if this information is available. -->
- [NOAA](https://www.noaa.gov)
- [NDBC](https://www.ndbc.noaa.gov/)
#### Annotation process
<!-- This section describes the annotation process such as annotation tools used in the process, the amount of data annotated, annotation guidelines provided to the annotators, interannotator statistics, annotation validation, etc. -->
Quality checking and cross-checking of fields is automated and configured to stay within the constraints recommended by NDBC quality control. No human-assisted labeling was done. See https://www.ndbc.noaa.gov/publications/NDBCHandbookofAutomatedDataQualityControl2023.pdf for more information.
#### Who are the annotators?
<!-- This section describes the people or systems who created the annotations. -->
The surfe-diem data platform and ETL pipeline (www.surfe-diem.com).
#### Personal and Sensitive Information
None
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
[More Information Needed]
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations.
## Citation [optional]
<!-- If there is a paper or blog post introducing the dataset, the APA and Bibtex information for that should go in this section. -->
- Measurement Descriptions and Units: ndbc.noaa.gov/faq/measdes.shtml
- Hourly Data File Formats: ndbc.noaa.gov/faq/hourly.shtml
- Real-Time Data Access FAQ: ndbc.noaa.gov/faq/rt_data_access.shtml
- Historical Data Layouts: ndbc.noaa.gov/historical_data.shtml
- Web Data Guide (PDF): ndbc.noaa.gov/docs/ndbc_web_data_guide.pdf - Overview of data access methods.
- Wave Data Analysis Procedures: https://www.ndbc.noaa.gov/wavemeas.pdf
**BibTeX:**
[More Information Needed]
**APA:**
[More Information Needed]
## Glossary [optional]
<!-- If relevant, include terms and calculations in this section that can help readers understand the dataset or dataset card. -->
## Dataset Card Authors [optional]
Christopher Rubio
## Dataset Card Contact
chris@surfe-diem.com
Provider:
surfe-diem



