RMDig/rocky_mountain_snowpack
Hugging Face · Updated 2025-08-08 · Indexed 2025-11-30
Download link:
https://hf-mirror.com/datasets/RMDig/rocky_mountain_snowpack
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: Image
  - name: file_path
    dtype: string
  - name: datatype
    dtype:
      class_label:
        names:
          0: 'core'
          1: 'profile'
          2: 'magnified_profile'
          3: 'crystal_card'
  - name: site
    dtype: int64
  - name: column
    dtype: int64
  - name: core
    dtype: int64
  - name: segment
    dtype: int64
  - name: core_temperature
    dtype: float32
  - name: air_temperature
    dtype: float32
  - name: ascending_mountain
    dtype: string
  - name: city_state_country
    dtype: string
  - name: collector
    dtype: string
  - name: coordinates
    dtype:
      list:
        dtype: float32
  - name: date
    dtype: string
  - name: time
    dtype: string
  - name: snowpack_depth
    dtype: float32
  - name: core_depth
    dtype: float32
  - name: slope_face
    dtype: float32
  - name: slope_angle
    dtype: float32
  - name: avalanches_spotted
    dtype: int64
  - name: wind_loading
    dtype:
      class_label:
        names:
          0: 'none'
          1: 'low'
          2: 'moderate'
          3: 'high'
  - name: notes
    dtype: string
configs:
- config_name: default
  data_files:
  - split: train
    path: "metadata/train.jsonl"
  - split: test
    path: "metadata/test.jsonl"
  - split: validation
    path: "metadata/validation.jsonl"
  - split: raw
    path: "metadata/raw.jsonl"
  - split: preprocessed
    path: "metadata/preprocessed.jsonl"
annotations_creators:
- manual
language:
- en
license: cc-by-4.0
multilinguality: monolingual
pretty_name: Rocky Mountain Dataset
size_categories:
- 1K<n<10K
source_datasets: "None"
task_categories:
- image-classification
task_ids:
- multi-label-classification
tags:
- image
paperswithcode_id: null
---
# Rocky Mountain Snowpack Dataset
The Rocky Mountain Snowpack dataset contains ~2,341 samples of snowpack imagery collected in the Colorado Rocky Mountains during the 2024–2025 winter season.
Each sample segment of snow includes three types of images:
- **Magnified crystal images** (close-up snow crystal profile photography)
- **Snowpack profile images** (non-magnified snow crystal profile photography)
- **Core segment images** (snow cores sampled with an apple corer)
For each segment that naturally forms from the core, each image type is captured and resampled in real time from the same snow segment up to 25 times using the shake-n-take method. In multi-modal models that take two or more of the image types as input, this can in theory grow the training sample size quadratically, simply by forming all possible pairs of resampled segment images. **Warning:** this may introduce unintentional data leakage (especially when pairing magnified crystal images with their non-magnified counterparts), but it may be a necessary oversampling technique given the small amount of data available.
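The pairing idea can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the helper name `pair_resamples`, the toy file names, and the assumption that `datatype` has already been decoded from its integer class to its name are all hypothetical.

```python
from itertools import product

def pair_resamples(records, type_a="magnified_profile", type_b="profile"):
    """Build every (type_a, type_b) image pair that shares one snow segment."""
    by_segment = {}
    for rec in records:
        # A segment is identified by its site, column, core and segment numbers.
        key = (rec["site"], rec["column"], rec["core"], rec["segment"])
        by_segment.setdefault(key, {}).setdefault(rec["datatype"], []).append(rec["file_path"])
    pairs = []
    for key, by_type in by_segment.items():
        # Cartesian product: n resamples of each type give n * n pairs.
        for a, b in product(by_type.get(type_a, []), by_type.get(type_b, [])):
            pairs.append((key, a, b))
    return pairs

# Toy records (hypothetical file names): one segment, three resamples per type.
segment = {"site": 1, "column": 1, "core": 1, "segment": 2}
records = (
    [dict(segment, datatype="magnified_profile", file_path=f"magnified_{i}.png") for i in range(3)]
    + [dict(segment, datatype="profile", file_path=f"profile_{i}.png") for i in range(3)]
)
pairs = pair_resamples(records)
print(len(pairs))  # 3 x 3 = 9 pairs from 6 images
```

Because every pair from a segment shows the same snow, all pairs from one segment should stay on the same side of any train/test boundary.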
In addition to images, the dataset includes environmental metadata such as the number of avalanches spotted, slope angle, and air and snow temperature. Each sample is tied to a time and location, so external data such as weather forecasts could easily be joined in and used as labels.
The dataset is designed for **image classification**, **regression**, and **generative modeling** tasks related to snow science and avalanche forecasting.
Explore the dataset further on **Rocky Mountain Digerati's** website at **www.rmdig.ai**
## Dataset Summary
- **Number of examples**: 2216
- **Features**: `image`, `site`, `column`, `core`, `segment`, `avalanches_spotted`, `wind_loading`, `snowpack_depth`, `core_depth`, `slope_face`, `slope_angle`, `air_temperature`, `core_temperature`
- **Label classes**: `datatype`, `site`, `column`, `core`, `segment`, `avalanches_spotted`, `wind_loading`, `snowpack_depth`, `core_depth`, `slope_face`, `slope_angle`, `air_temperature`, `core_temperature`
- **Data format**: PNG images, CSV labels
- **Splits**: `train`, `test`, `validation` (the default config also exposes `raw` and `preprocessed`)
- **Languages**: English (for labels)
## Features
| **Feature** | **Type** | **Description** |
|----------------------|-------|-----------------|
| `image` | image | Picture of snow in various states |
| `datatype` | class | core, profile, magnified_profile, crystal_card |
| `site` | int | 1, 2, 3, etc. |
| `column` | int | 1, 2, 3, etc. |
| `core` | int | 1, 2, 3, etc. |
| `segment` | int | 1, 2, 3, etc. |
| `avalanches_spotted` | int | 0, 1, 2, 3, etc. |
| `wind_loading` | class | none, low, moderate, high |
| `snow_feature` | class | magnified-snow, snow, snow-cores |
| `snowpack_depth` | float | Total depth of snowpack (cm) |
| `core_depth` | float | Depth of image taken (cm) |
| `slope_face` | float | Slope face orientation in degrees |
| `slope_angle` | float | Slope angle in degrees |
| `air_temperature` | float | Air temperature (°F) |
| `core_temperature` | float | Snow core (layer) temperature (°F) |
## Labels
This dataset provides multiple classification and regression labels for each sample:
| **Classification Labels** | **Type** | **Possible Values / Description** |
|--------------------------|----------------|------------------------------------------------------------------------|
| `datatype` | Class (4) | The type of image (e.g., `magnified_profile`, `core`, etc.) |
| `site` | Int | The site number where data was collected (e.g., `1`, `2`, `3`, etc.) |
| `column` | Int | The snowpack column data was collected from (e.g., `1`, `2`, `3`, etc.) |
| `core` | Int | The core data was collected from (e.g., `1`, `2`, `3`, etc.) |
| `segment` | Int | The segment of a core data was collected from (e.g., `1`, `2`, `3`, etc.) |
| `avalanches_spotted` | Int | Number of avalanches spotted nearby (e.g., `1`, `2`, `3`) |
| `wind_loading` | Class (4) | `["none", "low", "moderate", "high"]` |

| **Regression Labels** | **Type** | **Possible Values / Description** |
|--------------------------|----------------|------------------------------------------------------------------------|
| `snowpack_depth` | Float | Depth of the snowpack in cm (e.g., `-5`) |
| `core_depth` | Float | Depth of the snowpack segment pictured in cm (e.g., `-5`) |
| `slope_face` | Float | Slope face in degrees (e.g., `78`) |
| `slope_angle` | Float | Slope angle in degrees (e.g., `32`) |
| `air_temperature` | Float | Air temperature in °F (e.g., `8`) |
| `core_temperature` | Float | Snow temperature in °C (e.g., `-3`) |
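The two class labels are stored as integers. A minimal sketch of turning them back into names, assuming the class order declared in the card's `dataset_info` block; `decode_labels` is a hypothetical helper, not part of the dataset tooling.

```python
# Class names in the order declared in the card's dataset_info block.
DATATYPE_NAMES = ["core", "profile", "magnified_profile", "crystal_card"]
WIND_LOADING_NAMES = ["none", "low", "moderate", "high"]

def decode_labels(sample):
    """Return a copy of the sample with integer class labels replaced by names."""
    decoded = dict(sample)
    decoded["datatype"] = DATATYPE_NAMES[sample["datatype"]]
    decoded["wind_loading"] = WIND_LOADING_NAMES[sample["wind_loading"]]
    return decoded

# A trimmed-down sample carrying only the fields needed here.
example = {"datatype": 0, "wind_loading": 3, "slope_angle": 11.0}
decoded = decode_labels(example)
print(decoded["datatype"], decoded["wind_loading"])
```

Regression labels such as `slope_angle` pass through unchanged.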
## Supported Tasks and Leaderboards
The dataset supports **image classification**, **regression**, and **generative AI** tasks. Models can be evaluated with standard metrics such as accuracy and F1-score, or through qualitative analysis of generated images. Examples of models trained on this dataset are **snowGAN** and **coreDiff**, trained on the magnified snow profiles and the snow cores, respectively.
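For the classification labels, accuracy and macro-averaged F1 can be computed without extra dependencies. The sketch below uses made-up predictions for the four `wind_loading` classes; the function names and the choice of macro averaging are assumptions, since the card does not specify an evaluation protocol.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Made-up wind_loading labels (0=none, 1=low, 2=moderate, 3=high).
y_true = [0, 1, 2, 3, 3, 1]
y_pred = [0, 1, 1, 3, 2, 1]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Macro averaging weights every class equally, which matters when some wind-loading levels are rare.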
## Dataset Structure
### Data Instances
Example:
{'image': <PIL.PngImagePlugin.PngImageFile image mode=RGB size=500x300 at 0x138F7BD70>,'file_path': 'preprocessed/cores/image_1.png', 'datatype': 0, 'site': 0, 'column': 1, 'core': 1, 'segment': -1, 'core_temperature': 41.0, 'air_temperature': 7.0, 'ascending_mountain': 'Loveland Pass', 'city_state_country': 'Silver Plume, Colorado, USA', 'collector': 'Denny Schaedig', 'coordinates': [39.65999984741211, -105.87999725341797], 'date': '1/12/25', 'time': '11:40 AM MST', 'snowpack_depth': 140.0, 'core_depth': 10.0, 'slope_face': 30.0, 'slope_angle': 11.0, 'avalanches_spotted': 2, 'wind_loading': 3, 'notes': 'Pilot using slightly modified coring method that took too large segment for snow profile picturing, advised not to use in higher level models. Two avalanche on south face nearby, partially cloudy with light snow in the last 24 hours. Lots of wind loading on eastern features.'}
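The `date` and `time` fields are plain strings. To line a sample up with external sources such as weather records, they can be parsed into a timezone-aware timestamp. This sketch assumes month/day/two-digit-year dates and that the trailing zone is always `MST` (UTC-7), as in the example above; other rows may need different handling, and `parse_collection_time` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: every collection time is given in Mountain Standard Time (UTC-7).
MST = timezone(timedelta(hours=-7), "MST")

def parse_collection_time(date_str, time_str):
    """Combine the `date` and `time` strings into one timezone-aware datetime."""
    clock = time_str.rsplit(" ", 1)[0]  # drop the trailing zone abbreviation
    naive = datetime.strptime(f"{date_str} {clock}", "%m/%d/%y %I:%M %p")
    return naive.replace(tzinfo=MST)

stamp = parse_collection_time("1/12/25", "11:40 AM MST")
print(stamp.isoformat())
print(stamp.astimezone(timezone.utc).isoformat())
```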
### Data Fields
- `image`: The image itself, decoded as a PIL image when loaded
- `file_path`: Relative filepath of the image within the dataset repo
- `datatype`: The datatype of the image (i.e. core, profile, magnified_profile, crystal_card)
- `site`: The site number data was collected from
- `column`: The snowpack column number data was collected from
- `core`: The core number data was collected from
- `segment`: The segment number data was collected from
- `core_temperature`: Core temperature recorded for the core/profile
- `air_temperature`: Air temperature recorded at the start of the site collection
- `ascending_mountain`: The mountain on approach to the collection site
- `city_state_country`: City, state and country the site was closest to
- `collector`: Name of the person collecting data
- `coordinates`: Coordinates (latitude, longitude) of the snowpit the samples were collected from
- `date`: Date the snowpit was dug and sampled
- `time`: Time the snowpit collection started
- `snowpack_depth`: Depth of the snowpack in cm
- `core_depth`: Depth the core was sampled from, in cm
- `slope_face`: Slope face orientation in degrees
- `slope_angle`: Slope angle in degrees
- `avalanches_spotted`: Number of avalanches spotted on ascending mountain
- `wind_loading`: Level of wind loading features observed on nearby mountains
- `notes`: Additional notes provided by the collector about data collection from the site and its nearby surroundings
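Because one snow segment can contribute many resampled images, a custom split is safer when it keeps all images of a segment together. A minimal sketch, assuming that `site`, `column`, `core` and `segment` jointly identify a segment; `split_by_segment` and its parameters are hypothetical, and the published `train`/`test`/`validation` splits may have been built differently.

```python
import random

def split_by_segment(records, test_fraction=0.2, seed=0):
    """Split records so that no snow segment appears in both train and test."""
    groups = {}
    for rec in records:
        key = (rec["site"], rec["column"], rec["core"], rec["segment"])
        groups.setdefault(key, []).append(rec)
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)  # deterministic shuffle of segment keys
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])
    train = [r for k in keys if k not in test_keys for r in groups[k]]
    test = [r for k in keys if k in test_keys for r in groups[k]]
    return train, test

# Toy records: 5 segments with 4 resampled images each.
toy = [
    {"site": 1, "column": 1, "core": 1, "segment": s, "resample": i}
    for s in range(1, 6)
    for i in range(4)
]
train_recs, test_recs = split_by_segment(toy)
print(len(train_recs), len(test_recs))
```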
## Usage
```python
from datasets import load_dataset
# Load the rocky mountain snowpack repo
dataset = load_dataset("rmdig/rocky_mountain_snowpack")
# Grab first sample in training set
sample = dataset["train"][0]
# Grab the image that is part of the sample
image = sample["image"]
# Show the image
image.show()
```
## License
This dataset is distributed under the [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/) license.
## Citation
@dataset{Schaedig2025RockyMountain,
title = {Rocky Mountain Snowpack Dataset},
author = {Denny Schaedig},
year = {2025},
publisher = {RMDig.ai},
license = {CC-BY 4.0},
url = {https://huggingface.co/datasets/rmdig/rocky-mountain-snowpack}
}
Provider:
RMDig



