Yura32000/eurosat
Hugging Face · Updated 2024-01-17 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/Yura32000/eurosat
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': AnnualCrop
          '1': Forest
          '2': HerbaceousVegetation
          '3': Highway
          '4': Industrial
          '5': Pasture
          '6': PermanentCrop
          '7': Residential
          '8': River
          '9': SeaLake
  - name: choices
    dtype: int64
  - name: prices
    dtype: int64
  splits:
  - name: train
    num_bytes: 73997723.2
    num_examples: 21600
  - name: test
    num_bytes: 9241099.7
    num_examples: 2700
  - name: valid
    num_bytes: 9232043.9
    num_examples: 2700
  download_size: 91992228
  dataset_size: 92470866.80000001
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
  - split: valid
    path: data/valid-*
---
# Dataset Card for Dataset Name
<!-- Provide a quick summary of the dataset. -->
This dataset card aims to be a base template for new datasets. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/datasetcard_template.md?plain=1).
## Dataset Details
### Dataset Description
<!-- Provide a longer summary of what this dataset is. -->
- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
### Dataset Sources [optional]
<!-- Provide the basic links for the dataset. -->
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
<!-- Address questions around how the dataset is intended to be used. -->
### Direct Use
<!-- This section describes suitable use cases for the dataset. -->
[More Information Needed]
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the dataset will not work well for. -->
[More Information Needed]
## Dataset Structure
<!-- This section provides a description of the dataset fields, and additional information about the dataset structure such as criteria used to create the splits, relationships between data points, etc. -->
[More Information Needed]
## Dataset Creation
### Curation Rationale
<!-- Motivation for the creation of this dataset. -->
[More Information Needed]
### Source Data
<!-- This section describes the source data (e.g. news text and headlines, social media posts, translated sentences, ...). -->
#### Data Collection and Processing
<!-- This section describes the data collection and processing process such as data selection criteria, filtering and normalization methods, tools and libraries used, etc. -->
[More Information Needed]
#### Who are the source data producers?
<!-- This section describes the people or systems who originally created the data. It should also include self-reported demographic or identity information for the source data creators if this information is available. -->
[More Information Needed]
### Annotations [optional]
<!-- If the dataset contains annotations which are not part of the initial data collection, use this section to describe them. -->
#### Annotation process
<!-- This section describes the annotation process such as annotation tools used in the process, the amount of data annotated, annotation guidelines provided to the annotators, interannotator statistics, annotation validation, etc. -->
[More Information Needed]
#### Who are the annotators?
<!-- This section describes the people or systems who created the annotations. -->
[More Information Needed]
#### Personal and Sensitive Information
<!-- State whether the dataset contains data that might be considered personal, sensitive, or private (e.g., data that reveals addresses, uniquely identifiable names or aliases, racial or ethnic origins, sexual orientations, religious beliefs, political opinions, financial or health data, etc.). If efforts were made to anonymize the data, describe the anonymization process. -->
[More Information Needed]
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
[More Information Needed]
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations.
## Citation [optional]
<!-- If there is a paper or blog post introducing the dataset, the APA and Bibtex information for that should go in this section. -->
**BibTeX:**
[More Information Needed]
**APA:**
[More Information Needed]
## Glossary [optional]
<!-- If relevant, include terms and calculations in this section that can help readers understand the dataset or dataset card. -->
[More Information Needed]
## More Information [optional]
[More Information Needed]
## Dataset Card Authors [optional]
[More Information Needed]
## Dataset Card Contact
[More Information Needed]
Provided by:
Yura32000
Summary of original information
Dataset information
Features
- image: image data
- label: label data, with the following classes:
  - 0: AnnualCrop
  - 1: Forest
  - 2: HerbaceousVegetation
  - 3: Highway
  - 4: Industrial
  - 5: Pasture
  - 6: PermanentCrop
  - 7: Residential
  - 8: River
  - 9: SeaLake
- choices: integer type
- prices: integer type
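The `label` feature stores an integer id for each of the ten class names listed above. A minimal Python sketch of that mapping follows. It uses only the names given in the card. The helper names `int2str` and `str2int` are illustrative choices here (they echo the method names on the `datasets` library's `ClassLabel`, but this snippet does not depend on that library).

```python
# Class names in id order, copied from the dataset card's `class_label.names`.
LABEL_NAMES = [
    "AnnualCrop", "Forest", "HerbaceousVegetation", "Highway", "Industrial",
    "Pasture", "PermanentCrop", "Residential", "River", "SeaLake",
]

# Reverse lookup: class name -> integer id.
NAME_TO_ID = {name: idx for idx, name in enumerate(LABEL_NAMES)}


def int2str(label_id: int) -> str:
    """Return the class name for an integer label id (0-9)."""
    if not 0 <= label_id < len(LABEL_NAMES):
        raise ValueError(f"label id out of range: {label_id}")
    return LABEL_NAMES[label_id]


def str2int(name: str) -> int:
    """Return the integer label id for a class name."""
    return NAME_TO_ID[name]


print(int2str(3))        # Highway
print(str2int("River"))  # 8
```

The card does not document what the `choices` and `prices` columns contain, so they are left out of this sketch.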
Data splits
- train: training set, 21600 examples, 73997723.2 bytes
- test: test set, 2700 examples, 9241099.7 bytes
- valid: validation set, 2700 examples, 9232043.9 bytes
Dataset size
- download_size: 91992228 bytes
- dataset_size: 92470866.80000001 bytes
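The split figures above can be cross-checked with a little arithmetic. The sketch below, which uses only the numbers reported in the card, totals the examples, derives each split's share, and confirms that the per-split byte counts add up to the reported `dataset_size`.

```python
# Split statistics copied from the dataset card.
SPLITS = {
    "train": {"num_examples": 21600, "num_bytes": 73997723.2},
    "test": {"num_examples": 2700, "num_bytes": 9241099.7},
    "valid": {"num_examples": 2700, "num_bytes": 9232043.9},
}
DATASET_SIZE = 92470866.80000001  # bytes, as reported in the card

total_examples = sum(s["num_examples"] for s in SPLITS.values())
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())

# Fraction of all examples that falls in each split.
fractions = {
    name: s["num_examples"] / total_examples for name, s in SPLITS.items()
}

print(total_examples)  # 27000
for name, frac in fractions.items():
    print(f"{name}: {frac:.0%}")

# The card reports fractional byte counts, so compare with a tolerance.
print(abs(total_bytes - DATASET_SIZE) < 1.0)
```

The result is an 80/10/10 split over 27000 examples. The trailing `...80000001` in `dataset_size` is consistent with floating-point summation of the three per-split values.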
Configuration
- default: default configuration, with the following data file paths:
  - train: data/train-*
  - test: data/test-*
  - valid: data/valid-*
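The configuration maps each split to a glob pattern over files in the repository. As a rough illustration of how such patterns select files, the sketch below matches paths against them with Python's standard `fnmatch` module. This only approximates the idea: the Hugging Face tooling resolves patterns with its own logic, and the shard file names used here are hypothetical.

```python
from fnmatch import fnmatchcase

# Split -> glob pattern, copied from the `configs` section of the card.
DATA_FILES = {
    "train": "data/train-*",
    "test": "data/test-*",
    "valid": "data/valid-*",
}


def split_for(path: str):
    """Return the split whose pattern matches `path`, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


# Hypothetical file names, for illustration only.
for path in [
    "data/train-00000-of-00001.parquet",
    "data/valid-00000-of-00001.parquet",
    "README.md",
]:
    print(path, "->", split_for(path))
```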
Collected summary
Dataset introduction

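Construction
The EuroSAT dataset belongs to the field of remote-sensing image analysis; its data derive from high-resolution multispectral images acquired by the Sentinel-2 satellites. Images covering different geographic regions of Europe were carefully selected and manually labeled by land-cover type, yielding a classification scheme with 10 classes. The images were preprocessed to keep spatial resolution and spectral bands consistent, then divided into training, validation, and test sets, which gives a structured basis for model training and evaluation.
Usage
Researchers can load the EuroSAT dataset directly from the Hugging Face platform; its standardized image and label format makes it easy to integrate into deep learning frameworks. The dataset suits supervised learning tasks: the predefined training, validation, and test splits support model training, hyperparameter tuning, and performance evaluation. It is commonly used for remote-sensing image classification, land-cover mapping, and transfer-learning research, and provides high-quality experimental data for these areas.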
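Loading from Hugging Face can be sketched as follows. This is a minimal example, assuming the `datasets` package is installed and the Hub is reachable; `load_eurosat` is a hypothetical wrapper name introduced here, not part of any library. Note that the validation split in this repository is named `valid`, not `validation`.

```python
REPO_ID = "Yura32000/eurosat"
SPLIT_NAMES = ("train", "test", "valid")  # split names as listed in the card


def load_eurosat(split: str = "train"):
    """Load one split from the Hugging Face Hub (requires network access)."""
    if split not in SPLIT_NAMES:
        raise ValueError(
            f"unknown split {split!r}; expected one of {SPLIT_NAMES}"
        )
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


# Example usage (needs network access and the `datasets` package):
#   train = load_eurosat("train")
#   example = train[0]
#   # `image` decodes to a PIL image; `label` is an integer class id.
#   name = train.features["label"].int2str(example["label"])
#   print(example["image"].size, name)
```

Validating the split name before the import keeps a typo such as `"validation"` from turning into a slower, less obvious failure after the download starts.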
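Background and challenges
Background overview
EuroSAT is an important benchmark in remote-sensing image classification. It was introduced by Helber, Bischke, Dengel, and Borth (German Research Center for Artificial Intelligence, DFKI, and TU Kaiserslautern), with the accompanying journal paper published in 2019, to advance deep-learning research on land-use and land-cover classification. The dataset contains high-resolution multispectral images from the Sentinel-2 satellites covering ten typical land-cover classes, including agricultural land, forest, water bodies, and built-up areas, and it supplies key data for geographic information systems, environmental monitoring, and urban planning. It has promoted the application of computer vision to remote-sensing analysis, helped improve model generalization in complex geographic scenes, and become a core resource for evaluating and comparing algorithms in the field.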
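Current challenges
The remote-sensing image classification task that EuroSAT targets faces several challenges. Land-cover classes can have highly similar spectral and texture signatures; for example, separating crops from other vegetation is sensitive to seasonal change. Scale also varies widely within the imagery, so recognizing both large industrial areas and narrow rivers requires multi-scale perception. On the construction side, preprocessing Sentinel-2 data involves complex steps such as atmospheric correction and cloud masking, and keeping image quality consistent is critical. Class labeling depends on specialist geographic knowledge and needs validation against multiple data sources to reduce subjective error. Together these factors raise both the difficulty of building the dataset and the barrier to using it.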
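Common scenarios
Classic use cases
In remote-sensing image analysis, EuroSAT is a classic benchmark for land-cover classification, thanks to its high-resolution satellite imagery and fine-grained land-cover labels. It is widely used to train and evaluate deep learning models, especially convolutional neural networks, that automatically recognize ten land-cover classes such as agricultural land, forest, water bodies, and built-up areas. Its standardized training, validation, and test splits let researchers compare the performance of different algorithms systematically, which has advanced the automated interpretation of remote-sensing imagery.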
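Comparing classifiers on a fixed test split usually comes down to a few standard metrics. The sketch below computes a confusion matrix, overall accuracy, and per-class recall for integer labels in pure Python. It is a generic illustration, not an evaluation protocol prescribed by the dataset, and the labels in the usage lines are made up.

```python
from collections import Counter

NUM_CLASSES = 10  # the card lists ten land-cover classes


def confusion_matrix(y_true, y_pred, num_classes=NUM_CLASSES):
    """Return matrix[t][p]: count of examples with true class t predicted as p."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter(zip(y_true, y_pred))
    return [
        [counts[(t, p)] for p in range(num_classes)]
        for t in range(num_classes)
    ]


def overall_accuracy(matrix):
    """Fraction of examples on the diagonal (0.0 for an empty matrix)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix):
    """Recall for each true class; None for classes with no examples."""
    return [
        row[i] / sum(row) if sum(row) else None
        for i, row in enumerate(matrix)
    ]


# Made-up labels: one AnnualCrop (0) is confused with PermanentCrop (6),
# and one River (8) with SeaLake (9).
y_true = [0, 0, 1, 8, 8, 9]
y_pred = [0, 6, 1, 8, 9, 9]
matrix = confusion_matrix(y_true, y_pred)
print(round(overall_accuracy(matrix), 3))
print(per_class_recall(matrix))
```

Per-class recall is worth reporting alongside overall accuracy because visually similar classes can hide behind a high aggregate score.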
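Academic problems addressed
EuroSAT addresses the academic challenge of automatic land-cover classification in remote-sensing imagery and gives a field short on large labeled datasets a reliable resource. It helps researchers explore frontier problems such as few-shot learning, domain adaptation, and model interpretability, and it reduces the dependence of geographic information extraction on manual labeling. Its significance lies in bringing computer vision and geographic information science closer together, providing a data foundation for global environmental monitoring and climate-change research, and speeding up both theoretical progress and practical adoption of AI for remote sensing.
Practical applications
In practice, EuroSAT supports work in agricultural resource management, urban planning, and environmental monitoring. For example, classification models can help monitor crop growth and optimize irrigation and fertilization strategies; in urban planning, identifying the distribution of industrial and residential areas helps assess land-use efficiency. The dataset can also support natural-disaster assessment, such as identifying flooded areas, to inform emergency response decisions, which illustrates the value of remote sensing for sustainable development.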
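Recent research on the dataset
Latest research directions
As an important benchmark in remote-sensing image classification, EuroSAT features in recent research that combines multimodal learning with self-supervised pretraining. Against the background of climate-change monitoring and land-resource management, researchers are exploring how to combine the spectral features of satellite images with geospatial context to improve fine-grained recognition of scenes such as agricultural land, forest cover, and urban expansion. Current topics include using contrastive learning frameworks to make models more robust to seasonal vegetation change, and adapting Vision Transformer architectures to high-resolution remote-sensing data to handle complex classification cases where industrial facilities and natural water bodies are interleaved. These advances push the accuracy of automated remote-sensing interpretation and offer scalable technical support for environmental assessment under the global Sustainable Development Goals.
The content above was collected and summarized by 遇见数据集.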



