segformer_b2
Source: ModelScope community. Updated 2025-08-18; indexed 2025-08-23.
Download link:
https://modelscope.cn/datasets/zz0924/segformer_b2
Resource description:
# SegFormer (b2-sized) model fine-tuned on Cityscapes
SegFormer model fine-tuned on Cityscapes at resolution 1024x1024. It was introduced in the paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Xie et al. and first released in [this repository](https://github.com/NVlabs/SegFormer).
Disclaimer: The team releasing SegFormer did not write a model card for this model, so this model card was written by the Hugging Face team.
## Model description
SegFormer consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head, and it achieves strong results on semantic segmentation benchmarks such as ADE20K and Cityscapes. The hierarchical Transformer is first pre-trained on ImageNet-1k, after which the decode head is added and the two are fine-tuned together on a downstream dataset.
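To make "hierarchical" concrete, the sketch below tabulates the feature-map shape produced by each encoder stage for a 1024x1024 input. The helper name `encoder_feature_shapes` is hypothetical, and the stage strides (4, 8, 16, 32) and channel widths (64, 128, 320, 512) are assumptions taken from the SegFormer paper's description of the b2 encoder rather than from this model card.

```python
# Hypothetical helper: tabulates the feature-map shape after each encoder stage.
# The strides and channel widths below are assumptions based on the SegFormer
# paper's description of the b2 encoder; they are not stated in this card.
STAGE_STRIDES = (4, 8, 16, 32)
STAGE_CHANNELS = (64, 128, 320, 512)


def encoder_feature_shapes(height: int, width: int) -> list:
    """Return (channels, height, width) for each of the four encoder stages."""
    return [
        (channels, height // stride, width // stride)
        for stride, channels in zip(STAGE_STRIDES, STAGE_CHANNELS)
    ]


for stage, shape in enumerate(encoder_feature_shapes(1024, 1024), start=1):
    print(f"stage {stage}: {shape}")
```

Under these assumptions the finest feature map sits at one quarter of the input resolution, which matches the `height/4, width/4` shape of the logits noted in the usage example.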
## Intended uses & limitations
You can use the raw model for semantic segmentation. See the [model hub](https://huggingface.co/models?other=segformer) to look for fine-tuned versions on a task that interests you.
### How to use
Here is how to use this model to segment an image from the COCO 2017 validation set into the Cityscapes semantic classes:
```python
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
from PIL import Image
import requests

# Load the preprocessing pipeline and the fine-tuned weights from the Hub.
# SegformerImageProcessor supersedes the deprecated SegformerFeatureExtractor.
image_processor = SegformerImageProcessor.from_pretrained("nvidia/segformer-b2-finetuned-cityscapes-1024-1024")
model = SegformerForSemanticSegmentation.from_pretrained("nvidia/segformer-b2-finetuned-cityscapes-1024-1024")

# Download a sample image from the COCO 2017 validation set.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image and run a forward pass.
inputs = image_processor(images=image, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits  # shape (batch_size, num_labels, height/4, width/4)
```
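The logits come out at a quarter of the preprocessed input resolution, so one more step is usually needed to get a per-pixel label map. The sketch below shows one common way to do this: bilinearly upsample the logits, then take the argmax over the label dimension. The helper name `logits_to_label_map` is hypothetical, and the dummy tensor (19 labels on a 128x128 grid) is an assumed stand-in for `outputs.logits` so that the snippet runs without downloading the model.

```python
import torch
import torch.nn.functional as F


def logits_to_label_map(logits: torch.Tensor, target_size: tuple) -> torch.Tensor:
    """Upsample (batch, num_labels, h, w) logits to target_size=(height, width)
    and return a (batch, height, width) tensor of class indices."""
    upsampled = F.interpolate(
        logits, size=target_size, mode="bilinear", align_corners=False
    )
    return upsampled.argmax(dim=1)


# Stand-in for `outputs.logits`; the label count and grid size are assumptions.
dummy_logits = torch.randn(1, 19, 128, 128)
label_map = logits_to_label_map(dummy_logits, target_size=(512, 512))
print(label_map.shape)  # torch.Size([1, 512, 512])
```

With the real model, the call would look like `logits_to_label_map(outputs.logits, image.size[::-1])`, since PIL reports size as `(width, height)`. The integer labels can then be mapped to class names through `model.config.id2label`.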
For more code examples, see the [documentation](https://huggingface.co/transformers/model_doc/segformer.html#).
### License
The license for this model can be found [here](https://github.com/NVlabs/SegFormer/blob/master/LICENSE).
### BibTeX entry and citation info
```bibtex
@article{DBLP:journals/corr/abs-2105-15203,
author = {Enze Xie and
Wenhai Wang and
Zhiding Yu and
Anima Anandkumar and
Jose M. Alvarez and
Ping Luo},
title = {SegFormer: Simple and Efficient Design for Semantic Segmentation with
Transformers},
journal = {CoRR},
volume = {abs/2105.15203},
year = {2021},
url = {https://arxiv.org/abs/2105.15203},
eprinttype = {arXiv},
eprint = {2105.15203},
timestamp = {Wed, 02 Jun 2021 11:46:42 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2105-15203.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
Provider:
maas
Created:
2025-08-18



