segformer_b2

Name: segformer_b2
Creator: maas
Published: 2025-08-18 16:42:53
License: No description available

ModelScope community. Updated 2025-08-18. Indexed 2025-08-23.

Download link:

https://modelscope.cn/datasets/zz0924/segformer_b2


Resource description:

# SegFormer (b2-sized) model fine-tuned on CityScapes

SegFormer model fine-tuned on CityScapes at resolution 1024x1024. It was introduced in the paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Xie et al. and first released in [this repository](https://github.com/NVlabs/SegFormer).

Disclaimer: The team releasing SegFormer did not write a model card for this model, so this model card has been written by the Hugging Face team.

## Model description

SegFormer consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head, and achieves strong results on semantic segmentation benchmarks such as ADE20K and Cityscapes. The hierarchical Transformer is first pre-trained on ImageNet-1k, after which a decode head is added and the whole model is fine-tuned on a downstream dataset.

## Intended uses & limitations

You can use the raw model for semantic segmentation. See the [model hub](https://huggingface.co/models?other=segformer) to look for fine-tuned versions on a task that interests you.

### How to use

Here is how to use this model to run semantic segmentation on an image from the COCO 2017 dataset:

```python
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation
from PIL import Image
import requests

# Load the preprocessing pipeline and the fine-tuned model from the hub
feature_extractor = SegformerFeatureExtractor.from_pretrained("nvidia/segformer-b2-finetuned-cityscapes-1024-1024")
model = SegformerForSemanticSegmentation.from_pretrained("nvidia/segformer-b2-finetuned-cityscapes-1024-1024")

# Fetch a sample image from the COCO 2017 validation set
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image and run a forward pass
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits  # shape (batch_size, num_labels, height/4, width/4)
```

For more code examples, we refer to the [documentation](https://huggingface.co/transformers/model_doc/segformer.html#).
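The logits returned by the model are at one quarter of the input resolution, so a per-pixel label map still requires an argmax over the class dimension and an upsampling step. The following is a minimal, dependency-free sketch of that idea on toy data. The names `toy_logits`, `argmax_labels`, and `upsample_nearest` are hypothetical and not part of the `transformers` API, and nearest-neighbour upsampling of the label map is an assumption made for simplicity. In practice one would more likely interpolate the logits bilinearly (for example with `torch.nn.functional.interpolate`) before taking the argmax.

```python
# Toy stand-in for `outputs.logits[0]`: 3 classes on a 1x2 grid,
# indexed as [class][row][col]. Values are made up for illustration.
toy_logits = [
    [[2.0, 0.1]],  # class 0 scores
    [[0.5, 0.3]],  # class 1 scores
    [[0.1, 1.5]],  # class 2 scores
]


def argmax_labels(logits):
    """Pick the highest-scoring class index at every pixel."""
    num_classes = len(logits)
    rows, cols = len(logits[0]), len(logits[0][0])
    return [
        [max(range(num_classes), key=lambda c: logits[c][r][col]) for col in range(cols)]
        for r in range(rows)
    ]


def upsample_nearest(label_map, factor=4):
    """Repeat each label `factor` times along both axes (nearest-neighbour)."""
    return [
        [label for label in row for _ in range(factor)]
        for row in label_map
        for _ in range(factor)
    ]


low_res = argmax_labels(toy_logits)       # one label per low-resolution pixel
full_res = upsample_nearest(low_res, 4)   # undo the height/4, width/4 reduction
```

Taking the argmax first and then repeating labels keeps the sketch short, but it produces blockier boundaries than interpolating the logits first.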
### License

The license for this model can be found [here](https://github.com/NVlabs/SegFormer/blob/master/LICENSE).

### BibTeX entry and citation info

```bibtex
@article{DBLP:journals/corr/abs-2105-15203,
  author     = {Enze Xie and Wenhai Wang and Zhiding Yu and Anima Anandkumar and Jose M. Alvarez and Ping Luo},
  title      = {SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers},
  journal    = {CoRR},
  volume     = {abs/2105.15203},
  year       = {2021},
  url        = {https://arxiv.org/abs/2105.15203},
  eprinttype = {arXiv},
  eprint     = {2105.15203},
  timestamp  = {Wed, 02 Jun 2021 11:46:42 +0200},
  biburl     = {https://dblp.org/rec/journals/corr/abs-2105-15203.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}
```


Provider: maas

Created: 2025-08-18
