flickr30k

Name: flickr30k
Creator: maas
Published: 2026-05-22 00:00:49
License: No description available

ModelScope community · Updated 2026-05-22 · Indexed 2024-10-12

Download link:

https://modelscope.cn/datasets/lmms-lab/flickr30k

Resource description:

<p align="center" width="100%">
<img src="https://i.postimg.cc/g0QRgMVv/WX20240228-113337-2x.png" width="100%" height="80%">
</p>

# Large-scale Multi-modality Models Evaluation Suite

> Accelerating the development of large-scale multi-modality models (LMMs) with `lmms-eval`

🏠 [Homepage](https://lmms-lab.github.io/) | 📚 [Documentation](docs/README.md) | 🤗 [Huggingface Datasets](https://huggingface.co/lmms-lab)

# This Dataset

This is a formatted version of [flickr30k](https://shannon.cs.illinois.edu/DenotationGraph/). It is used in our `lmms-eval` pipeline to allow for one-click evaluations of large multi-modality models.

```
@article{young-etal-2014-image,
    title = "From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions",
    author = "Young, Peter and Lai, Alice and Hodosh, Micah and Hockenmaier, Julia",
    editor = "Lin, Dekang and Collins, Michael and Lee, Lillian",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "2",
    year = "2014",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q14-1006",
    doi = "10.1162/tacl_a_00166",
    pages = "67--78",
    abstract = "We propose to use the visual denotations of linguistic expressions (i.e. the set of images they describe) to define novel denotational similarity metrics, which we show to be at least as beneficial as distributional similarities for two tasks that require semantic inference. To compute these denotational similarities, we construct a denotation graph, i.e. a subsumption hierarchy over constituents and their denotations, based on a large corpus of 30K images and 150K descriptive captions.",
}
```
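For readers who want to inspect the data outside the `lmms-eval` pipeline, the following is a minimal sketch rather than an official recipe. It assumes that the Hugging Face repo id mirrors the ModelScope path `lmms-lab/flickr30k` and that a split named `test` exists; neither is confirmed by this page. The helper names `modelscope_url` and `load_flickr30k` are hypothetical and introduced only for illustration.

```python
# Hypothetical helpers; only the dataset path comes from the listing above.
DATASET_ID = "lmms-lab/flickr30k"


def modelscope_url(dataset_id: str = DATASET_ID) -> str:
    """Build the ModelScope page URL for a dataset id."""
    return f"https://modelscope.cn/datasets/{dataset_id}"


def load_flickr30k(split: str = "test"):
    """Load one split through the Hugging Face `datasets` library.

    Assumptions: the Hugging Face repo id matches the ModelScope path,
    and a split named "test" exists. Verify both on the dataset page.
    """
    # Imported lazily so the URL helper works without the dependency.
    from datasets import load_dataset  # requires `pip install datasets`

    return load_dataset(DATASET_ID, split=split)
```

Calling `load_flickr30k()` needs network access and downloads the data on first use; printing the returned object should show the available column names, which are worth checking before writing any evaluation code against them.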

Provider:

maas

Created:

2024-10-06

Collected summary

Dataset introduction