FooDI-ML

Source: arXiv | Updated: 2022-08-26 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2110.02035v2
Overview:
FooDI-ML is a large multilingual dataset created by Glovo App. It contains over 2.8 million images of food, beverages, and groceries, each paired with a text description. The dataset covers 33 languages across 37 countries, with particular attention to languages that are underrepresented in existing datasets, such as those of Eastern Europe and Western Asia. It was built to improve the efficiency of food and beverage search engines in multilingual settings, and its paired image and text data supports multimodal tasks such as image captioning, text-to-image generation, and text-image retrieval. The dataset also includes train/test/validation splits and two benchmark tasks to facilitate future research.
Provider:
Glovo App
Created:
2021-10-05