Real-Scenario Multimodal Retrieval Dataset from Taobao
Source: Alibaba Cloud Tianchi. Updated: 2026-05-27. Indexed: 2024-03-07.
Download link:
https://tianchi.aliyun.com/dataset/87809
Description:
With the rapid growth of worldwide retail e-commerce sales, demand for effective semantic understanding and retrieval of multimodal content keeps growing. This dataset provides real-scenario multimodal data from Mobile Taobao, one of the largest e-commerce platforms. It consists of Taobao search queries and product image features, organized into a query-based multimodal retrieval task. The dataset was originally designed for "KDD Cup 2020 Challenges for Modern E-Commerce Platform: Multimodalities Recall".
KDD Challenge: https://tianchi.aliyun.com/competition/entrance/231786/information
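As a rough illustration of what a query-based retrieval task over image features can look like, the sketch below ranks products by cosine similarity to a query vector. It assumes the search query has already been embedded into the same vector space as the product image features, which is a modeling choice the dataset description does not specify. The names `cosine`, `rank_products`, `query_vec`, and `product_feats`, as well as the toy vectors, are hypothetical and do not reflect the dataset's actual file format or fields.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_products(query_vec, product_feats, top_k=5):
    """Return up to top_k product ids, most similar to the query first.

    Ties are broken by product id so the ordering is deterministic.
    """
    scored = [(cosine(query_vec, feat), pid) for pid, feat in product_feats.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [pid for _, pid in scored[:top_k]]


# Toy, made-up vectors (assumption: query and image features share one space).
query_vec = [1.0, 0.0, 0.0]
product_feats = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.0, 1.0, 0.0],
    "p3": [0.5, 0.5, 0.0],
}

ranking = rank_products(query_vec, product_feats, top_k=2)
print(ranking)
```

In practice the hard part is learning the query encoder and the image-feature projection; this sketch covers only the final ranking step.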
Provider:
Alibaba Cloud Tianchi
Created:
2021-01-08
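Collected Summary
Dataset Introduction

Background and Challenges
Background Overview
This is a real-scenario multimodal retrieval dataset provided by the Taobao e-commerce platform. It contains roughly 3 million pairs of search queries and product image features for training and evaluating multimodal retrieval models. The dataset was designed for the KDD Cup 2020 challenge and focuses on evaluating a model's ability to jointly understand visual and language content.
The content above was collected and summarized by 遇见数据集.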



