mteb/JinaVDRAirbnbSyntheticRetrieval
Source: Hugging Face | Updated: 2025-10-18 | Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/mteb/JinaVDRAirbnbSyntheticRetrieval
Description:
JinaVDRAirbnbSyntheticRetrieval is a multilingual visual-document retrieval dataset containing image and text data in Arabic, German, English, French, Hindi, Hungarian, Japanese, Russian, Spanish, and Chinese. It is divided into three parts: corpus (the image-text documents to be retrieved), qrels (query-document relevance judgments), and queries (the query texts). Each part has a test split, and the file size and number of examples are listed for each.
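The way the three parts fit together can be illustrated with a minimal sketch. It uses small in-memory stand-ins rather than the real data, and the field names (`id`, `text`, `query-id`, `corpus-id`, `score`), the toy rows, and the helper functions `build_relevance` and `recall_at_k` are all assumptions made for illustration, not taken from the dataset card. The real corpus holds document images; plain text is used here only to keep the example self-contained.

```python
from collections import defaultdict

# Toy stand-ins for the three parts. Field names and contents are assumed
# for illustration and may differ from the actual dataset schema.
corpus = [
    {"id": "doc1", "text": "first example document"},
    {"id": "doc2", "text": "second example document"},
    {"id": "doc3", "text": "third example document"},
]
queries = [
    {"id": "q1", "text": "query answered by the first document"},
    {"id": "q2", "text": "query answered by the second document"},
]
qrels = [
    {"query-id": "q1", "corpus-id": "doc1", "score": 1},
    {"query-id": "q2", "corpus-id": "doc2", "score": 1},
]


def build_relevance(qrels_rows):
    """Map each query id to the set of corpus ids judged relevant (score > 0)."""
    relevant = defaultdict(set)
    for row in qrels_rows:
        if row["score"] > 0:
            relevant[row["query-id"]].add(row["corpus-id"])
    return dict(relevant)


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k ranking."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


relevance = build_relevance(qrels)

# Hypothetical retriever output for query q1, best match first.
ranking_for_q1 = ["doc3", "doc1", "doc2"]

print(recall_at_k(ranking_for_q1, relevance["q1"], k=1))  # 0.0
print(recall_at_k(ranking_for_q1, relevance["q1"], k=2))  # 1.0
```

In this sketch the qrels rows are the only link between queries and corpus entries, which is why a retrieval system can be scored without any labels stored on the documents themselves.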
Provider:
mteb



