antonioloison/clean-colpali-dataset
Source: Hugging Face. Updated 2025-04-04. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/antonioloison/clean-colpali-dataset
Description:
The dataset has two parts: corpus and queries. Each corpus entry holds an image, the raw text associated with it, and a language label, which makes this part suitable for tasks that combine images and text. Each queries entry holds the query text, the indices of its positive passages, the indices of its negative passages, and the raw text, which makes this part suitable for text retrieval and question answering. The queries part is split into a training set and a test set for model training and evaluation.
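The description says that each query refers to its positive and negative passages by index. The sketch below shows one way such records could be expanded into training triplets. The field names (`image`, `text`, `language`, `query`, `positive_passages`, `negative_passages`) and the toy records are assumptions made for illustration, not the dataset's confirmed schema, and it is also an assumption that the indices point into the corpus part.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical record layouts: field names are assumptions, not the real schema.
@dataclass
class CorpusEntry:
    image: str      # stand-in for an image object (here just a file name)
    text: str       # raw text associated with the image
    language: str   # language label


@dataclass
class QueryEntry:
    query: str
    positive_passages: List[int]  # assumed to be indices into the corpus
    negative_passages: List[int]  # assumed to be indices into the corpus


def build_triplets(
    queries: List[QueryEntry], corpus: List[CorpusEntry]
) -> List[Tuple[str, CorpusEntry, CorpusEntry]]:
    """Expand each query into (query text, positive entry, negative entry) triplets."""
    triplets = []
    for q in queries:
        for pos in q.positive_passages:
            for neg in q.negative_passages:
                triplets.append((q.query, corpus[pos], corpus[neg]))
    return triplets


# Toy data, made up for this example.
corpus = [
    CorpusEntry("page_0.png", "Quarterly revenue table", "en"),
    CorpusEntry("page_1.png", "Installation instructions", "en"),
    CorpusEntry("page_2.png", "Carte du réseau", "fr"),
]
queries = [QueryEntry("How do I install the device?", [1], [0, 2])]

triplets = build_triplets(queries, corpus)
for query_text, positive, negative in triplets:
    print(query_text, "|", positive.image, "|", negative.image)
```

With real data, the two lists would presumably be filled from the corpus and queries parts of the dataset rather than written by hand, and the field names would need to be checked against the dataset card.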
Provided by:
antonioloison



