
DocReRank/ColHNQue-ColPaliHardNegativeQueries

Source: Hugging Face. Updated: 2025-07-22. Indexed: 2025-11-01.
Download link:
https://hf-mirror.com/datasets/DocReRank/ColHNQue-ColPaliHardNegativeQueries
Description:

The ColHNQue (ColPaliHardNegativeQueries) dataset was introduced to address the limitations of document-level hard negative mining by generating hard negative queries at the page/image level. For a given page and its corresponding positive query, multiple negative queries are generated that are semantically similar but unanswerable from that page. The dataset consists of images paired with one positive query (taken from the original ColPali training set) and several generated negative queries per page. It supports the training of multi-modal retrieval and reranking models, enabling more robust and accurate Retrieval-Augmented Generation (RAG) systems.
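
The record layout described above (one page image, one positive query, several hard negative queries) maps naturally onto training triplets for a retriever or reranker. The following is a minimal sketch of that expansion, not the dataset's actual schema or the authors' pipeline: the `PageRecord` class, its field names, the `to_triplets` helper, and the example queries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class PageRecord:
    # Hypothetical field names; the real dataset columns may differ.
    image_path: str
    positive_query: str
    negative_queries: List[str]


def to_triplets(record: PageRecord) -> Iterator[Tuple[str, str, str]]:
    """Yield one (image, positive, hard negative) triplet per negative query."""
    for negative in record.negative_queries:
        # Skip empty negatives and any that duplicate the positive query.
        if negative and negative != record.positive_query:
            yield (record.image_path, record.positive_query, negative)


# Made-up example record: the negatives are semantically close to the
# positive query but are assumed to be unanswerable from this page.
example = PageRecord(
    image_path="page_0001.png",
    positive_query="What was the total revenue in 2020?",
    negative_queries=[
        "What was the total revenue in 2021?",
        "What was the net profit in 2020?",
    ],
)
triplets = list(to_triplets(example))
```

Each triplet pairs the same page and positive query with a different hard negative, which is the form typically consumed by contrastive or pairwise ranking losses.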
Provider: DocReRank