tom-010/msmarcov2.1-binary-answerability
Source: Hugging Face. Updated: 2024-10-22. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/tom-010/msmarcov2.1-binary-answerability
Resource description:
This dataset is crafted for training a binary classification model to determine whether a given text passage answers a specific user query. Its primary purpose is to enhance our search engine by filtering out irrelevant passages, ensuring that users receive accurate and helpful responses to their questions. The dataset is based on the MS MARCO V2.1 dataset, transformed and reshaped to suit a binary classification task. It contains training and validation sets for model training and evaluation. The dataset features include question ID, question text, passage text, and a binary label. The intended use is to integrate the model into our search engine pipeline to filter out non-relevant passages, improving the overall quality and relevance of search results.
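As a rough illustration of the filtering step described above, the following sketch shows how a binary answerability score could be used to drop non-answering passages. It makes several assumptions: the field names (`query_id`, `query`, `passage`, `label`) are hypothetical, since the listing only names the features informally; the sample rows are invented for illustration and are not taken from the dataset; and `overlap_score` is a toy token-overlap heuristic standing in for a trained classifier, not the author's model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    # Hypothetical field names: the listing only says each row has a
    # question ID, question text, passage text, and a binary label.
    query_id: int
    query: str
    passage: str
    label: int  # 1 = passage answers the query, 0 = it does not


def overlap_score(query: str, passage: str) -> float:
    """Toy stand-in for a trained classifier.

    Returns the fraction of query tokens that also occur in the passage.
    """
    query_tokens = set(query.lower().split())
    passage_tokens = set(passage.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & passage_tokens) / len(query_tokens)


def filter_passages(
    query: str,
    passages: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    threshold: float = 0.5,
) -> List[str]:
    """Keep only passages whose score meets the threshold."""
    return [p for p in passages if score_fn(query, p) >= threshold]


def accuracy(
    examples: List[Example],
    score_fn: Callable[[str, str], float] = overlap_score,
    threshold: float = 0.5,
) -> float:
    """Share of examples where the thresholded score matches the label."""
    if not examples:
        return 0.0
    correct = sum(
        int(score_fn(e.query, e.passage) >= threshold) == e.label
        for e in examples
    )
    return correct / len(examples)


# Invented sample rows, for illustration only.
examples = [
    Example(1, "capital of france", "paris is the capital of france", 1),
    Example(2, "capital of france", "bananas are rich in potassium", 0),
]

kept = filter_passages("capital of france", [e.passage for e in examples])
print(kept)
print(accuracy(examples))
```

In practice, `score_fn` would be replaced by the probability output of a classifier trained on the training split, with `accuracy` (or a similar metric) computed on the validation split.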
Provided by:
tom-010



