redis/langcache-sentencepairs-v1
Hugging Face | Updated 2025-12-19 | Indexed 2025-09-13
Download link:
https://hf-mirror.com/datasets/redis/langcache-sentencepairs-v1
Description:
A large, consolidated collection of English sentence pairs for training and evaluating semantic similarity, retrieval, and re-ranking models. It merges widely used benchmarks into a single schema with consistent fields and ready-made splits. Curated by Redis and shared by Aditeya Baral, it includes source-specific configurations such as apt, mrpc, parade, paws, pit2015, qqp, sick, and stsb. Its intended uses are sentence-pair classification, paraphrase detection, and semantic similarity evaluation. It is not suitable for multilingual evaluation or for uncalibrated similarity regression. The dataset is released under the Apache-2.0 license and is available on the Hugging Face platform.
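As a rough illustration of how one of the source-specific configurations might be loaded, here is a minimal sketch using the Hugging Face `datasets` library. The repository ID is taken from the download link above and the configuration names from the description. Whether each name is exposed on the Hub exactly as spelled here is an assumption, and the helper `load_config` is a hypothetical name introduced only for this example. No column names are assumed; printing the result shows the actual splits and fields.

```python
# Repository ID as it appears in the download link of this listing.
DATASET_ID = "redis/langcache-sentencepairs-v1"

# Configuration names mentioned in the description; the list may not be
# exhaustive, and the exact spelling on the Hub is an assumption.
SOURCE_CONFIGS = ["apt", "mrpc", "parade", "paws", "pit2015", "qqp", "sick", "stsb"]


def load_config(name: str):
    """Load one source-specific configuration (hypothetical helper).

    Requires `pip install datasets` and network access.
    """
    if name not in SOURCE_CONFIGS:
        raise ValueError(f"unknown configuration {name!r}; expected one of {SOURCE_CONFIGS}")
    # Imported lazily so the name check above works without the library installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, name)


if __name__ == "__main__":
    pairs = load_config("mrpc")
    # Prints the available splits and their column names.
    print(pairs)
```

Loading one configuration at a time keeps the download small and makes it easy to evaluate a model per source benchmark before combining results.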
Provider:
redis



