redis/langcache-triplets-v3

Source: Hugging Face
Updated: 2025-12-19
Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/redis/langcache-triplets-v3
Description:

Redis LangCache Triplets Dataset v3 is a large-scale triplet dataset for training sentence encoders with contrastive learning. Each example contains an anchor sentence, a semantically similar positive sentence, and a dissimilar negative sentence. The triplets are generated from the Redis LangCache Sentence Pairs v3 dataset, which combines multiple high-quality paraphrase corpora. It is suitable for tasks such as semantic retrieval and re-ranking. The dataset contains approximately 82 million triplets, all in English, and is licensed under Apache-2.0.
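
To make the anchor/positive/negative structure concrete, here is a minimal sketch of how a single triplet could contribute to a contrastive training objective. It assumes a triplet margin loss over cosine distance, which is one common choice and is not stated by the listing; the function names, the margin value, and the two-dimensional toy vectors standing in for sentence embeddings are all hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss that is zero once the negative is at least `margin`
    farther from the anchor than the positive is (assumed objective)."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (assumed values, not taken from the dataset).
anchor = [1.0, 0.0]
positive = [0.8, 0.6]   # close to the anchor
negative = [0.0, 1.0]   # orthogonal to the anchor

well_separated = triplet_margin_loss(anchor, positive, negative)
violated = triplet_margin_loss(anchor, negative, positive)
print(well_separated, violated)
```

With these toy values the first call incurs no loss because the negative is already far enough away, while swapping the positive and negative yields a positive loss that a training step would try to reduce.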
Provider:
redis