sentence-transformers/yahoo-answers
Dataset Overview
Basic Information
- Language: English
- Multilinguality: Monolingual
- Dataset size: 1M<n<10M
- Task categories: feature extraction, sentence similarity
- Tags: sentence-transformers
- Dataset name: Yahoo Answers
Dataset Configurations
question-answer-pair
- Features:
  - question: string
  - answer: string
- Splits:
  - train:
    - Bytes: 441860501
    - Examples: 681164
- Download size: 296974225 bytes
- Dataset size: 441860501 bytes
title-answer-pair
- Features:
  - title: string
  - answer: string
- Splits:
  - train:
    - Bytes: 532353635
    - Examples: 1198260
- Download size: 359777740 bytes
- Dataset size: 532353635 bytes
title-question-answer-pair
- Features:
  - question: string
  - answer: string
- Splits:
  - train:
    - Bytes: 462195629
    - Examples: 599417
- Download size: 308542541 bytes
- Dataset size: 462195629 bytes
title-question-pair
- Features:
  - title: string
  - questions: string
- Splits:
  - train:
    - Bytes: 190935497
    - Examples: 659896
- Download size: 132675030 bytes
- Dataset size: 190935497 bytes
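As a rough illustration of how the configurations listed above might be loaded, the sketch below assumes the Hugging Face `datasets` library and its `load_dataset` function. The `CONFIG_EXAMPLES` table and the `load_config` and `total_examples` helpers are hypothetical names introduced here, and the `loader` parameter exists only so the helper can be exercised without network access.

```python
from typing import Any, Callable, Optional

REPO_ID = "sentence-transformers/yahoo-answers"

# Train example counts per configuration, copied from the listing above.
CONFIG_EXAMPLES = {
    "question-answer-pair": 681164,
    "title-answer-pair": 1198260,
    "title-question-answer-pair": 599417,
    "title-question-pair": 659896,
}


def load_config(name: str, loader: Optional[Callable[..., Any]] = None) -> Any:
    """Load the train split of one configuration (hypothetical helper)."""
    if name not in CONFIG_EXAMPLES:
        raise ValueError(f"unknown configuration: {name!r}")
    if loader is None:
        # Imported lazily so this module also works without `datasets` installed.
        from datasets import load_dataset

        loader = load_dataset
    return loader(REPO_ID, name, split="train")


def total_examples() -> int:
    """Sum the train examples across all four configurations."""
    return sum(CONFIG_EXAMPLES.values())
```

Calling `load_config("title-answer-pair")` would download that configuration on first use. The four counts sum to 3,138,737 examples, which is consistent with the 1M<n<10M size category.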
Dataset Subsets
title-question-answer-pair subset
- Columns: "question", "answer"
- Column types: str, str
- Example:
  {
    "question": "why doesnt an optical mouse work on a glass table? or even on some surfaces?",
    "answer": "why doesnt an optical mouse work on a glass table? Optical mice use an LED and a camera to rapidly capture images of the surface beneath the mouse. The infomation from the camera is analyzed by a DSP (Digital Signal Processor) and used to detect imperfections in the underlying surface and determine motion. Some materials, such as glass, mirrors or other very shiny, uniform surfaces interfere with the ability of the DSP to accurately analyze the surface beneath the mouse. \nSince glass is transparent and very uniform, the mouse is unable to pick up enough imperfections in the underlying surface to determine motion. Mirrored surfaces are also a problem, since they constantly reflect back the same image, causing the DSP not to recognize motion properly. When the system is unable to see surface changes associated with movement, the mouse will not work properly.",
  }
- Collection strategy: Read the title-answer-pair and title-question-pair datasets, match them on title, filter for single questions and single answers, then concatenate the title and the question to form the question.
- Deduplicated: No
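The collection strategy for this subset can be sketched roughly as follows. This is not the authors' actual script: the function name `build_title_question_answer` is hypothetical, the filter is read here as "keep a title only when it has exactly one question and exactly one answer", and joining title and question with a single space is an assumption based on the example row.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_title_question_answer(
    title_answer: Iterable[Tuple[str, str]],
    title_question: Iterable[Tuple[str, str]],
) -> List[Dict[str, str]]:
    """Join (title, answer) and (title, question) pairs on their title."""
    answers = defaultdict(list)
    for title, answer in title_answer:
        answers[title].append(answer)

    questions = defaultdict(list)
    for title, question in title_question:
        questions[title].append(question)

    rows = []
    for title, title_questions in questions.items():
        title_answers = answers.get(title, [])
        # Assumed reading of the filter: exactly one question and one answer.
        if len(title_questions) == 1 and len(title_answers) == 1:
            rows.append(
                {
                    # Assumed separator: a single space between title and question.
                    "question": f"{title} {title_questions[0]}",
                    "answer": title_answers[0],
                }
            )
    return rows
```

Under these assumptions, the title "why doesnt an optical mouse work on a glass table?" and the question "or even on some surfaces?" combine into the `question` value shown in the example row.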
title-answer-pair subset
- Columns: "title", "answer"
- Column types: str, str
- Example:
  {
    "title": "why doesnt an optical mouse work on a glass table?",
    "answer": "Optical mice use an LED and a camera to rapidly capture images of the surface beneath the mouse. The infomation from the camera is analyzed by a DSP (Digital Signal Processor) and used to detect imperfections in the underlying surface and determine motion. Some materials, such as glass, mirrors or other very shiny, uniform surfaces interfere with the ability of the DSP to accurately analyze the surface beneath the mouse. \nSince glass is transparent and very uniform, the mouse is unable to pick up enough imperfections in the underlying surface to determine motion. Mirrored surfaces are also a problem, since they constantly reflect back the same image, causing the DSP not to recognize motion properly. When the system is unable to see surface changes associated with movement, the mouse will not work properly.",
  }
- Collection strategy: Read the Yahoo Answers (title, answer) dataset.
- Deduplicated: No
title-question-pair subset
- Columns: "title", "question"
- Column types: str, str
- Example:
  {
    "title": "why doesnt an optical mouse work on a glass table?",
    "questions": "or even on some surfaces?",
  }
- Collection strategy: Read the Yahoo Answers (title, question) dataset.
- Deduplicated: No
question-answer-pair subset
- Columns: "question", "answer"
- Column types: str, str
- Example:
  {
    "question": "or even on some surfaces?",
    "answer": "Optical mice use an LED and a camera to rapidly capture images of the surface beneath the mouse. The infomation from the camera is analyzed by a DSP (Digital Signal Processor) and used to detect imperfections in the underlying surface and determine motion. Some materials, such as glass, mirrors or other very shiny, uniform surfaces interfere with the ability of the DSP to accurately analyze the surface beneath the mouse. \nSince glass is transparent and very uniform, the mouse is unable to pick up enough imperfections in the underlying surface to determine motion. Mirrored surfaces are also a problem, since they constantly reflect back the same image, causing the DSP not to recognize motion properly. When the system is unable to see surface changes associated with movement, the mouse will not work properly.",
  }
- Collection strategy: Read the Yahoo Answers (question, answer) dataset.
- Deduplicated: No
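Since none of the four subsets is deduplicated, a downstream user may want to drop exact repeats before training. The sketch below is one minimal way to do that on rows represented as plain dictionaries. The helper name `deduplicate_pairs` is hypothetical, and it only removes exact matches on the chosen columns; near-duplicates are left in place.

```python
from typing import Dict, Iterable, Iterator, Sequence


def deduplicate_pairs(
    rows: Iterable[Dict[str, str]], columns: Sequence[str]
) -> Iterator[Dict[str, str]]:
    """Yield each row whose values in `columns` have not been seen before."""
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in columns)
        if key not in seen:
            seen.add(key)
            yield row
```

For the question-answer-pair subset this could be called as `list(deduplicate_pairs(rows, ["question", "answer"]))`, which keeps the first occurrence of each pair and preserves the original order.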



