
sentence-transformers/natural-questions

Hugging Face · Updated 2024-04-30 · Indexed 2024-06-15
Download link:
https://hf-mirror.com/datasets/sentence-transformers/natural-questions
Resource description:
---
language:
- en
multilinguality:
- monolingual
size_categories:
- 100K<n<1M
task_categories:
- feature-extraction
- sentence-similarity
pretty_name: Natural Questions
tags:
- sentence-transformers
dataset_info:
  config_name: pair
  features:
  - name: query
    dtype: string
  - name: answer
    dtype: string
  splits:
  - name: train
    num_bytes: 67154228
    num_examples: 100231
  download_size: 43995757
  dataset_size: 67154228
configs:
- config_name: pair
  data_files:
  - split: train
    path: pair/train-*
---

# Dataset Card for Natural Questions

This dataset is a collection of question-answer pairs from the Natural Questions dataset. See [Natural Questions](https://ai.google.com/research/NaturalQuestions) for additional information.
This dataset can be used directly with Sentence Transformers to train embedding models.

## Dataset Subsets

### `pair` subset

* Columns: "query", "answer"
* Column types: `str`, `str`
* Examples:
    ```python
    {
      'query': 'the si unit of the electric field is',
      'answer': 'Electric field An electric field is a field that surrounds electric charges. It represents charges attracting or repelling other electric charges by exerting force.[1] [2] Mathematically the electric field is a vector field that associates to each point in space the force, called the Coulomb force, that would be experienced per unit of charge, by an infinitesimal test charge at that point.[3] The units of the electric field in the SI system are newtons per coulomb (N/C), or volts per meter (V/m). Electric fields are created by electric charges, and by time-varying magnetic fields. Electric fields are important in many areas of physics, and are exploited practically in electrical technology. On an atomic scale, the electric field is responsible for the attractive force between the atomic nucleus and electrons that holds atoms together, and the forces between atoms that cause chemical bonding. The electric field and the magnetic field together form the electromagnetic force, one of the four fundamental forces of nature.',
    }
    ```
* Collection strategy: Reading the NQ train dataset from [embedding-training-data](https://huggingface.co/datasets/sentence-transformers/embedding-training-data).
* Deduplicated: No
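As a minimal sketch of how the `pair` subset described in the card might be loaded, the snippet below wraps `datasets.load_dataset` from the Hugging Face `datasets` library. The helper names `load_pairs` and `check_rows` are hypothetical and not part of the card; the dataset ID, config name, and column names are taken from the metadata above. The download needs network access, so the import is kept inside the function.

```python
from typing import Iterable, Mapping

DATASET_ID = "sentence-transformers/natural-questions"  # from the card
SUBSET = "pair"                                         # the only config the card lists


def load_pairs(split: str = "train"):
    """Download one split of the `pair` subset from the Hugging Face Hub."""
    # Imported lazily so that check_rows() below can be used offline.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, SUBSET, split=split)


def check_rows(rows: Iterable[Mapping[str, str]]) -> int:
    """Return the row count after checking each row has string `query` and `answer` fields."""
    count = 0
    for row in rows:
        if not isinstance(row["query"], str) or not isinstance(row["answer"], str):
            raise TypeError(f"row {count} is not a (str, str) pair")
        count += 1
    return count
```

Calling `check_rows(load_pairs())` would then iterate the train split; per the card metadata it should contain 100,231 rows, though that figure is not verified here.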

Provider:
sentence-transformers
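Original information summary

Dataset overview

Basic information

  • Language: English
  • Multilinguality: monolingual
  • Size category: 100K<n<1M examples
  • Task categories: feature extraction, sentence similarity
  • Dataset name: Natural Questions
  • Tags: sentence-transformers

Dataset details

  • Config name: pair
  • Features:
    • query: string
    • answer: string
  • Splits:
    • train:
      • Bytes: 67154228
      • Examples: 100231
  • Download size: 43995757 bytes
  • Dataset size: 67154228 bytes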
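
The size figures above can be put in perspective with a little arithmetic. The sketch below only restates numbers from the metadata; the function name `size_summary` is hypothetical. It works out to roughly 670 bytes per pair, about 64 MiB in total, with the download being about two thirds of the stored size.

```python
# Figures copied from the dataset metadata.
NUM_BYTES = 67_154_228      # size of the train split in bytes
NUM_EXAMPLES = 100_231      # rows in the train split
DOWNLOAD_SIZE = 43_995_757  # bytes transferred on download


def size_summary(num_bytes: int, num_examples: int, download_size: int) -> dict:
    """Derive a few convenience ratios from the raw size metadata."""
    return {
        "avg_bytes_per_example": num_bytes / num_examples,
        "download_ratio": download_size / num_bytes,
        "dataset_mib": num_bytes / 2**20,
    }


summary = size_summary(NUM_BYTES, NUM_EXAMPLES, DOWNLOAD_SIZE)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```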

Configuration

  • Config name: pair
  • Data files:
    • Split: train
    • Path: pair/train-*

Dataset subsets

  • Subset name: pair

  • Columns: "query", "answer"

  • Column types: string, string

  • Example: { 'query': 'the si unit of the electric field is', 'answer': 'Electric field An electric field is a field that surrounds electric charges. It represents charges attracting or repelling other electric charges by exerting force.[1] [2] Mathematically the electric field is a vector field that associates to each point in space the force, called the Coulomb force, that would be experienced per unit of charge, by an infinitesimal test charge at that point.[3] The units of the electric field in the SI system are newtons per coulomb (N/C), or volts per meter (V/m). Electric fields are created by electric charges, and by time-varying magnetic fields. Electric fields are important in many areas of physics, and are exploited practically in electrical technology. On an atomic scale, the electric field is responsible for the attractive force between the atomic nucleus and electrons that holds atoms together, and the forces between atoms that cause chemical bonding. The electric field and the magnetic field together form the electromagnetic force, one of the four fundamental forces of nature.' }

  • Collection strategy: read the NQ train dataset from embedding-training-data.

  • Deduplicated: No
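
Because the subset is not deduplicated, a user may want to drop repeated pairs before training. The following is a minimal sketch under the assumption that "duplicate" means an exact match on both `query` and `answer`; the source does not define a deduplication criterion, and `dedupe_pairs` and the sample rows are hypothetical.

```python
from typing import Iterable, Iterator, Mapping


def dedupe_pairs(rows: Iterable[Mapping[str, str]]) -> Iterator[Mapping[str, str]]:
    """Yield rows in order, skipping any (query, answer) pair already seen.

    Assumption: exact string equality on both fields defines a duplicate.
    """
    seen = set()
    for row in rows:
        key = (row["query"], row["answer"])
        if key not in seen:
            seen.add(key)
            yield row


# Made-up rows for illustration only; they are not taken from the dataset.
sample = [
    {"query": "q1", "answer": "a1"},
    {"query": "q2", "answer": "a2"},
    {"query": "q1", "answer": "a1"},  # exact repeat of the first row
]
unique = list(dedupe_pairs(sample))
print(len(sample), "->", len(unique))
```

The set of seen keys holds every distinct pair in memory, which should be manageable at this dataset's size but is worth keeping in mind for larger corpora.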

Collected summary
Dataset introduction
Background and challenges
Background overview
This dataset comprises question-answer pairs drawn from the Natural Questions dataset, packaged for training sentence embedding models. Its single `pair` subset has two string columns, `query` and `answer`, so it can be used directly in embedding training workflows.
The above content was collected and summarized by 遇见数据集.