
distilabel-internal-testing/embeddings-dataset-answer

Source: Hugging Face · Updated: 2024-06-04 · Indexed: 2024-06-12
Download link:
https://hf-mirror.com/datasets/distilabel-internal-testing/embeddings-dataset-answer
Resource description:
---
size_categories: n<1K
tags:
- synthetic
- distilabel
- rlaif
---

<p align="left">
  <a href="https://github.com/argilla-io/distilabel">
    <img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>
  </a>
</p>

# Dataset Card for embeddings-dataset-answer

This dataset has been created with [distilabel](https://distilabel.argilla.io/).

## Dataset Summary

This dataset contains a `pipeline.yaml` which can be used to reproduce the pipeline that generated it in distilabel using the `distilabel` CLI:

```console
distilabel pipeline run --config "https://huggingface.co/datasets/distilabel-internal-testing/embeddings-dataset-answer/raw/main/pipeline.yaml"
```

or explore the configuration:

```console
distilabel pipeline info --config "https://huggingface.co/datasets/distilabel-internal-testing/embeddings-dataset-answer/raw/main/pipeline.yaml"
```

## Dataset structure

The examples have the following structure per configuration:

<details><summary> Configuration: default </summary><hr>

```json
{
    "anchor": "Astrology: I am a Capricorn Sun Cap moon and cap rising...what does that say about me?",
    "distilabel_metadata": {
        "raw_output_generate_sentence_pair_0": "## Positive\n\nAs a triple Capricorn, you\u0027re likely to be an ambitious, disciplined, and responsible individual with a strong sense of duty and a natural flair for leadership, which can help you achieve great success in your personal and professional life.\n\n## Negative\n\nThe cap on my favorite pen has gone missing, and I\u0027m left struggling to find a suitable replacement."
    },
    "model_name": "meta-llama/Meta-Llama-3-70B-Instruct",
    "negative": "The cap on my favorite pen has gone missing, and I\u0027m left struggling to find a suitable replacement.",
    "positive": "As a triple Capricorn, you\u0027re likely to be an ambitious, disciplined, and responsible individual with a strong sense of duty and a natural flair for leadership, which can help you achieve great success in your personal and professional life."
}
```

This subset can be loaded as:

```python
from datasets import load_dataset

ds = load_dataset("distilabel-internal-testing/embeddings-dataset-answer", "default")
```

Or simply as follows, since there is only one configuration and it is named `default`:

```python
from datasets import load_dataset

ds = load_dataset("distilabel-internal-testing/embeddings-dataset-answer")
```

</details>
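The `raw_output_generate_sentence_pair_0` field in the record above holds the model's unparsed reply, with the two sentences under `## Positive` and `## Negative` headings. A minimal sketch of splitting such a string back into its parts follows. It assumes every reply uses this heading layout (the card shows only this one example), and `parse_sentence_pair` is a hypothetical helper, not part of distilabel.

```python
import re


def parse_sentence_pair(raw_output: str) -> dict:
    """Split a '## Positive ... ## Negative ...' reply into a dict keyed by section name.

    Hypothetical helper: assumes each section starts with a '## <Name>' heading
    on its own line, as in the single example shown in the dataset card.
    """
    sections = {}
    # Capture the heading name, then everything up to the next heading or the end.
    pattern = r"^##\s*(\w+)\s*\n(.*?)(?=^##\s|\Z)"
    for match in re.finditer(pattern, raw_output, flags=re.MULTILINE | re.DOTALL):
        sections[match.group(1).lower()] = match.group(2).strip()
    return sections


# The raw output from the example record, with JSON escapes decoded.
raw_output = (
    "## Positive\n\n"
    "As a triple Capricorn, you're likely to be an ambitious, disciplined, and "
    "responsible individual with a strong sense of duty and a natural flair for "
    "leadership, which can help you achieve great success in your personal and "
    "professional life.\n\n"
    "## Negative\n\n"
    "The cap on my favorite pen has gone missing, and I'm left struggling to "
    "find a suitable replacement."
)

pair = parse_sentence_pair(raw_output)
print(pair["positive"])
print(pair["negative"])
```

For this record, the parsed values match the `positive` and `negative` fields stored alongside the raw output, so the helper is mainly useful for checking or re-deriving those fields.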
Provider:
distilabel-internal-testing
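Summary of Original Information

Dataset Overview

Dataset Name

  • Name: embeddings-dataset-answer

Dataset Creation Tool

  • Tool: distilabel

Dataset Size

  • Size: n<1K

Dataset Tags

  • Tags:
    • synthetic
    • distilabel
    • rlaif

Dataset Structure

  • Structure:
    • The dataset includes a pipeline.yaml file that can be used to reproduce, in distilabel, the pipeline that generated it.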
    • Example records have the following structure (JSON):

      {
          "anchor": "Astrology: I am a Capricorn Sun Cap moon and cap rising...what does that say about me?",
          "distilabel_metadata": {
              "raw_output_generate_sentence_pair_0": "## Positive\n\nAs a triple Capricorn, you\u0027re likely to be an ambitious, disciplined, and responsible individual with a strong sense of duty and a natural flair for leadership, which can help you achieve great success in your personal and professional life.\n\n## Negative\n\nThe cap on my favorite pen has gone missing, and I\u0027m left struggling to find a suitable replacement."
          },
          "model_name": "meta-llama/Meta-Llama-3-70B-Instruct",
          "negative": "The cap on my favorite pen has gone missing, and I\u0027m left struggling to find a suitable replacement.",
          "positive": "As a triple Capricorn, you\u0027re likely to be an ambitious, disciplined, and responsible individual with a strong sense of duty and a natural flair for leadership, which can help you achieve great success in your personal and professional life."
      }

Dataset Loading

  • How to load:
    • Load the dataset with the following Python code:

      from datasets import load_dataset

      ds = load_dataset("distilabel-internal-testing/embeddings-dataset-answer", "default")

    • Or, more simply, since `default` is the only configuration:

      from datasets import load_dataset

      ds = load_dataset("distilabel-internal-testing/embeddings-dataset-answer")
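
Once loaded, each record carries an `anchor`, a `positive`, and a `negative` sentence, which is the triplet shape commonly used to train embedding models. The sketch below shows one way to collect such triplets from any iterable of record dictionaries. `Triplet` and `to_triplets` are hypothetical names introduced for illustration, and the second row in `rows` is an invented incomplete record used only to show the filtering.

```python
from typing import Iterable, List, Mapping, NamedTuple


class Triplet(NamedTuple):
    """One (anchor, positive, negative) training example."""
    anchor: str
    positive: str
    negative: str


def to_triplets(rows: Iterable[Mapping[str, object]]) -> List[Triplet]:
    """Collect triplets from record dicts, skipping rows with a missing or empty field."""
    triplets = []
    for row in rows:
        values = [row.get(key) for key in Triplet._fields]
        # Keep the row only if all three fields are non-empty strings.
        if all(isinstance(value, str) and value.strip() for value in values):
            triplets.append(Triplet(*values))
    return triplets


rows = [
    {
        # The example record from the dataset card.
        "anchor": "Astrology: I am a Capricorn Sun Cap moon and cap rising...what does that say about me?",
        "positive": (
            "As a triple Capricorn, you're likely to be an ambitious, disciplined, and "
            "responsible individual with a strong sense of duty and a natural flair for "
            "leadership, which can help you achieve great success in your personal and "
            "professional life."
        ),
        "negative": "The cap on my favorite pen has gone missing, and I'm left struggling to find a suitable replacement.",
        "model_name": "meta-llama/Meta-Llama-3-70B-Instruct",
    },
    # Hypothetical incomplete row (not from the dataset) to show that it is skipped.
    {"anchor": "placeholder anchor", "positive": None, "negative": "placeholder negative"},
]

triplets = to_triplets(rows)
print(len(triplets), triplets[0].anchor)
```

With the real dataset, the same function could be applied to a split of `ds`, for example `to_triplets(ds["train"])`. The split name `train` is an assumption here, since the card does not list the available splits.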
