Andresckamilo/distilset2

Source: Hugging Face. Updated 2024-06-02; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/Andresckamilo/distilset2
Resource description:
---
size_categories: n<1K
dataset_info:
  features:
  - name: instruction
    dtype: string
  - name: completion
    dtype: string
  - name: meta
    struct:
    - name: category
      dtype: string
    - name: completion
      dtype: string
    - name: id
      dtype: int64
    - name: input
      dtype: 'null'
    - name: motivation_app
      dtype: 'null'
    - name: prompt
      dtype: string
    - name: source
      dtype: string
    - name: subcategory
      dtype: string
  - name: model_names
    sequence: string
  - name: generations
    sequence: string
  splits:
  - name: train
    num_bytes: 38944
    num_examples: 10
  download_size: 41785
  dataset_size: 38944
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
tags:
- synthetic
- distilabel
- rlaif
---

<p align="left">
  <a href="https://github.com/argilla-io/distilabel">
    <img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>
  </a>
</p>

# Dataset Card for distilset2

This dataset has been created with [distilabel](https://distilabel.argilla.io/).

## Dataset Summary

This dataset contains a `pipeline.yaml` which can be used to reproduce the pipeline that generated it in distilabel using the `distilabel` CLI:

```console
distilabel pipeline run --config "https://huggingface.co/datasets/Andresckamilo/distilset2/raw/main/pipeline.yaml"
```

or explore the configuration:

```console
distilabel pipeline info --config "https://huggingface.co/datasets/Andresckamilo/distilset2/raw/main/pipeline.yaml"
```

## Dataset structure

The examples have the following structure per configuration:

<details><summary> Configuration: default </summary><hr>

```json
{
    "completion": "Denote the number of chocolates each person has by the letter of their first name. We know that\nA = D + 12\nD = R + 6\nA = 2 * R\n\nThus, A = (R + 6) + 12 = R + 18\nSince also A = 2 * R, this means 2 * R = R + 18\nHence R = 18\nHence D = 18 + 6 = 24",
    "generations": [
        "Given the problem, let\u0027s define:\n\n- \\( A \\) as the number of chocolates Arianna has.\n- \\( D \\) as the number of chocolates Danny has.\n- \\( R \\) as the number of chocolates Robbie has.\n\nAccording to the problem, we have the following relationships:\n\n1. \\( A = D + 12 \\) (Arianna has 12 chocolates more than Danny).\n2. \\( D = R + 6 \\) (Danny has 6 chocolates more than Robbie).\n3. \\( A = 2R \\) (Arianna has twice as many chocolates as Robbie).\n\nWe need to determine the number of chocolates Danny has, \\( D \\).\n\nFirst, let\u0027s express \\( A \\) in terms of \\( R \\) using the third relationship:\n\\[ A = 2R \\]\n\nNext, substitute \\( A = 2R \\) into the first relationship \\( A = D + 12 \\):\n\\[ 2R = D + 12 \\]\n\nNow we can solve for \\( D \\):\n\\[ D = 2R - 12 \\]\n\nWe also know from the second relationship that \\( D = R + 6 \\). So we can set the two expressions for \\( D \\) equal to each other:\n\\[ 2R - 12 = R + 6 \\]\n\nSolving for \\( R \\):\n\\[ 2R - R = 6 + 12 \\]\n\\[ R = 18 \\]\n\nNow that we have \\( R = 18 \\), we can find \\( D \\) using \\( D = R + 6 \\):\n\\[ D = 18 + 6 \\]\n\\[ D = 24 \\]\n\nThus, Danny has \\( 24 \\) chocolates.",
        "Let\u0027s denote the number of chocolates that Robbie has as R.\n\nAccording to the problem, Arianna has twice as many chocolates as Robbie, so Arianna has 2R chocolates.\n\nDanny has 6 chocolates more than Robbie, so Danny has R + 6 chocolates.\n\nArianna has 12 chocolates more than Danny, so we can set up the equation 2R = R + 6 + 12.\n\nSimplifying the equation, we get 2R = R + 18.\n\nSubtracting R from both sides, we get R = 18.\n\nSo, Danny has R + 6 = 18 + 6 = 24 chocolates.\n\nThe answer is: 24",
        null
    ],
    "instruction": "Arianna has 12 chocolates more than Danny. Danny has 6 chocolates more than Robbie. Arianna has twice as many chocolates as Robbie has. How many chocolates does Danny have?",
    "meta": {
        "category": "Question Answering",
        "completion": "Denote the number of chocolates each person has by the letter of their first name. We know that\nA = D + 12\nD = R + 6\nA = 2 * R\n\nThus, A = (R + 6) + 12 = R + 18\nSince also A = 2 * R, this means 2 * R = R + 18\nHence R = 18\nHence D = 18 + 6 = 24",
        "id": 0,
        "input": null,
        "motivation_app": null,
        "prompt": "Arianna has 12 chocolates more than Danny. Danny has 6 chocolates more than Robbie. Arianna has twice as many chocolates as Robbie has. How many chocolates does Danny have?",
        "source": "surge",
        "subcategory": "Math"
    },
    "model_names": [
        "gpt-4o",
        "deepseek-chat",
        null
    ]
}
```

This subset can be loaded as:

```python
from datasets import load_dataset

ds = load_dataset("Andresckamilo/distilset2", "default")
```

Or simply as follows, since there is only one configuration and it is named `default`:

```python
from datasets import load_dataset

ds = load_dataset("Andresckamilo/distilset2")
```

</details>
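
The example record stores model outputs in two parallel lists, `model_names` and `generations`, and both can contain `null` entries. A minimal sketch of one way to pair them up while skipping the empty slots follows. The `record` dict is a hand-written, abbreviated stand-in for one row (its strings are shortened for illustration), and `pair_generations` is a hypothetical helper name, not part of `datasets` or `distilabel`.

```python
from typing import Optional


def pair_generations(
    model_names: list[Optional[str]],
    generations: list[Optional[str]],
) -> dict[str, str]:
    """Map each model name to its generation, dropping empty slots."""
    paired: dict[str, str] = {}
    for name, text in zip(model_names, generations):
        # A null (None) in either list is treated as "no output for this slot".
        if name is None or text is None:
            continue
        paired[name] = text
    return paired


# Abbreviated stand-in for one row; real rows hold the full-length strings.
record = {
    "instruction": "How many chocolates does Danny have?",
    "model_names": ["gpt-4o", "deepseek-chat", None],
    "generations": ["Thus, Danny has 24 chocolates.", "The answer is: 24", None],
}

by_model = pair_generations(record["model_names"], record["generations"])
for name, text in by_model.items():
    print(f"{name}: {text}")
```

Assuming the dataset loads as described in the card, the same helper could presumably be applied to a real row, for example `ds["train"][0]`, since each row exposes the same `model_names` and `generations` keys.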
Provider:
Andresckamilo
