gabrielmbmb/test
Source: Hugging Face · Updated: 2024-04-10 · Indexed: 2024-06-11
Download link:
https://hf-mirror.com/datasets/gabrielmbmb/test
Resource description:
---
size_categories: n<1K
config_names:
- to_argilla
tags:
- synthetic
- distilabel
- rlaif
---
<p align="left">
<a href="https://github.com/argilla-io/distilabel">
<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>
</a>
</p>
# Dataset Card for test
This dataset has been created with [Distilabel](https://distilabel.argilla.io/).
## Dataset Summary
This dataset includes a `pipeline.yaml` file that can be used to reproduce the pipeline that generated it, using the `distilabel` CLI:
```console
distilabel pipeline run --config "https://huggingface.co/datasets/gabrielmbmb/test/raw/main/pipeline.yaml"
```
or explore the configuration:
```console
distilabel pipeline info --config "https://huggingface.co/datasets/gabrielmbmb/test/raw/main/pipeline.yaml"
```
## Dataset Structure
The examples have the following structure per configuration:
<details><summary> Configuration: to_argilla </summary><hr>
```json
{
"completion": "Denote the number of chocolates each person has by the letter of their first name. We know that\nA = D + 12\nD = R + 6\nA = 2 * R\n\nThus, A = (R + 6) + 12 = R + 18\nSince also A = 2 * R, this means 2 * R = R + 18\nHence R = 18\nHence D = 18 + 6 = 24",
"generation": "Let\u0027s use variables to represent the number of chocolates each person has:\n\nLet x be the number of chocolates Robbie has.\nThen Danny has x + 6 chocolates.\nAnd Arianna has x + 6 + 12 = x + 18 chocolates.\n\nGiven that Arianna has twice as many chocolates as Robbie, we can write the equation:\nx + 18 = 2x\n\nSolving for x:\nx = 18\n\nSo, Robbie has 18 chocolates, Danny has 18 + 6 = 24 chocolates.",
"instruction": "Arianna has 12 chocolates more than Danny. Danny has 6 chocolates more than Robbie. Arianna has twice as many chocolates as Robbie has. How many chocolates does Danny have?",
"meta": {
"category": "Question Answering",
"completion": "Denote the number of chocolates each person has by the letter of their first name. We know that\nA = D + 12\nD = R + 6\nA = 2 * R\n\nThus, A = (R + 6) + 12 = R + 18\nSince also A = 2 * R, this means 2 * R = R + 18\nHence R = 18\nHence D = 18 + 6 = 24",
"id": 0,
"input": null,
"motivation_app": null,
"prompt": "Arianna has 12 chocolates more than Danny. Danny has 6 chocolates more than Robbie. Arianna has twice as many chocolates as Robbie has. How many chocolates does Danny have?",
"source": "surge",
"subcategory": "Math"
},
"model_name": "gpt-3.5-turbo"
}
```
This subset can be loaded as:
```python
from datasets import load_dataset
ds = load_dataset("gabrielmbmb/test", "to_argilla")
```
</details>
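As a rough illustration of how the `completion` and `generation` fields relate, the sketch below takes a hand-abridged copy of the example record above (not loaded from the Hub) and compares the last integer in each text. The `last_integer` helper and the "last number is the answer" heuristic are assumptions made for this example only; they are not part of distilabel or of this dataset.

```python
import re
from typing import Optional

# Abridged, hand-copied version of the example record above; only the fields used here.
record = {
    "completion": "Hence R = 18\nHence D = 18 + 6 = 24",
    "generation": "So, Robbie has 18 chocolates, Danny has 18 + 6 = 24 chocolates.",
    "model_name": "gpt-3.5-turbo",
}


def last_integer(text: str) -> Optional[int]:
    """Return the last integer found in text, or None if there is none (hypothetical helper)."""
    matches = re.findall(r"-?\d+", text)
    return int(matches[-1]) if matches else None


# Treat the last integer in each text as its final answer (a heuristic, not a general parser).
reference = last_integer(record["completion"])
generated = last_integer(record["generation"])
answers_match = reference is not None and reference == generated

print(f"{record['model_name']}: reference={reference}, generated={generated}, match={answers_match}")
```

A heuristic like this only suits short arithmetic answers such as the one in the example; free-form generations would need a more careful comparison.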
Provided by:
gabrielmbmb
Summary of original information
Dataset overview
Basic information
- Name: test
- Creation tool: Distilabel
- Size category: n<1K
- Configuration name: to_argilla
- Tags: synthetic, distilabel, rlaif
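Dataset contents
- Structure: includes a `pipeline.yaml` file that can be used to reproduce, in distilabel, the pipeline that generated this dataset.
- Example structure: the examples in each configuration contain the fields `completion`, `generation`, `instruction`, `meta`, and `model_name`.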
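The field list above can be turned into a quick sanity check on a row. This is a minimal sketch: `EXPECTED_FIELDS` is copied from the example record in this card, `missing_fields` is a hypothetical helper, and `row` is a hand-written stand-in rather than data loaded from the Hub.

```python
from typing import Any, Dict, List

# Top-level field names taken from the example record in this card.
EXPECTED_FIELDS = ("completion", "generation", "instruction", "meta", "model_name")


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """Return the expected top-level fields that are absent from record (hypothetical helper)."""
    return [name for name in EXPECTED_FIELDS if name not in record]


# Hand-written stand-in for one row; the values are placeholders, not real data.
row = {
    "completion": "placeholder",
    "generation": "placeholder",
    "instruction": "placeholder",
    "meta": {"id": 0},
    "model_name": "gpt-3.5-turbo",
}

print(missing_fields(row))
```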
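Dataset usage
- Reproduce the pipeline: run the `pipeline.yaml` file with the `distilabel` CLI.
- Load the dataset: load a specific configuration with the `datasets` library.
```python
from datasets import load_dataset
ds = load_dataset("gabrielmbmb/test", "to_argilla")
```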



