
TheHorme/my-distiset-84833152

Source: Hugging Face. Updated: 2025-04-11. Indexed: 2025-11-29.
Download link:
https://hf-mirror.com/datasets/TheHorme/my-distiset-84833152
Resource description:
---
size_categories: n<1K
task_categories:
- text-generation
- text2text-generation
- question-answering
dataset_info:
  features:
  - name: prompt
    dtype: string
  - name: completion
    dtype: 'null'
  - name: system_prompt
    dtype: string
  splits:
  - name: train
    num_bytes: 98493
    num_examples: 60
  download_size: 31370
  dataset_size: 98493
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
tags:
- synthetic
- distilabel
- rlaif
- datacraft
---

<p align="left">
  <a href="https://github.com/argilla-io/distilabel">
    <img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>
  </a>
</p>

# Dataset Card for my-distiset-84833152

This dataset has been created with [distilabel](https://distilabel.argilla.io/).

## Dataset Summary

This dataset contains a `pipeline.yaml` file that can be used to reproduce the pipeline that generated it, using the `distilabel` CLI:

```console
distilabel pipeline run --config "https://huggingface.co/datasets/TheHorme/my-distiset-84833152/raw/main/pipeline.yaml"
```

or to explore the configuration:

```console
distilabel pipeline info --config "https://huggingface.co/datasets/TheHorme/my-distiset-84833152/raw/main/pipeline.yaml"
```

## Dataset structure

The examples have the following structure per configuration:

<details><summary> Configuration: default </summary><hr>

```json
{
  "completion": null,
  "prompt": "Create a dataset of linear equations in the form of (a*x + b*y = c) where a, b, and c are integers and are chosen randomly from the set of integers between 1 and 100. The dataset should contain 1000 equations.\n\nHere is the dataset:\n\n1. 43x + 91y = 13\n2. 11x + 19y = 7\n3. 85x + 31y = 21\n...\nI will be adding more equations to this dataset as we go, so I can query for a specific amount.\n\nSo, here is the tool. What is the 100th equation of this dataset?\n\nTo find the 100th equation of the dataset, I will not have to look at the first 99 equations, I can just generate the 100th equation. \n\n",
  "system_prompt": "You are an AI assistant designed to generate high-quality datasets for training machine learning models, particularly in the field of algebraic equations. Your purpose is to create a comprehensive and diverse dataset that caters to various algebraic concepts, including but not limited to, linear equations, quadratic equations, polynomial equations, rational equations, and systems of equations. The dataset should include a wide range of variables, coefficients, and constants to cover different types of algebraic expressions. Ensure that the dataset is well-structured, correctly formatted, and free of errors. User questions are direct and concise."
}
```

This subset can be loaded as:

```python
from datasets import load_dataset

ds = load_dataset("TheHorme/my-distiset-84833152", "default")
```

Or simply as follows, since there is only one configuration and it is named `default`:

```python
from datasets import load_dataset

ds = load_dataset("TheHorme/my-distiset-84833152")
```

</details>
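
Once rows are loaded, it can help to confirm that they match the features declared in the card metadata (`prompt` and `system_prompt` as strings, `completion` as null). The following is a minimal sketch of such a check in plain Python. The names `EXPECTED_FEATURES` and `check_record` are hypothetical helpers introduced here for illustration, not part of `datasets` or `distilabel`, and the `sample` row is an abbreviated stand-in rather than an actual record.

```python
from typing import Any, Mapping

# Feature names and types as declared under `dataset_info.features` in the card.
# (Hypothetical helper table; not part of any library.)
EXPECTED_FEATURES = {
    "prompt": str,
    "completion": type(None),  # dtype 'null': completions are empty in this dataset
    "system_prompt": str,
}


def check_record(record: Mapping[str, Any]) -> list[str]:
    """Return a list of problems found in one record (empty if it conforms)."""
    problems = []
    for name, expected_type in EXPECTED_FEATURES.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(record[name]).__name__}"
            )
    # Report any fields that the card metadata does not declare.
    extra = set(record) - set(EXPECTED_FEATURES)
    problems.extend(f"unexpected field: {name}" for name in sorted(extra))
    return problems


# Abbreviated stand-in for one row of the train split (assumption, not real data).
sample = {
    "prompt": "What is the 100th equation of this dataset?",
    "completion": None,
    "system_prompt": "You are an AI assistant designed to generate high-quality datasets.",
}
print(check_record(sample))
```

With the dataset loaded as shown in the card, the same function could be applied to each row, for example `for row in ds["train"]: check_record(row)`, assuming rows are returned as dictionaries keyed by feature name.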
Provided by:
TheHorme