dataformer/dolly-llama-qa

Name: dataformer/dolly-llama-qa
Creator: dataformer
Published: 2024-08-07 10:03:21
License: no description available

Source: Hugging Face. Updated: 2024-08-07. Indexed: 2025-04-26.

Download link:

https://hf-mirror.com/datasets/dataformer/dolly-llama-qa

Description:

---
language:
- en
size_categories:
- 1K<n<10K
language_bcp47:
- en-US
configs:
- config_name: llama_3.1_8B
  data_files:
  - split: train
    path: llama_3-1_8B_Instruct.jsonl
- config_name: llama_3_8B
  data_files:
  - split: train
    path: llama_3_8B_Instruct.jsonl
task_categories:
- text-generation
- question-answering
tags:
- synthetic
---

# Dataset Card for dolly-llama-qa

This dataset has been created with [dataformer](https://github.com/DataformerAI/dataformer).

## Dataset Details

### Dataset Description

The dolly-llama-qa dataset is a synthetic QA pair dataset created using the context from [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k). We used the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) and [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) models for the generation and evolution steps. OpenAI's gpt-4o was used to evaluate the refined questions and refined answers.

## Dataset Columns

* `context`: the context taken from [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k).
* `seed_question`: the seed question generated from the context using the instruct model.
* `refined_question`: the question obtained by evolving the seed question using the instruct model.
* `initial_answer`: the answer to the `refined_question`, generated using the instruct model.
* `refined_answer`: the answer obtained by evolving the initial answer using the instruct model.
* `question_quality`: the quality of the refined question on a scale of 1-10, as evaluated by gpt-4o.
* `answer_quality`: the quality of the refined answer on a scale of 1-10, as evaluated by gpt-4o.
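
The quality columns lend themselves to filtering rows before training or evaluation. Below is a minimal sketch, assuming each line of a JSONL file is one JSON object with the columns listed above. The function name `filter_by_quality`, the threshold of 7, and the sample rows are hypothetical and are not part of the dataset. The card does not say whether the scores are stored as numbers or strings, so the sketch coerces them with `float()` and skips rows it cannot parse.

```python
import json
from typing import Iterable, Iterator


def filter_by_quality(
    lines: Iterable[str],
    min_question: float = 7,  # hypothetical threshold, not from the dataset card
    min_answer: float = 7,    # hypothetical threshold, not from the dataset card
) -> Iterator[dict]:
    """Yield JSONL records whose gpt-4o scores meet both thresholds."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        try:
            question_score = float(record["question_quality"])
            answer_score = float(record["answer_quality"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric scores
        if question_score >= min_question and answer_score >= min_answer:
            yield record


# Hypothetical sample rows that only mimic the column layout of the dataset.
sample_lines = [
    json.dumps({"refined_question": "Q1", "refined_answer": "A1",
                "question_quality": 9, "answer_quality": 8}),
    json.dumps({"refined_question": "Q2", "refined_answer": "A2",
                "question_quality": 9, "answer_quality": 4}),
    json.dumps({"refined_question": "Q3", "refined_answer": "A3",
                "question_quality": "n/a", "answer_quality": 8}),
]

kept = list(filter_by_quality(sample_lines))
```

The same function should work on an open file handle for a downloaded copy of `llama_3-1_8B_Instruct.jsonl` or `llama_3_8B_Instruct.jsonl`, since iterating over a text file yields one line at a time.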

Provider:

dataformer
