goendalf666/sales-conversations

Name: goendalf666/sales-conversations
Creator: goendalf666
Published: 2023-10-04 20:39:04
License: No description available

Source: Hugging Face. Updated 2023-10-04; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/goendalf666/sales-conversations


Resource description:

---
language:
- en
size_categories:
- 1K<n<10K
task_categories:
- conversational
dataset_info:
  features:
  - name: '0'
    dtype: string
  - name: '1'
    dtype: string
  - name: '2'
    dtype: string
  - name: '3'
    dtype: string
  - name: '4'
    dtype: string
  - name: '5'
    dtype: string
  - name: '6'
    dtype: string
  - name: '7'
    dtype: string
  - name: '8'
    dtype: string
  - name: '9'
    dtype: string
  - name: '10'
    dtype: string
  - name: '11'
    dtype: string
  - name: '12'
    dtype: string
  - name: '13'
    dtype: string
  - name: '14'
    dtype: string
  - name: '15'
    dtype: string
  - name: '16'
    dtype: string
  - name: '17'
    dtype: string
  - name: '18'
    dtype: string
  - name: '19'
    dtype: string
  splits:
  - name: train
    num_bytes: 6821725
    num_examples: 3412
  download_size: 2644154
  dataset_size: 6821725
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
tags:
- sales
---

# Dataset Card for "sales-conversations"

This dataset was created to train a sales agent chatbot that can convince people.

The initial idea came from the paper "Textbooks Are All You Need": https://arxiv.org/abs/2306.11644

gpt-3.5-turbo was used for the generation.

# Structure

Each conversation alternates between a customer and a salesman: customer, salesman, customer, salesman, and so on.
The customer always starts the conversation.
Who ends the conversation is not defined.

# Generation

Note that a textbook dataset is mandatory for this conversation generation.
These examples rely on the following textbook dataset:
https://huggingface.co/datasets/goendalf666/sales-textbook_for_convincing_and_selling

The data generation code can be found here: https://github.com/tom813/salesGPT_foundation/blob/main/data_generation/textbook_and_conversation_gen.py

The following prompt was used to create a conversation:

```
import random

def create_random_prompt(chapter, roles=["Customer", "Salesman"], range_vals=(3, 7), industries=None):
    if industries is None:
        industries = ["tech", "health", "finance"]  # default industries; replace with your default list if different

    # x: number of customer/salesman exchanges per conversation
    x = random.randint(*range_vals)

    # y: number of conversations to request, the largest value in 8..3 with y * x < 27
    y = 0
    for i in reversed(range(3, 9)):  # Generalized loop for range of values
        if i * x < 27:
            y = i
            break

    # Build the turn-by-turn template that the model is asked to follow
    conversation_structure = ""
    for i in range(1, x+1):
        conversation_structure += f"""
        {roles[0]}: #{i}. sentence of {roles[0].lower()}
        {roles[1]}: #{i}. sentence of {roles[1].lower()}"""

    prompt = f"""Here is a chapter from a textbook about convincing people.
    The purpose of this data is to use it to fine tune a llm.
    Generate conversation examples that are based on the chapter that is provided and would help an ai to learn the topic by examples.
    Focus only on the topic that is given in the chapter when generating the examples.
    Let the example be in the {random.choice(industries)} industry.

    Follow this structure and put each conversation in a list of objects in json format. Only return the json nothing more:
    {conversation_structure}

    Generate {y} lists of those conversations

    Chapter:{chapter}"""

    return prompt
```

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
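The card describes each row as up to 20 string columns named '0' through '19', with the customer speaking first and the two roles alternating. A minimal sketch of turning one such row into role-tagged turns follows. The helper name `row_to_turns`, the output keys, and the assumption that unused trailing columns are `None` or empty are all illustrative and are not confirmed by the card; the sample row is made up.

```python
from typing import Dict, List, Optional

# Role order taken from the card: the customer always starts.
ROLES = ("Customer", "Salesman")


def row_to_turns(row: Dict[str, Optional[str]], max_turns: int = 20) -> List[Dict[str, str]]:
    """Convert a row with columns '0'..'19' into a list of role-tagged turns.

    Hypothetical helper: stops at the first missing or empty column,
    assuming shorter conversations leave trailing columns unset.
    """
    turns: List[Dict[str, str]] = []
    for i in range(max_turns):
        text = row.get(str(i))
        if not text:
            break
        # Even column index -> customer, odd column index -> salesman.
        turns.append({"role": ROLES[i % 2], "text": text})
    return turns


# Made-up row for illustration only; not taken from the dataset.
example_row = {
    "0": "I'm not sure this plan fits our budget.",
    "1": "What budget range are you working with?",
    "2": "Around two hundred a month.",
    "3": None,
}

for turn in row_to_turns(example_row):
    print(f'{turn["role"]}: {turn["text"]}')
```

With the Hugging Face `datasets` library installed, rows could presumably be obtained with `load_dataset("goendalf666/sales-conversations", split="train")` and passed to the helper one at a time. Whether the stored strings already carry a speaker prefix is not stated in the card, so check a few rows before relying on the role tags.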

Provided by:

goendalf666

Original information summary

Dataset overview

Purpose: to train a sales agent chatbot that can convince people.
