
goendalf666/sales-conversations

Source: Hugging Face · Updated 2023-10-04 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/goendalf666/sales-conversations
Resource description:
---
language:
- en
size_categories:
- 1K<n<10K
task_categories:
- conversational
dataset_info:
  features:
  - name: '0'
    dtype: string
  - name: '1'
    dtype: string
  - name: '2'
    dtype: string
  - name: '3'
    dtype: string
  - name: '4'
    dtype: string
  - name: '5'
    dtype: string
  - name: '6'
    dtype: string
  - name: '7'
    dtype: string
  - name: '8'
    dtype: string
  - name: '9'
    dtype: string
  - name: '10'
    dtype: string
  - name: '11'
    dtype: string
  - name: '12'
    dtype: string
  - name: '13'
    dtype: string
  - name: '14'
    dtype: string
  - name: '15'
    dtype: string
  - name: '16'
    dtype: string
  - name: '17'
    dtype: string
  - name: '18'
    dtype: string
  - name: '19'
    dtype: string
  splits:
  - name: train
    num_bytes: 6821725
    num_examples: 3412
  download_size: 2644154
  dataset_size: 6821725
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
tags:
- sales
---

# Dataset Card for "sales-conversations"

This dataset was created to train a sales agent chatbot that can convince people.

The initial idea came from "Textbooks Are All You Need": https://arxiv.org/abs/2306.11644

gpt-3.5-turbo was used for the generation.

# Structure

Each conversation has a customer and a salesman who speak in alternating turns: customer, salesman, customer, salesman, etc.
The customer always starts the conversation.
Who ends the conversation is not defined.

# Generation

Note that a textbook dataset is mandatory for this conversation generation.
These examples rely on the following textbook dataset:
https://huggingface.co/datasets/goendalf666/sales-textbook_for_convincing_and_selling

The data generation code can be found here:
https://github.com/tom813/salesGPT_foundation/blob/main/data_generation/textbook_and_conversation_gen.py

The following prompt was used to create a conversation:

```
import random


def create_random_prompt(chapter, roles=("Customer", "Salesman"), range_vals=(3, 7), industries=None):
    if industries is None:
        industries = ["tech", "health", "finance"]  # default industries; replace with your default list if different

    # x: number of customer/salesman exchanges per conversation
    x = random.randint(*range_vals)

    # y: number of conversations to request, the largest value in 3..8 with y * x < 27
    y = 0
    for i in reversed(range(3, 9)):  # Generalized loop for range of values
        if i * x < 27:
            y = i
            break

    # Build the turn-by-turn template the model is asked to follow
    conversation_structure = ""
    for i in range(1, x + 1):
        conversation_structure += f"""
        {roles[0]}: #{i}. sentence of {roles[0].lower()}
        {roles[1]}: #{i}. sentence of {roles[1].lower()}"""

    prompt = f"""Here is a chapter from a textbook about convincing people.
    The purpose of this data is to use it to fine tune a llm.
    Generate conversation examples that are based on the chapter that is provided and would help an ai to learn the topic by examples.
    Focus only on the topic that is given in the chapter when generating the examples.
    Let the example be in the {random.choice(industries)} industry.

    Follow this structure and put each conversation in a list of objects in json format. Only return the json nothing more:
    {conversation_structure}

    Generate {y} lists of those conversations

    Chapter:{chapter}"""

    return prompt
```

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
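As a rough sketch of how the layout described under Structure could be consumed, the snippet below turns one row (string columns `'0'` to `'19'`) into an ordered list of `(role, text)` turns. Roles are assigned purely by column position, following the card's statement that the customer speaks first and the speakers alternate. The helper names `row_to_turns` and `load_conversations`, the sample row, and the guess that unused trailing columns are `None` or blank are all assumptions for illustration, not something the card specifies.

```python
from typing import Optional

ROLES = ("customer", "salesman")  # order stated in the card: the customer speaks first
NUM_COLUMNS = 20                  # columns '0' .. '19' listed in dataset_info


def row_to_turns(row: dict[str, Optional[str]]) -> list[tuple[str, str]]:
    """Convert one dataset row into an ordered list of (role, text) turns."""
    turns = []
    for i in range(NUM_COLUMNS):
        text = row.get(str(i))
        # Assumption: unused trailing columns are None or blank, so stop there.
        if text is None or not text.strip():
            break
        turns.append((ROLES[i % 2], text.strip()))
    return turns


def load_conversations():
    """Load the train split from the Hub (needs `pip install datasets` and network access)."""
    from datasets import load_dataset  # imported lazily so row_to_turns works without it
    ds = load_dataset("goendalf666/sales-conversations", split="train")
    return [row_to_turns(row) for row in ds]


# Hypothetical row, made up for illustration (not taken from the dataset).
example_row = {
    "0": "Hi, I'm looking for a new laptop.",
    "1": "Great, what will you mainly use it for?",
    "2": "Mostly video editing.",
    "3": None,
}
example_turns = row_to_turns(example_row)
```

Stopping at the first empty column keeps the parity-based role assignment valid. If the real cells already carry a speaker prefix, that prefix would be a more reliable source for the role than the column index.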
Provider:
goendalf666
Summary of original information

Dataset overview

  • Purpose: to train a sales agent chatbot that can persuade people.