five

V1rtucious/Ecom-Chatbot-Finetuning-Dataset

收藏
Hugging Face2026-03-25 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/V1rtucious/Ecom-Chatbot-Finetuning-Dataset
下载链接
链接失效反馈
官方服务:
资源简介:
--- license: apache-2.0 language: - en tags: - e-commerce - chatbot - customer-support - conversational - fine-tuning configs: - config_name: default data_files: - split: amazon_reviews path: data/amazon_reviews-* - split: amazon_meta path: data/amazon_meta-* - split: asos path: data/asos-* - split: bitext_retail path: data/bitext_retail-* - split: bitext_customer path: data/bitext_customer-* - split: synthetic_train path: data/synthetic_train-* - split: synthetic_test path: data/synthetic_test-* dataset_info: features: - name: id dtype: string - name: source dtype: string - name: group dtype: string - name: system dtype: string - name: prompt dtype: string - name: response_type dtype: string - name: response dtype: string - name: language dtype: string - name: locale dtype: string - name: annotator dtype: string - name: domain dtype: string - name: intent_category dtype: string - name: intent dtype: string - name: sub_intent dtype: string - name: capability dtype: string - name: test_tier dtype: string - name: history dtype: string - name: context dtype: string - name: tools dtype: string - name: difficulty dtype: int32 - name: quality_score dtype: float32 splits: - name: amazon_reviews num_bytes: 30820225 num_examples: 23100 - name: amazon_meta num_bytes: 12521068 num_examples: 5000 - name: asos num_bytes: 4710702 num_examples: 2000 - name: bitext_retail num_bytes: 5297710 num_examples: 4998 - name: bitext_customer num_bytes: 4843246 num_examples: 5000 - name: synthetic_train num_bytes: 9149948 num_examples: 9000 - name: synthetic_test num_bytes: 1209007 num_examples: 1000 download_size: 22413441 dataset_size: 68551906 --- # Ecom Chatbot Fine-Tuning Dataset A unified e-commerce chatbot fine-tuning dataset combining 5 source datasets (40,098 examples total), covering product discovery, order management, customer support, returns, and more. ## Splits | Split | Source | Examples | |---|---|---| | `amazon_meta` | Amazon product metadata | 5,000 | | `amazon_reviews` | Amazon product reviews | 23,100 | | `asos_ecom_dataset` | ASOS fashion e-commerce | 2,000 | | `bitext_customer_support` | Bitext customer support (placeholder-free) | 5,000 | | `bitext_retail_ecom` | Bitext retail e-commerce (placeholder-free) | 4,998 | ## Schema Each entry contains: - `id` — unique identifier - `source` — originating dataset - `group` — train/test group (A/B) - `difficulty` — task difficulty (1–3) - `system` — system prompt for the assistant - `history` — prior conversation turns (JSON string) - `prompt` — user message - `context` — retrieved docs, cart state, order details (JSON string) - `tools` — available function tools (JSON string) - `response_type` — `text` or `tool_call` - `response` — expected assistant response - `language` / `locale` — language metadata - `annotator` — annotation source - `quality_score` — annotation quality (0–1) - `domain` — e-commerce domain - `intent_category` / `intent` / `sub_intent` — intent labels

许可证:Apache-2.0 语言: - 英语 标签: - 电子商务 - 聊天机器人(chatbot) - 客户支持 - 会话式 - 微调(fine-tuning) 配置项: - 配置名称:default 数据文件: - 拆分集:amazon_reviews,路径:data/amazon_reviews-* - 拆分集:amazon_meta,路径:data/amazon_meta-* - 拆分集:asos,路径:data/asos-* - 拆分集:bitext_retail,路径:data/bitext_retail-* - 拆分集:bitext_customer,路径:data/bitext_customer-* - 拆分集:synthetic_train,路径:data/synthetic_train-* - 拆分集:synthetic_test,路径:data/synthetic_test-* 数据集信息: 特征字段: - 名称:id,数据类型:字符串 - 名称:source,数据类型:字符串 - 名称:group,数据类型:字符串 - 名称:system,数据类型:字符串 - 名称:prompt,数据类型:字符串 - 名称:response_type,数据类型:字符串 - 名称:response,数据类型:字符串 - 名称:language,数据类型:字符串 - 名称:locale,数据类型:字符串 - 名称:annotator,数据类型:字符串 - 名称:domain,数据类型:字符串 - 名称:intent_category,数据类型:字符串 - 名称:intent,数据类型:字符串 - 名称:sub_intent,数据类型:字符串 - 名称:capability,数据类型:字符串 - 名称:test_tier,数据类型:字符串 - 名称:history,数据类型:字符串 - 名称:context,数据类型:字符串 - 名称:tools,数据类型:字符串 - 名称:difficulty,数据类型:32位整型 - 名称:quality_score,数据类型:32位浮点型 拆分集详情: - 拆分集名称:amazon_reviews,字节数:30820225,样本数:23100 - 拆分集名称:amazon_meta,字节数:12521068,样本数:5000 - 拆分集名称:asos,字节数:4710702,样本数:2000 - 拆分集名称:bitext_retail,字节数:5297710,样本数:4998 - 拆分集名称:bitext_customer,字节数:4843246,样本数:5000 - 拆分集名称:synthetic_train,字节数:9149948,样本数:9000 - 拆分集名称:synthetic_test,字节数:1209007,样本数:1000 下载大小:22413441,数据集总大小:68551906 # 电子商务聊天机器人(chatbot)微调(fine-tuning)数据集 本数据集为统一的电商聊天机器人微调数据集,整合了5个源数据集,总计40098条样本,覆盖商品发现、订单管理、客户支持、退换货等多个业务场景。 ## 拆分集详情 | 拆分集名称 | 数据源 | 样本数量 | |---|---|---| | `amazon_meta` | 亚马逊商品元数据 | 5,000 | | `amazon_reviews` | 亚马逊商品评论 | 23,100 | | `asos_ecom_dataset` | ASOS时尚电商 | 2,000 | | `bitext_customer_support` | Bitext客户支持(无占位符) | 5,000 | | `bitext_retail_ecom` | Bitext零售电商(无占位符) | 4,998 | ## 数据结构规范 每条样本包含以下字段: - `id`:唯一标识符 - `source`:所属源数据集 - `group`:训练/测试分组(A/B) - `difficulty`:任务难度等级(1~3级) - `system`:助手系统提示词 - `history`:历史对话轮次(JSON字符串格式) - `prompt`:用户输入消息 - `context`:检索文档、购物车状态、订单详情(JSON字符串格式) - `tools`:可用函数工具(JSON字符串格式) - `response_type`:响应类型,可选`text`或`tool_call` - `response`:预期助手响应内容 - `language` / `locale`:语言与地区元数据 - `annotator`:标注来源 - `quality_score`:标注质量得分(0~1区间) - `domain`:电商业务领域 - `intent_category` / `intent` / `sub_intent`:层级化意图标签
提供机构:
V1rtucious
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作