
V1rtucious/Ecom-Chatbot-Test-Set

Source: Hugging Face · Updated 2026-03-18 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/V1rtucious/Ecom-Chatbot-Test-Set
Description:
---
license: mit
language:
- en
tags:
- e-commerce
- chatbot
- evaluation
- synthetic
- tool-calling
- rag
- escalation
size_categories:
- 1K<n<10K
task_categories:
- text-generation
---

# Ecom Chatbot Synthetic Test Set

A 2,000-sample fully synthetic test set for evaluating e-commerce chatbot models fine-tuned on [rescommons/Ecom-Chatbot-Finetuning-Dataset](https://huggingface.co/datasets/rescommons/Ecom-Chatbot-Finetuning-Dataset). Designed for zero-contamination evaluation: all products, orders, customer names, and responses are synthetically generated and do not overlap with the training data.

## Dataset Summary

| Split | Samples |
|-------|---------|
| test | 2,000 |

## Group Distribution

| Group | Count | Description |
|-------|-------|-------------|
| A | 667 | Tool-calling: order management (status, cancel, return, exchange, address, refund, reorder) |
| B | 667 | RAG / Product info: product QA, similarity search, bundle suggestions, cross-sell, review QA |
| C | 666 | Escalation / Edge cases: complaints, escalations, policy exceptions, repeat issues, edge cases |

## Schema

Matches the original training dataset schema exactly:

| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique sample ID (`ecomm_XXXXXXXX`) |
| `source` | string | Always `synthetic_v1` |
| `group` | string | A, B, or C |
| `difficulty` | int | Always 2 |
| `system` | string | Aria system prompt |
| `history` | list | Conversation history (empty for all samples) |
| `prompt` | string | Customer message |
| `context` | string | Retrieved context (product catalog / order data JSON) |
| `tools` | string | JSON tool schemas (Group A only) |
| `response_type` | string | `text`, `tool_call`, or `mixed` |
| `response` | string | Aria's response |
| `language` | string | Always `en` |
| `locale` | string | Always `en-US` |
| `annotator` | string | Always `synthetic_v1` |
| `quality_score` | float | Always 0.91 |
| `domain` | string | Product domain (office, baby, automotive, home, grocery, pets, sports_outdoors, electronics, fashion, general) |
| `intent_category` | string | `order_management`, `product_discovery`, or `escalation` |
| `intent` | string | Specific intent |
| `sub_intent` | string | Sub-intent classification |

## Contamination Prevention

- All product names, brands, and IDs are fully synthetic (NovaTech, AuraSound, TrailBlazer, etc.)
- Customer names are drawn from a diverse synthetic name pool
- Order IDs and item IDs are randomly generated
- Response phrasing and structure are distinct from the training data sources
- Domains match the original dataset, but product entries are novel
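The schema above fixes several fields to constant values and restricts others to small sets, so a loaded copy can be sanity-checked against the card. The following is a minimal sketch of such a check, not part of the dataset itself: the helper names `validate_record` and `group_counts` are hypothetical, the constraints are copied from the tables above, and the float tolerance and the handling of an empty `tools` field are assumptions, since the card does not say how `tools` is stored for Groups B and C.

```python
import json
import math
from collections import Counter

# Constraints copied from the Schema table; the checking logic is illustrative.
EXPECTED_CONSTANTS = {
    "source": "synthetic_v1",
    "difficulty": 2,
    "language": "en",
    "locale": "en-US",
    "annotator": "synthetic_v1",
}
ALLOWED = {
    "group": {"A", "B", "C"},
    "response_type": {"text", "tool_call", "mixed"},
    "intent_category": {"order_management", "product_discovery", "escalation"},
    "domain": {
        "office", "baby", "automotive", "home", "grocery", "pets",
        "sports_outdoors", "electronics", "fashion", "general",
    },
}
# Counts from the Group Distribution table, for comparison with group_counts().
EXPECTED_GROUP_COUNTS = {"A": 667, "B": 667, "C": 666}


def validate_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, expected in EXPECTED_CONSTANTS.items():
        if record.get(field) != expected:
            problems.append(f"{field}: expected {expected!r}, got {record.get(field)!r}")
    for field, allowed in ALLOWED.items():
        if record.get(field) not in allowed:
            problems.append(f"{field}: unexpected value {record.get(field)!r}")
    if not str(record.get("id", "")).startswith("ecomm_"):
        problems.append("id: missing 'ecomm_' prefix")
    if record.get("history"):
        problems.append("history: expected an empty list")
    # Assumption: tolerate float32 rounding of the documented 0.91.
    score = record.get("quality_score")
    if not isinstance(score, (int, float)) or not math.isclose(score, 0.91, abs_tol=1e-6):
        problems.append(f"quality_score: expected 0.91, got {score!r}")
    # Assumption: only a non-empty `tools` value is required to parse as JSON.
    tools = record.get("tools")
    if tools:
        try:
            json.loads(tools)
        except (TypeError, ValueError):
            problems.append("tools: not valid JSON")
    return problems


def group_counts(records):
    """Count records per group as a plain dict, e.g. {'A': 667, ...}."""
    return dict(Counter(r["group"] for r in records))
```

With the Hugging Face `datasets` library installed, the records could be obtained with `load_dataset("V1rtucious/Ecom-Chatbot-Test-Set", split="test")` and passed to these helpers; `group_counts` on the full split can then be compared with `EXPECTED_GROUP_COUNTS`.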
Provider:
V1rtucious