
cngchis/Support-Ticket-Router-12K-Cleaned

Hugging Face | Updated 2026-04-20 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/cngchis/Support-Ticket-Router-12K-Cleaned
Resource description:
---
license: apache-2.0
task_categories:
- text-classification
- text-generation
language:
- en
tags:
- synthetic
- customer-support
- saas
- intent-classification
pretty_name: Support Ticket Router (12K Cleaned)
size_categories:
- 10K<n<100K
---

# 🔥 Support-Ticket-Router-12K-Cleaned

This dataset is a cleaned and structured collection of realistic customer support messages, designed for intent classification and routing tasks in SaaS / IT support systems.

It is intended for training and evaluating LLM-based or classical NLP intent classifiers for automated customer support ticket routing.

---

## 🧪 Data Source

This dataset is **synthetically generated using GPT-4-class models (GPT-4 / GPT-4o-style prompting)** with additional rule-based filtering and normalization.

The data was created to simulate realistic customer support scenarios in SaaS / IT service environments, including:

- API integration issues
- Billing and payment problems
- Subscription cancellation requests
- Technical bugs and system failures
- General customer complaints
- Plan upgrade/downgrade requests

---

### 🧠 Generation Process

- Prompts were designed to mimic real customer support tickets
- Multiple variations of each intent were generated for diversity
- Responses were normalized into 6 fixed intent classes
- Noisy and ambiguous samples were filtered out
- The final dataset was cleaned both manually and programmatically for consistency

---

### ⚠️ Notes

- This dataset is **synthetic and not collected from real users**
- It is intended for **research, benchmarking, and model training only**
- No real customer or personal data is included

---

## 🧠 Summary

- Total records: 12,000+
- Task: Single-label Intent Classification
- Domain: Customer Support AI (SaaS / IT Services)
- Labels:
  - api
  - billing
  - cancellation
  - complaint
  - technical
  - upgrade

---

## 📌 Dataset Structure

Each example follows a unified format:

```json
{
  "text": "user message",
  "label": "intent label",
  "meta": {
    "source": "synthetic + real-world inspired",
    "domain": "SaaS customer support"
  }
}
```

Or instruction-style format (for LLM fine-tuning):

```json
{
  "input": "Classify the customer support message into one of the following intents: api, billing, cancellation, complaint, technical, upgrade.\n\nMessage: I want to cancel my subscription.",
  "output": "cancellation"
}
```

---

## 🧹 Cleaning Process

This dataset has been processed to improve label quality and consistency:

- Removed ambiguous or noisy samples
- Normalized intent categories into 6 standard labels
- Filtered duplicate or near-duplicate entries
- Standardized user message formatting
- Balanced the distribution across intent classes

---

## 📊 Dataset Statistics

| Label        | Description                             |
|--------------|-----------------------------------------|
| api          | API usage, integration, endpoint issues |
| billing      | Payment, invoice, pricing issues        |
| cancellation | Stop subscription, churn intent         |
| complaint    | Dissatisfaction without clear category  |
| technical    | Bugs, errors, system issues             |
| upgrade      | Plan change, feature upgrade            |

---

## ⚙️ Usage

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub and select its splits.
dataset = load_dataset("cngchis/Support-Ticket-Router-12K-Cleaned")
train = dataset["train"]
test = dataset["test"]
```

---

## 🚀 Use Cases

This dataset can be used for:

- Intent classification models
- LLM fine-tuning (SFT / instruction tuning)
- Customer support automation systems
- Ticket routing systems in SaaS platforms
- Benchmarking lightweight LLMs (GGUF / 4-bit models)

---

## 📚 Model Compatibility

Works well with:

- BERT / RoBERTa / DeBERTa
- LLaMA / Mistral / Phi models
- GGUF (llama.cpp)
- Instruction-tuned LLMs

---

## 🧾 Citation

If you use this dataset, please cite:

```bibtex
@misc{support_ticket_router_12k,
  title={Support-Ticket-Router-12K-Cleaned},
  author={cngchis},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/cngchis/Support-Ticket-Router-12K-Cleaned}
}
```
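
---

The two record formats shown under Dataset Structure differ only in how the message and label are wrapped, so one can be derived from the other. The following is a minimal sketch of that conversion, assuming each record carries string `text` and `label` fields as in the first example. The helper name `to_instruction` and the constants `LABELS` and `PROMPT_TEMPLATE` are hypothetical and not part of the dataset or of any library; the prompt wording is copied from the instruction-style example in the card.

```python
from typing import Dict

# The six intent labels listed in the dataset card.
LABELS = ("api", "billing", "cancellation", "complaint", "technical", "upgrade")

# Prompt wording copied from the instruction-style example in the card.
PROMPT_TEMPLATE = (
    "Classify the customer support message into one of the following intents: "
    + ", ".join(LABELS)
    + ".\n\nMessage: {text}"
)


def to_instruction(record: Dict[str, str]) -> Dict[str, str]:
    """Convert a {"text", "label"} record into an {"input", "output"} record.

    Hypothetical helper for illustration only.
    """
    label = record["label"]
    if label not in LABELS:
        # Reject anything outside the six documented classes.
        raise ValueError(f"unexpected label: {label!r}")
    return {"input": PROMPT_TEMPLATE.format(text=record["text"]), "output": label}


example = to_instruction(
    {"text": "I want to cancel my subscription.", "label": "cancellation"}
)
print(example["input"])
print(example["output"])
```

If a split loaded with `load_dataset` exposes `text` and `label` as string columns (an assumption; the hosted schema may differ), the same function could be passed to `Dataset.map` to add `input` and `output` columns for instruction tuning.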
Provided by:
cngchis