
atr0p05/aegis-training-v2

Source: Hugging Face. Updated 2025-12-08; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/atr0p05/aegis-training-v2
Description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- revops
- sales
- revenue-operations
- llm-training
- aegis
pretty_name: AEGIS RevOps Training Dataset v2
size_categories:
- 10K<n<100K
---

# AEGIS RevOps Training Dataset v2

High-quality training data for the AEGIS 3-tier Revenue Operations AI assistant.

## Dataset Description

This dataset contains training examples for three specialized models:

### Router (0.5B Model)

- **Purpose**: Intent classification for routing queries
- **Intents**: voice_simple, crm_lookup, complex_analysis, action_confirm, fallback
- **Examples**: ~2,600

### Voice (7B Model)

- **Purpose**: Quick, conversational responses
- **Focus**: Concise RevOps answers for voice/chat
- **Examples**: ~4,700

### Main (72B Model)

- **Purpose**: Complex analysis with chain-of-thought reasoning
- **Features**: All responses include `<think>...</think>` blocks
- **Examples**: ~3,700

## v2 Improvements

- ✅ Removed near-duplicate examples (44% reduction)
- ✅ Added multi-turn conversation examples
- ✅ Expanded churn/retention topic coverage
- ✅ 100% of Main examples have valid `<think>` blocks
- ✅ Clean, normalized router labels

## Usage

```python
from datasets import load_dataset

# Load router training data
router = load_dataset("atr0p05/aegis-training-v2", data_dir="router")

# Load main model training data
main = load_dataset("atr0p05/aegis-training-v2", data_dir="main")

# Load voice model training data
voice = load_dataset("atr0p05/aegis-training-v2", data_dir="voice")
```

## Format

All examples are in chat format:

```json
{
  "messages": [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ]
}
```

## Topics Covered

- Pipeline analysis and forecasting
- Commission calculations
- Win/loss analysis
- Quota planning and territory management
- Churn and retention analysis
- Deal health scoring
- Sales metrics and KPIs
- Multi-turn conversations
- Safety/edge cases

## License

Apache 2.0
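
The chat format and the `<think>...</think>` blocks described in the card can be sanity-checked before training. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `is_valid_chat` and `split_think` are hypothetical, the sample record is invented for illustration, and the sketch assumes that the reasoning block appears inline in the assistant message content.

```python
import re

# Matches the first <think>...</think> block, including multi-line reasoning.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
VALID_ROLES = {"system", "user", "assistant"}


def is_valid_chat(example):
    """Return True if `example` follows the {"messages": [...]} chat format."""
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    for message in messages:
        if not isinstance(message, dict):
            return False
        if message.get("role") not in VALID_ROLES:
            return False
        if not isinstance(message.get("content"), str):
            return False
    # Require at least one assistant turn to train on.
    return any(m["role"] == "assistant" for m in messages)


def split_think(content):
    """Split assistant content into (reasoning, answer).

    Returns (None, content) when no <think> block is present.
    """
    match = THINK_RE.search(content)
    if match is None:
        return None, content.strip()
    reasoning = match.group(1).strip()
    answer = (content[:match.start()] + content[match.end():]).strip()
    return reasoning, answer


# Hypothetical record, invented for illustration; not taken from the dataset.
sample = {
    "messages": [
        {"role": "system", "content": "You are a RevOps assistant."},
        {"role": "user", "content": "Is my pipeline coverage healthy?"},
        {
            "role": "assistant",
            "content": "<think>Pipeline is 3x quota.</think>Coverage looks healthy.",
        },
    ]
}

reasoning, answer = split_think(sample["messages"][-1]["content"])
```

The same helpers could be applied to each record returned by `load_dataset`, for example to confirm that every Main example carries a non-empty reasoning block.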
Provided by:
atr0p05