
atr0p05/aegis-training-v2.1

Source: Hugging Face · Updated: 2025-12-08 · Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/atr0p05/aegis-training-v2.1
Description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- revops
- sales
- revenue-operations
- llm-training
- aegis
pretty_name: AEGIS RevOps Training Dataset v2.1
size_categories:
- 1K<n<10K
---

# AEGIS RevOps Training Dataset v2.1

High-quality, domain-focused training data for the AEGIS 3-tier Revenue Operations AI assistant.

## v2.1 Improvements (over v2)

- ✅ **91% RevOps-relevant** (Main model) - up from 35%
- ✅ **Removed domain dilution** - filtered generic math/reasoning
- ✅ **Added router hard negatives** - 138 calibration examples
- ✅ **Added RevOps anchors** - 40+ true-to-AEGIS examples
- ✅ **Clean Voice model** - No `<think>` blocks in voice data

## Dataset Description

### Router (0.5B Model)

- **Purpose**: Intent classification for routing queries
- **Intents**: voice_simple, crm_lookup, complex_analysis, action_confirm, fallback
- **Features**: Hard negatives for calibration

### Voice (7B Model)

- **Purpose**: Quick, conversational responses
- **Focus**: Concise RevOps answers for voice/chat
- **Features**: No chain-of-thought (clean output)

### Main (72B Model)

- **Purpose**: Complex analysis with chain-of-thought reasoning
- **Features**: All responses include `<think>...</think>` blocks
- **Note**: Strip `<think>` at inference for clean user output

## Usage

```python
from datasets import load_dataset

# Load router training data
router = load_dataset("atr0p05/aegis-training-v2.1", data_dir="router")

# Load main model training data
main = load_dataset("atr0p05/aegis-training-v2.1", data_dir="main")

# Load voice model training data
voice = load_dataset("atr0p05/aegis-training-v2.1", data_dir="voice")
```

## `<think>` Block Policy

The Main model uses `<think>...</think>` blocks for chain-of-thought reasoning:

```
<think>
[Internal reasoning here]
</think>
[User-facing response here]
```

**Recommended inference approach:**

- Train with `<think>` blocks (teaches reasoning)
- Strip `<think>...</think>` at serving time
- User sees only the response after `</think>`

## Topics Covered

- Pipeline analysis and forecasting
- Commission calculations
- Win/loss analysis
- Quota planning and territory management
- Churn and retention analysis
- Deal health scoring
- Sales metrics and KPIs
- CAC/LTV/NRR calculations
- Ramp time optimization

## License

Apache 2.0
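
The recommendation to strip `<think>...</think>` at serving time could be implemented along these lines. This is a minimal sketch, not the AEGIS serving code: the helper name `strip_think`, the handling of an unterminated `<think>` block, and the sample completion are all assumptions made for illustration.

```python
import re

# One complete <think>...</think> block plus trailing whitespace.
# Non-greedy and DOTALL so it spans newlines but stops at the first </think>.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

# Assumption: a <think> with no closing tag means generation was cut off,
# so everything from that tag to the end is treated as internal reasoning.
UNCLOSED_THINK = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think(text: str) -> str:
    """Return only the user-facing part of a Main-model completion (hypothetical helper)."""
    cleaned = THINK_BLOCK.sub("", text)
    cleaned = UNCLOSED_THINK.sub("", cleaned)
    return cleaned.strip()


# Made-up completion, used only to show the shape of the input.
raw = "<think>\nPipeline is $2.4M against a $1M quota.\n</think>\nCoverage is 2.4x."
print(strip_think(raw))
```

Dropping an unterminated block errs on the side of never showing reasoning text to the user; a serving layer that prefers to surface truncated output would need a different choice there.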
Provider:
atr0p05