atr0p05/aegis-training-v2.1
Source: Hugging Face | Updated: 2025-12-08 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/atr0p05/aegis-training-v2.1
Resource description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- revops
- sales
- revenue-operations
- llm-training
- aegis
pretty_name: AEGIS RevOps Training Dataset v2.1
size_categories:
- 1K<n<10K
---
# AEGIS RevOps Training Dataset v2.1
High-quality, domain-focused training data for the AEGIS 3-tier Revenue Operations AI assistant.
## v2.1 Improvements (over v2)
- ✅ **91% RevOps-relevant** (Main model) - up from 35%
- ✅ **Removed domain dilution** - filtered generic math/reasoning
- ✅ **Added router hard negatives** - 138 calibration examples
- ✅ **Added RevOps anchors** - 40+ true-to-AEGIS examples
- ✅ **Clean Voice model** - No `<think>` blocks in voice data
## Dataset Description
### Router (0.5B Model)
- **Purpose**: Intent classification for routing queries
- **Intents**: voice_simple, crm_lookup, complex_analysis, action_confirm, fallback
- **Features**: Hard negatives for calibration
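The five intent labels above form a closed set, so a router's raw output can be checked against them before dispatch. The following is a minimal sketch and not part of the dataset: the `normalize_intent` helper and the choice to map unrecognized labels to `fallback` are assumptions made for illustration.

```python
# Intent labels listed in this dataset card for the Router model.
ROUTER_INTENTS = (
    "voice_simple",
    "crm_lookup",
    "complex_analysis",
    "action_confirm",
    "fallback",
)


def normalize_intent(raw_label: str) -> str:
    """Map a raw router output to a known intent label.

    Hypothetical helper: treating unknown labels as "fallback" is an
    assumption, not documented behavior of the AEGIS router.
    """
    label = raw_label.strip().lower()
    return label if label in ROUTER_INTENTS else "fallback"
```

For example, `normalize_intent(" CRM_Lookup ")` yields `crm_lookup`, while an unexpected string such as `"unknown"` is routed to `fallback`.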
### Voice (7B Model)
- **Purpose**: Quick, conversational responses
- **Focus**: Concise RevOps answers for voice/chat
- **Features**: No chain-of-thought (clean output)
### Main (72B Model)
- **Purpose**: Complex analysis with chain-of-thought reasoning
- **Features**: All responses include `<think>...</think>` blocks
- **Note**: Strip the `<think>...</think>` block at inference time so users see only the final response
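The Voice and Main subsets differ in whether responses carry a `<think>` block, which can be checked mechanically before training. Below is a minimal sketch, assuming each response is available as a plain string. The dataset's actual column names are not stated in this card, so the hypothetical helpers take strings rather than records.

```python
import re

# Matches one complete <think>...</think> block, including multi-line reasoning.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def count_think_blocks(response: str) -> int:
    """Return the number of complete <think>...</think> blocks in a response."""
    return len(THINK_BLOCK.findall(response))


def is_valid_main_response(response: str) -> bool:
    """Main responses are expected to contain a reasoning block.

    Requiring exactly one block is an assumption made for this sketch.
    """
    return count_think_blocks(response) == 1


def is_valid_voice_response(response: str) -> bool:
    """Voice responses should carry no <think> tags at all."""
    return "<think>" not in response and "</think>" not in response
```

Running these checks over each subset after loading is one way to confirm that no reasoning markup has leaked into the Voice data.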
## Usage
```python
from datasets import load_dataset
# Load router training data
router = load_dataset("atr0p05/aegis-training-v2.1", data_dir="router")
# Load main model training data
main = load_dataset("atr0p05/aegis-training-v2.1", data_dir="main")
# Load voice model training data
voice = load_dataset("atr0p05/aegis-training-v2.1", data_dir="voice")
```
## `<think>` Block Policy
The Main model uses `<think>...</think>` blocks for chain-of-thought reasoning:
```
<think>
[Internal reasoning here]
</think>
[User-facing response here]
```
**Recommended inference approach:**
- Train with `<think>` blocks (teaches reasoning)
- Strip `<think>...</think>` at serving time
- User sees only the response after `</think>`
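Stripping at serving time can be done with a small post-processing step. The sketch below is one possible approach using a regular expression. The `strip_think` name and the sample generation are illustrative assumptions, not part of the dataset or of any AEGIS serving code.

```python
import re

# Non-greedy match so each <think>...</think> block is removed separately;
# DOTALL lets "." span the newlines inside the reasoning.
_THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(generated: str) -> str:
    """Return only the user-facing text of a Main model generation."""
    return _THINK_RE.sub("", generated).strip()


# Made-up generation, used only to illustrate the format.
raw = "<think>\nQuota is 500k, closed is 350k.\n</think>\nYou are at 70% of quota."
print(strip_think(raw))  # prints: You are at 70% of quota.
```

Note that this pattern only removes complete blocks. A generation truncated before `</think>` is left unchanged, so a production pipeline would likely need to handle that case separately.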
## Topics Covered
- Pipeline analysis and forecasting
- Commission calculations
- Win/loss analysis
- Quota planning and territory management
- Churn and retention analysis
- Deal health scoring
- Sales metrics and KPIs
- CAC/LTV/NRR calculations
- Ramp time optimization
## License
Apache 2.0
Provider:
atr0p05



