
mukunda1729/agent-budget-violations

Source: Hugging Face. Updated 2026-04-27. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/mukunda1729/agent-budget-violations
Resource description:
---
license: mit
language:
- en
tags:
- agents
- llm
- observability
- cost
- testing
- budgets
size_categories:
- n<1K
configs:
- config_name: default
  data_files:
  - split: train
    path: data.jsonl
---

# agent-budget-violations

15 synthetic agent runs annotated with their **budget** (cost / tool-call / wall-time caps), **actual usage**, **violation types**, and a one-line **root cause + fix**.

Built as fixtures for budget-enforcement tests, alerting heuristics, and observability dashboards. 5 of the 15 are clean (no violations) so you can test the "no false positive" path.

## Violation breakdown

| Violation type | Count |
|---|---|
| `cost` | 4 |
| `tool_calls` | 6 |
| `wall_time` | 4 |
| **None (clean)** | 5 |

(Some runs violate multiple budgets, so totals don't sum.)

## Schema

```jsonc
{
  "id": "string",
  "agent": "string",
  "budget": { "max_tool_calls": 10, "max_cost_usd": 1.00, "max_wall_seconds": 60 },
  "actual": { "tool_calls": 47, "cost_usd": 4.32, "wall_seconds": 312 },
  "violation_types": ["tool_calls", "cost", "wall_time"],
  "root_cause": "string | null",
  "fix": "string | null"
}
```

## Common root causes covered

- Infinite loops on tool errors
- Slow third-party APIs
- Model fallback to expensive tier
- Off-by-one budget checks
- Recursive task misinterpretation
- LLM provider rate limits
- Clarifying-question loops
- Pagination explosion

## Quickstart

```python
from datasets import load_dataset

ds = load_dataset("mukunda1729/agent-budget-violations", split="train")
multi_violators = [r for r in ds if len(r["violation_types"]) >= 2]
print(f"{len(multi_violators)} multi-budget violations")
```

## Related

- [The Agent Reliability Stack](https://mukundakatta.github.io/agent-stack/)

## License

MIT.
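## Example: recomputing `violation_types`

Since the card describes these records as fixtures for budget-enforcement tests, a natural first use is to recompute the violation labels from `budget` and `actual` and compare them with the annotated `violation_types`. The sketch below is one possible way to do that, not the dataset author's method. The helper name `detect_violations`, the `CHECKS` mapping, and the rule that a violation means usage strictly greater than the cap are all assumptions; the card does not state whether hitting a cap exactly counts as a violation.

```python
from typing import Dict, List, Tuple

# Assumed mapping: violation label -> (key in "budget", key in "actual").
# Key names come from the schema; the pairing itself is an assumption.
CHECKS: Dict[str, Tuple[str, str]] = {
    "tool_calls": ("max_tool_calls", "tool_calls"),
    "cost": ("max_cost_usd", "cost_usd"),
    "wall_time": ("max_wall_seconds", "wall_seconds"),
}


def detect_violations(budget: Dict[str, float], actual: Dict[str, float]) -> List[str]:
    """Return the labels whose actual usage exceeds the budget cap.

    Assumption: a run violates a budget only when usage is strictly
    greater than the cap (usage equal to the cap is treated as clean).
    """
    return [
        label
        for label, (cap_key, used_key) in CHECKS.items()
        if actual[used_key] > budget[cap_key]
    ]


# The example values shown in the schema section.
record = {
    "budget": {"max_tool_calls": 10, "max_cost_usd": 1.00, "max_wall_seconds": 60},
    "actual": {"tool_calls": 47, "cost_usd": 4.32, "wall_seconds": 312},
}

print(detect_violations(record["budget"], record["actual"]))
# -> ['tool_calls', 'cost', 'wall_time']
```

To check a whole split, compare `sorted(detect_violations(r["budget"], r["actual"]))` against `sorted(r["violation_types"])` for each record. Any mismatch may point to a different threshold convention (for example `>=` instead of `>`) rather than to an annotation error.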
Provider:
mukunda1729