AgenticOps Engineering: Run Agents Like You Run Critical Enterprise Apps

Name: AgenticOps Engineering: Run Agents Like You Run Critical Enterprise Apps
Creator: Zenodo
Published: 2025-07-28 17:48:48
License: No description available

Zenodo, updated 2025-07-28, indexed 2026-05-26

Download link:

https://zenodo.org/doi/10.5281/zenodo.16541022

Resource description:

This paper introduces AgenticOps, a new operational discipline designed to support the deployment, governance, and lifecycle management of autonomous AI agents in production environments. As large language model (LLM)-driven agents move beyond experimental settings into enterprise-grade systems, traditional approaches from DevOps and MLOps are no longer sufficient. Unlike deterministic software or predictive models, agentic systems reason probabilistically, take autonomous actions, and interact dynamically with tools and users, raising new challenges in safety, reliability, observability, and control. The paper systematically defines the conceptual boundaries of AgenticOps and outlines its six foundational pillars: evaluation, guardrails, observability, security, optimization, and lifecycle management. It presents detailed frameworks for agent testing (e.g., scenario evals, behavioral evals), safety enforcement (e.g., multi-layered guardrails), and infrastructure observability (e.g., token traces, decision replay). Through industry case studies, including hallucinated ICD codes in medical agents and rogue behavior in financial tools, it demonstrates the urgency of robust operationalization for real-world deployments. AgenticOps is positioned as the critical next step in AI systems engineering, bridging the gap between prompt-tweaked prototypes and production-grade autonomous systems. The paper concludes with a call to action for open standards, shared behavioral protocols like SCAB, and interdisciplinary governance, arguing that treating agents with the same operational rigor as critical software is essential for AI safety and scalability.

Provider:

Zenodo

Created:

2025-07-28
