Autonomous adversary: red-teaming in the age of LLM

Name: Autonomous adversary: red-teaming in the age of LLM
Creator: National Research Council Canada
Published: 2026-04-28 17:42:38
License: No description available

DataCite Commons | Updated 2026-04-28 | Indexed 2026-05-04

Download link:

https://nrc-digital-repository.canada.ca/eng/view/object/?id=bbd70616-f648-41ac-a380-c96d0663c722

Resource description:

Language Model Agents (LMAs) are emerging as a powerful primitive for augmenting red-team operations. They can support attack planning, adversary emulation, and the orchestration of multi-step activity such as lateral movement, a core enabling capability of advanced persistent threat (APT) campaigns. Using frameworks such as MITRE ATT&CK, we analyze where these agents intersect with core offensive functions and assess the current strengths and limitations of LMAs, with an emphasis on governance and realistic evaluation. We benchmark LMAs across two lateral-movement scenarios in a controlled adversary-emulation environment, where LMAs interact with instrumented cyber agents, observe execution artifacts, and iteratively adapt based on environmental feedback. Each scenario is formalized as an ordered task chain with explicit validation predicates, leveraging an LLM-as-a-Judge paradigm to ensure deterministic outcome verification. We compare three operational modalities: fully autonomous execution, self-scaffolded planning, and expert-defined action plans. Preliminary findings indicate that expert-defined action plans yield higher task-completion rates than the other two modalities. However, failure remains frequent across all modalities and is largely attributable to brittle command invocation, environmental and deployment instability, and recurring errors in credential management and state handling.
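
The idea of an ordered task chain with explicit validation predicates can be illustrated with a minimal sketch. This is not the authors' harness: the `Task` type, the task names, and the keys of the observed-state dictionary are all hypothetical, and plain boolean predicates stand in for the LLM-as-a-Judge verification described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of what the environment reports after a run.
State = Dict[str, bool]


@dataclass
class Task:
    name: str
    validate: Callable[[State], bool]  # explicit validation predicate


def evaluate_chain(chain: List[Task], state: State) -> List[str]:
    """Return the names of tasks validated in order, stopping at the first failure."""
    completed: List[str] = []
    for task in chain:
        if not task.validate(state):
            break  # later tasks depend on earlier ones, so stop here
        completed.append(task.name)
    return completed


def completion_rate(chain: List[Task], state: State) -> float:
    """Fraction of the chain validated before the first failing task."""
    if not chain:
        return 0.0
    return len(evaluate_chain(chain, state)) / len(chain)


# Illustrative chain; names and state keys are assumptions, not from the source.
chain = [
    Task("initial_access", lambda s: s.get("foothold", False)),
    Task("credential_access", lambda s: s.get("credentials", False)),
    Task("lateral_movement", lambda s: s.get("remote_session", False)),
]
observed = {"foothold": True, "credentials": True, "remote_session": False}

print(evaluate_chain(chain, observed))
print(completion_rate(chain, observed))
```

Stopping at the first failed predicate reflects the ordering constraint: a later step is only credited if every earlier step in the chain has been validated.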

Provider:

National Research Council Canada

Created:

2026-04-28
