MARA: A Malware Analysis Reasoning Agent for Interpretable Android Malware Detection

Figshare · Updated 2025-05-25 · Indexed 2026-04-28

Download link:

https://figshare.com/articles/dataset/FAMDA_Fusion-based_Android_Malware_Detection_Agent_with_LLM_Support/29146082

Resource description:

MARA is a next-generation Android malware detection framework that transforms fragmented static and behavioral signals into coherent, human-understandable malicious behavior chains. Unlike traditional black-box detectors or feature-centric learning models, MARA treats malware detection as a behavior-centric reasoning problem, powered by structured perception and multi-stage LLM reasoning.

MARA introduces a unified perception–reasoning–action pipeline that enables transparent, explainable, and semantically grounded Android malware analysis, offering both high detection accuracy and strong interpretability.

🔍 Key Features

1. Behavior-Centric Evidence Structuring (BCES)

MARA reorganizes heterogeneous Android artifacts (permissions, API calls, components, ICC flows, and lightweight runtime signals) into a structured, behavior-oriented evidence space. This design eliminates semantic fragmentation and exposes hidden relationships across signals such as:

- sensitive permission + data-access API
- exported component + privilege operation
- background tasks + network exfiltration

BCES builds the foundation for coherent, chain-based reasoning.

2. Multi-Stage Behavior Reasoning (BCMR)

Instead of producing a single-pass prediction, MARA performs progressive reasoning using an LLM:

- Stage 1: Initial Inspection. Identify suspicious behaviors at the evidence-block level.
- Stage 2: Context Enrichment. Infer missing or implicit cross-block relationships.
- Stage 3: Behavior-Chain Construction. Reconstruct the complete malicious behavior chain and make the final decision.

This staged reasoning design enforces explicit, causal, and verifiable analysis, and is far more transparent than standard CoT or one-shot LLM inference.

3. Explanation-Based Detection

MARA outputs both:

- a final malware/benign decision, and
- a behavior-grounded explanation that mirrors its actual reasoning trajectory

This ensures high interpretability and eliminates the problem of post-hoc "fabricated explanations" common in LLM detectors.

📊 Performance Highlights

Across benchmark datasets (Drebin, AMD, CICMalDroid), MARA delivers:

- 97.3% accuracy on Drebin
- 96.4% accuracy on AMD
- 94.7% accuracy on CICMalDroid
- Highest explanation quality across clarity, semantic relevance, justification, and behavior-chain fidelity
- Strong robustness under obfuscation (renaming, packing, encryption)

MARA consistently outperforms traditional static detectors, deep learning fusion models, and recent LLM-based malware analysis frameworks.

🛡️ Robustness to Obfuscation

MARA's behavior-centric design allows it to remain stable under:

- symbol renaming
- string/code encryption
- DEX packing
- NOP insertion

Accuracy degradation is 2.5–4.0%, significantly lower than existing baselines (5–10%).
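
The evidence structuring and three-stage reasoning described under Key Features can be illustrated with a minimal Python sketch. This is not MARA's implementation: the `SIGNAL_TO_BEHAVIOR` mapping, the `EvidenceBlock` type, the prompt wording, and the `fake_llm` stand-in are all assumptions made for illustration, and a real system would call an actual LLM and use a far richer evidence schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvidenceBlock:
    behavior: str       # behavior category the signals are grouped under
    signals: List[str]  # raw artifacts (permissions, API calls, components)


# Hypothetical mapping from a raw signal to a behavior category.
SIGNAL_TO_BEHAVIOR: Dict[str, str] = {
    "android.permission.READ_CONTACTS": "data_access",
    "ContentResolver.query": "data_access",
    "HttpURLConnection.connect": "network_exfiltration",
}


def build_evidence_blocks(signals: List[str]) -> List[EvidenceBlock]:
    """Group heterogeneous signals by behavior instead of by artifact type."""
    grouped: Dict[str, List[str]] = {}
    for signal in signals:
        behavior = SIGNAL_TO_BEHAVIOR.get(signal, "uncategorized")
        grouped.setdefault(behavior, []).append(signal)
    return [EvidenceBlock(b, sigs) for b, sigs in grouped.items()]


def run_staged_reasoning(
    blocks: List[EvidenceBlock], llm: Callable[[str], str]
) -> Dict[str, str]:
    """Feed each stage's output into the next and keep the full trace."""
    block_text = "\n".join(
        f"[{b.behavior}] {', '.join(b.signals)}" for b in blocks
    )
    stage1 = llm(
        "Stage 1 (initial inspection): flag suspicious evidence blocks.\n"
        + block_text
    )
    stage2 = llm(
        "Stage 2 (context enrichment): infer cross-block relationships.\n"
        + stage1
    )
    stage3 = llm(
        "Stage 3 (behavior-chain construction): reconstruct the chain "
        "and give a verdict.\n" + stage2
    )
    # Returning every stage keeps the explanation tied to the actual trace.
    return {
        "inspection": stage1,
        "enrichment": stage2,
        "chain_and_verdict": stage3,
    }


def fake_llm(prompt: str) -> str:
    """Placeholder for an LLM call: echoes the first line of the prompt."""
    return "echo: " + prompt.splitlines()[0]


if __name__ == "__main__":
    blocks = build_evidence_blocks([
        "android.permission.READ_CONTACTS",
        "ContentResolver.query",
        "HttpURLConnection.connect",
    ])
    for stage, output in run_staged_reasoning(blocks, fake_llm).items():
        print(stage, "->", output)
```

Returning the per-stage trace rather than only the final verdict reflects the description's point that the explanation should mirror the reasoning trajectory instead of being generated after the fact.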

Created: 2025-05-25
