MARA: A Malware Analysis Reasoning Agent for Interpretable Android Malware Detection

Source: Figshare | Updated: 2025-05-25 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/FAMDA_Fusion-based_Android_Malware_Detection_Agent_with_LLM_Support/29146082
Description:
MARA is a next-generation Android malware detection framework that transforms fragmented static and behavioral signals into coherent, human-understandable malicious behavior chains. Unlike traditional black-box detectors or feature-centric learning models, MARA treats malware detection as a behavior-centric reasoning problem, powered by structured perception and multi-stage LLM reasoning.

MARA introduces a unified perception–reasoning–action pipeline that enables transparent, explainable, and semantically grounded Android malware analysis, offering both high detection accuracy and strong interpretability.

🔍 Key Features

1. Behavior-Centric Evidence Structuring (BCES)

MARA reorganizes heterogeneous Android artifacts (permissions, API calls, components, ICC flows, and lightweight runtime signals) into a structured, behavior-oriented evidence space. This design eliminates semantic fragmentation and exposes hidden relationships across signals such as:

- sensitive permission + data-access API
- exported component + privilege operation
- background tasks + network exfiltration

BCES builds the foundation for coherent, chain-based reasoning.

2. Multi-Stage Behavior Reasoning (BCMR)

Instead of producing a single-pass prediction, MARA performs progressive reasoning using an LLM:

- Stage 1 (Initial Inspection): Identify suspicious behaviors at the evidence-block level.
- Stage 2 (Context Enrichment): Infer missing or implicit cross-block relationships.
- Stage 3 (Behavior-Chain Construction): Reconstruct the complete malicious behavior chain and make the final decision.

This staged reasoning design enforces explicit, causal, and verifiable analysis, far more transparent than standard chain-of-thought (CoT) or one-shot LLM inference.

3. Explanation-Based Detection

MARA outputs both:

- a final malware/benign decision, and
- a behavior-grounded explanation that mirrors its actual reasoning trajectory

This ensures high interpretability and eliminates the problem of post-hoc "fabricated explanations" common in LLM detectors.

📊 Performance Highlights

Across benchmark datasets (Drebin, AMD, CICMalDroid), MARA delivers:

- 97.3% accuracy on Drebin
- 96.4% accuracy on AMD
- 94.7% accuracy on CICMalDroid
- Highest explanation quality across clarity, semantic relevance, justification, and behavior-chain fidelity
- Strong robustness under obfuscation (renaming, packing, encryption)

MARA consistently outperforms traditional static detectors, deep learning fusion models, and recent LLM-based malware analysis frameworks.

🛡️ Robustness to Obfuscation

MARA's behavior-centric design allows it to remain stable under:

- symbol renaming
- string/code encryption
- DEX packing
- NOP insertion

Accuracy degradation is 2.5–4.0%, significantly lower than existing baselines (5–10%).
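
To make the pipeline described above more concrete, here is a minimal Python sketch of how behavior-oriented evidence blocks and a three-stage reasoning pass could fit together. It is an illustration under stated assumptions, not MARA's implementation: the names `PATTERNS`, `EvidenceBlock`, `structure_evidence`, `analyze`, and `toy_reasoner` are hypothetical, the pairing rules are copied only from the three signal combinations listed in the description, and the `reason` callback is a rule-based stand-in for the LLM calls, with an arbitrary decision threshold.

```python
from dataclasses import dataclass

# Hypothetical pairing rules, taken from the three example signal
# combinations in the description: (behavior name, signal type A, signal type B).
PATTERNS = [
    ("sensitive_data_access", "sensitive_permission", "data_access_api"),
    ("exposed_privilege_use", "exported_component", "privilege_operation"),
    ("background_exfiltration", "background_task", "network_exfiltration"),
]


@dataclass
class EvidenceBlock:
    behavior: str
    signals: list  # list of (signal_type, value) pairs


def structure_evidence(signals):
    """Group raw (signal_type, value) pairs into behavior-oriented blocks."""
    by_type = {}
    for sig_type, value in signals:
        by_type.setdefault(sig_type, []).append(value)
    blocks = []
    for behavior, a, b in PATTERNS:
        # A block is emitted only when both halves of a pattern are present.
        if a in by_type and b in by_type:
            members = [(a, v) for v in by_type[a]] + [(b, v) for v in by_type[b]]
            blocks.append(EvidenceBlock(behavior, members))
    return blocks


def analyze(signals, reason):
    """Run three stages in order; reason(stage, payload) stands in for an LLM call."""
    blocks = structure_evidence(signals)
    suspicious = reason("initial_inspection", blocks)
    links = reason("context_enrichment", suspicious)
    return reason("chain_construction", {"blocks": suspicious, "links": links})


def toy_reasoner(stage, payload):
    """Rule-based placeholder for the LLM (assumption, for illustration only)."""
    if stage == "initial_inspection":
        return payload  # treat every matched block as suspicious
    if stage == "context_enrichment":
        names = [b.behavior for b in payload]
        return list(zip(names, names[1:]))  # link consecutive blocks
    chain = [b.behavior for b in payload["blocks"]]
    # Arbitrary threshold: two or more linked behaviors count as malware.
    label = "malware" if len(chain) >= 2 else "benign"
    return {"label": label, "chain": chain, "links": payload["links"]}


if __name__ == "__main__":
    sample = [
        ("sensitive_permission", "READ_CONTACTS"),
        ("data_access_api", "ContentResolver.query"),
        ("background_task", "JobService"),
        ("network_exfiltration", "HttpURLConnection.connect"),
    ]
    print(analyze(sample, toy_reasoner))
```

Because the final decision is returned together with the chain and the links that produced it, the explanation in this sketch is a by-product of the reasoning steps rather than something generated afterwards, which is the property the description attributes to explanation-based detection.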
Created:
2025-05-25