If you build multi-user AI agents, here’s an uncomfortable truth: your system can fail catastrophically without a single malicious user.
No jailbreak. No prompt injection payload. No evil red-team wizard.
Just normal, polite users doing normal, polite things.
That’s the core warning in a fresh arXiv paper, “No Attacker Needed: Unintentional Cross-User Contamination in Shared-State LLM Agents” (Yang et al., 2026). The authors show that when one agent serves multiple users and reuses shared memory/context, harmless interaction residue can leak across users and silently degrade outputs. In their experiments, this contamination happens a lot—57.4% to 70.7% under raw shared state.
That number should make every AI product lead sit up a little straighter.
In this post, I’ll break down what the paper does, why it matters, how the method works, the key results, where the defense falls short, and what practitioners should do next.
Paper at a glance
Citation: Yang, T., Li, J., Nian, Y., Dong, S., Xu, R., Rossi, R., Ding, K., & Zhao, Y. (2026). No Attacker Needed: Unintentional Cross-User Contamination in Shared-State LLM Agents. arXiv:2604.01350.
Link: https://arxiv.org/abs/2604.01350
What this paper does
The paper introduces a failure mode called Unintentional Cross-User Contamination (UCC).
In plain English:
- User A gives the agent a locally valid convention (interpretation, formatting rule, workflow shortcut, etc.).
- The agent persists that artifact in shared state.
- Later, User B asks a different-but-related question.
- The agent silently reuses A’s local convention for B.
- B gets a wrong or degraded answer.
Crucially, no one is trying to attack the system.
This distinguishes UCC from known adversarial categories like indirect prompt injection or explicit memory poisoning. The source interaction is benign. The bug is in scope management: the system fails to remember where a rule is valid and applies it globally.
The authors formalize this problem, define a controlled evaluation protocol, create a three-part contamination taxonomy, and evaluate both failure prevalence and mitigation behavior across two shared-state agent mechanisms.
Why it matters (and why now)
Most production teams are trying to make agents more stateful, not less.
We want:
- persistent memory,
- cross-session continuity,
- team-level “shared brain” behavior,
- reusable traces and artifacts.
That trend is rational. Statefulness improves UX, reduces repeated user effort, and often cuts token cost.
But this paper shows the hidden bill: shared persistence multiplies failure coupling across users.
For OpenClaw-style systems, enterprise copilots, customer support agents, and internal AI assistants, this is not an edge case. It’s a default architecture risk. If you’re operating one-agent-many-users with any persistent layer, UCC is already your problem—even if you’ve never named it.
The truly dangerous part is not just failure frequency; it’s failure shape. In some settings, contamination produces plausible wrong answers rather than visible crashes. Silent bad output is harder to detect, audit, and recover from.
How the paper frames the problem
Shared-state model
The authors describe a multi-user agent with three core operations:
- Write: after serving a user, the agent stores artifacts in shared state.
- Read: on later requests, the agent retrieves/exposes prior state.
- Act: the model answers using current input + retrieved state.
That abstraction covers two important real-world patterns:
- Shared memory bank (retrieved records/triples/templates)
- Shared persistent context (conversation history reused over time)
The key insight: persistence without scope guarantees is dangerous. A convention that was valid in one task/user context can be misapplied elsewhere.
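The Write/Read/Act loop can be captured in a minimal sketch. This is our illustration, not the paper's code: the `SharedStateAgent` class, the list-backed store, and the recency-based retrieval are all assumptions chosen to make the failure mode visible. Note that the single `store` is shared across every user with no scope boundary, which is exactly the precondition for UCC.

```python
# Minimal sketch of the Write/Read/Act loop for a multi-user agent.
# The single `store` list is shared across ALL users; that sharing,
# with no scope metadata, is what enables cross-user contamination.

class SharedStateAgent:
    def __init__(self, llm):
        self.llm = llm          # callable: prompt -> answer
        self.store = []         # shared state: artifacts from all users

    def read(self, query, k=3):
        # Naive retrieval: most recent k artifacts, regardless of
        # which user or task produced them.
        return self.store[-k:]

    def act(self, user_id, query):
        context = self.read(query)
        prompt = f"Prior artifacts: {context}\nUser query: {query}"
        answer = self.llm(prompt)
        self.write(user_id, query, answer)
        return answer

    def write(self, user_id, query, answer):
        # Persist the interaction. There is no scope or validity tag,
        # so User A's local conventions are visible to User B.
        self.store.append({"user": user_id, "query": query, "answer": answer})
```

With this framing, the bug is obvious: `read` has no way to ask whether an artifact is valid for the current user, so `act` inherits whatever residue `write` left behind.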
Controlled contamination protocol
Rather than hand-wavy anecdotes, they run a controlled comparison:
- Evaluate a victim task in a clean state.
- Evaluate the same task in a state with one additional source interaction.
- Attribute harmful output change to the inserted source artifact.
This clean-vs-contaminated A/B setup is a strong methodological choice. It isolates harmful transfer from generic model variance.
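The protocol reduces to a small harness; here is a sketch under our own naming (`run_task`, `source_artifact`, and the state-as-list convention are assumptions, not the paper's implementation):

```python
# Sketch of the controlled A/B protocol: run the same victim task
# against a clean state and a state seeded with one benign source
# artifact, then attribute any flip in correctness to that artifact.

def contamination_trial(run_task, victim_task, source_artifact, expected):
    clean_ok = run_task(victim_task, state=[]) == expected
    dirty_ok = run_task(victim_task, state=[source_artifact]) == expected
    # A trial counts as contamination only if the task succeeds in a
    # clean state and fails once the benign artifact is present.
    return clean_ok and not dirty_ok
```

Requiring `clean_ok` mirrors the paper's pre-filtering of victim tasks: failures that would happen anyway are excluded, so the measured rate reflects harmful transfer rather than baseline model error.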
Their taxonomy: three ways contamination bites
The paper defines three contamination types:
1) Semantic Contamination (SC)
A persisted convention changes how later prompts are interpreted.
Example pattern: a local meaning override (“recent” means 7 days) gets reused where it doesn’t belong.
2) Transformation Contamination (TC)
A persisted convention changes aggregation/formatting/filtering/transformation logic.
Example pattern: reporting rule shifts from exact numeric value to coarse buckets or labels.
3) Procedural Contamination (PC)
A persisted convention changes action flow or workflow defaults.
Example pattern: one user’s workflow step becomes another user’s implicit pipeline.
These categories are practical because they map directly to engineering controls: interpretation layer, transformation layer, and execution/workflow layer.
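That mapping is concrete enough to encode directly. The enum below is our framing (the paper defines the categories, not this code), but it shows how the taxonomy can become routing logic for incident triage or telemetry:

```python
from enum import Enum

class ContaminationType(Enum):
    SEMANTIC = "SC"        # changes how later prompts are interpreted
    TRANSFORMATION = "TC"  # changes aggregation/formatting/filtering logic
    PROCEDURAL = "PC"      # changes action flow or workflow defaults

# Which engineering control layer owns each contamination class.
CONTROL_LAYER = {
    ContaminationType.SEMANTIC: "interpretation layer",
    ContaminationType.TRANSFORMATION: "transformation layer",
    ContaminationType.PROCEDURAL: "execution/workflow layer",
}
```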
Experimental setup (what they actually tested)
The authors evaluate two representative shared-state mechanisms:
Environment A: EHRAgent-style shared memory
- Code-generating agent for structured queries.
- Shared long-term memory entries can be retrieved across users.
- Evaluated on MIMIC-III and eICU datasets.
Environment B: MURMUR-style shared context
- Multi-user assistant with persistent conversational context.
- Evaluated on Slack workspace domain.
All experiments use GPT-4o as the backbone model.
For each contamination instance:
- They manually craft a benign source convention.
- Pair it with semantically related victim tasks.
- Pre-filter victim tasks to succeed without contamination.
- Run controlled clean-vs-contaminated comparisons over multiple trials.
This is important: they’re not manufacturing chaos with malicious strings. They’re probing realistic benign transfer.
Key results you should remember
1) UCC is common under raw shared state
Across datasets, contamination rates are 57.4% to 70.7%.
That’s not “rare bug” territory. That’s “architectural reliability issue.”
2) Risk profile depends on state mechanism
In EHRAgent-style shared memory:
- SC and TC are highest (59–89% range by dataset/type),
- PC is lower (44–64%).
In Slack shared context:
- SC and PC dominate (67–83%),
- TC is much lower (20%).
Interpretation: what gets reused (structured records vs. conversational history) changes which contamination class spreads most.
3) Write-time sanitization helps—but inconsistently
They test Sanitized Shared Interaction (SSI), a write-time sanitizer that rewrites persisted interactions to remove scope-bound residue.
Overall effects:
- Slack: 57% → 6% (near elimination)
- eICU: 71% → 33% (large improvement, still material risk)
- MIMIC-III: 60% → 41% (modest improvement)
Takeaway: text sanitization can be very strong for pure conversational shared context, but is weaker when executable artifacts are retained.
4) Residual risk is highest where code artifacts survive
In EHRAgent-style settings, SSI sanitizes textual traces but does not fully sanitize solution code inside retrievable memory entries. So conventions survive in code templates and continue propagating.
The paper’s key nuance:
- Localized conventions (single clause/function) may be fixable with targeted code-level intervention.
- Pervasive procedural conventions are harder; they shape full solution structure and resist simple cleanup.
5) Silent wrong answers are a serious failure mode
In eICU, contamination failures are predominantly wrong answers (silent). In MIMIC-III, no-answer failures are more frequent (the paper reports 41% of failures as no-answer there).
Operationally, silent wrong answers are riskier than visible breakage because users trust them.
What this means for practitioners
If you ship shared-state agents, here are the practical implications:
1) Stop treating memory as neutral storage
Memory is a behavior-shaping substrate. What you persist is policy.
If a record can influence future generation, it needs scope metadata and validity boundaries.
2) “No attacker” does not mean “no security problem”
Many teams place safeguards primarily around malicious input patterns. Useful, but incomplete.
UCC shows you can get cross-user harm through ordinary usage dynamics.
3) Text-only sanitization is insufficient for artifact-heavy systems
If retrieval includes code snippets, workflows, SQL templates, tool configs, or executable plans, sanitize those artifacts too—not just prose summaries.
4) Provenance must become first-class
Every persisted artifact should carry:
- origin user/context,
- task type,
- scope constraints,
- recency and confidence,
- revalidation requirements before cross-user reuse.
Without provenance gates, reuse behaves like global mutable state in distributed systems: eventually, someone gets burned.
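A provenance-carrying memory record plus its reuse gate might look like the sketch below. Every field name, the one-week revalidation window, and the gating rules are illustrative defaults, not a standard:

```python
from dataclasses import dataclass, field
import time

@dataclass
class MemoryArtifact:
    content: str
    origin_user: str
    task_type: str
    scope: set = field(default_factory=set)   # e.g. {"team:billing"}
    created_at: float = field(default_factory=time.time)
    confidence: float = 1.0
    max_age_s: float = 7 * 24 * 3600          # revalidate after a week

    def reusable_for(self, user, task_type, user_scopes):
        # Same-user reuse is always allowed; cross-user reuse must
        # match the task type, share a scope tag, and still be fresh.
        if user == self.origin_user:
            return True
        fresh = (time.time() - self.created_at) < self.max_age_s
        return fresh and task_type == self.task_type and bool(self.scope & user_scopes)
```

The point is not these particular rules; it's that reuse becomes an explicit, auditable decision instead of an implicit side effect of retrieval.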
5) Add contamination tests to your eval suite
You likely already run single-user quality evals. Add clean-vs-contaminated A/B tests for shared-state regression:
- inject benign source convention,
- run victim tasks,
- measure drift rate and failure mode (silent wrong vs visible fail).
If you don’t test this explicitly, you won’t see it until production reports pile up.
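The measurement loop, including the silent-wrong vs. no-answer split, fits in a few lines. `run_task` and the task tuples are placeholders for your own eval harness; the assumption here is that victim tasks were pre-filtered to pass in a clean state:

```python
# Sketch: measure contamination drift over a suite of victim tasks
# (assumed to pass in a clean state) and bucket the failure modes.

def contamination_report(run_task, victim_tasks, source_artifact):
    counts = {"clean": 0, "silent_wrong": 0, "no_answer": 0}
    for task, expected in victim_tasks:
        out = run_task(task, state=[source_artifact])
        if out == expected:
            counts["clean"] += 1
        elif out is None:
            counts["no_answer"] += 1      # visible breakage
        else:
            counts["silent_wrong"] += 1   # plausible but wrong: the risky bucket
    total = len(victim_tasks)
    counts["drift_rate"] = (counts["silent_wrong"] + counts["no_answer"]) / total
    return counts
```

Tracking the two failure buckets separately matters because, per the paper's eICU results, the silent-wrong bucket is the one your users will never report.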
A simplified engineering playbook inspired by the paper
Here’s a concrete control stack for agent teams:
- Write-time filter (baseline): sanitize messages before persistence.
- Artifact sanitizer: parse and normalize stored code/workflows/templates.
- Scope tags: user/team/project/domain constraints on every memory item.
- Read-time policy checks: block or down-rank mismatched-scope artifacts.
- Revalidation prompts: require model/tool to justify cross-user reuse.
- Dual execution for high risk: compare “with-memory” vs “fresh generation.”
- Failure-mode telemetry: separately track wrong-answer vs no-answer contamination incidents.
The paper directly supports step 1 and clearly motivates steps 2–7.
Limitations to keep in mind
The authors are fairly clear-eyed, and practitioners should be too.
- The study uses two representative mechanisms, not every possible agent architecture.
- Some datasets are domain-specific (including clinical query environments), so absolute rates may differ by domain.
- The sanitizer evaluated (SSI) is a practical baseline, not an exhaustive defense.
Still, the underlying phenomenon is architecture-level and general: shared persistence can misapply local conventions across users.
Our take
This paper is one of those deceptively simple contributions that can quietly reshape production best practices.
Everyone talks about attacker-driven memory poisoning. Fewer teams are instrumenting for benign contamination drift. Yet in real systems, benign misuse can be more common than explicit attacks.
The central lesson: statefulness needs scope intelligence.
If your agent can remember, it must also know who that memory is for, when it is valid, and when it should be ignored.
Otherwise, your “helpful shared memory” is just a polite way of saying “non-deterministic cross-user bug surface.”
For builders in the OpenClaw/agent ecosystem, this should push architecture decisions now, not later:
- stronger memory schemas,
- provenance-aware retrieval,
- artifact-level sanitization,
- contamination-specific evaluation gates.
If you do that, you keep the upside of persistent agents without inheriting silent cross-user failure as a design tax.
And yes, this is one of those cases where reliability and security are the same conversation.
Final citation
Yang, T., Li, J., Nian, Y., Dong, S., Xu, R., Rossi, R., Ding, K., & Zhao, Y. (2026). No Attacker Needed: Unintentional Cross-User Contamination in Shared-State LLM Agents. arXiv:2604.01350. https://arxiv.org/abs/2604.01350
This is part of our nightly AI paper series, where we distill practical arXiv research into actionable insights for builders.
Building stateful AI agents?
Our OpenClaw field guide covers agent memory, orchestration, and production safeguards so you can keep persistence useful without turning it into a shared failure surface.
Get the Field Guide — $10 →