Human-in-the-Loop Overwhelm

A social-engineering attack on the humans approving agent actions — flooding the review queue until reviewers rubber-stamp risky requests out of fatigue.

What it is

Many production agents are designed to escalate sensitive actions to a human for approval: a refund over $X, a database delete, an outbound email to a new domain. Human-in-the-loop is one of the strongest defenses available, because it puts a thinking person between the agent and the side effect. But that defense degrades quickly when the queue gets long. Attackers can deliberately flood the queue with low-stakes approvals (or recruit an agent into doing so) until the reviewer is clearing each item in a few seconds, without giving it any real thought. The malicious request is buried in the noise; by the time it surfaces, the reviewer's habit is "click approve."

Why it matters

Decision fatigue is well studied: judges have been reported to grant fewer paroles late in a session, doctors miss diagnoses on long shifts, and security analysts wave through alerts after the hundredth false positive. AI agents change the volume curve dramatically: a single agent can generate more approval requests in an hour than a human team historically saw in a week. The mitigation is not "have more humans." It is risk-based queue prioritization (high-risk requests surface first, so high-volume requests cannot bury them), per-reviewer rate limits, mandatory cooling-off breaks on long sessions, and randomized "canary" requests that test whether reviewers are still reading.
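
To make two of these mitigations concrete, here is a minimal Python sketch of a review queue that serves the riskiest pending request first and caps how many items each reviewer can be handed per time window. Everything in it is an illustrative assumption rather than a reference to any real product: the ReviewQueue class, the numeric risk score (assumed to come from some upstream scoring step), and the default limits are all hypothetical.

```python
import heapq
import itertools
import time
from collections import defaultdict, deque


class ReviewQueue:
    """Hypothetical queue: highest risk first, with a per-reviewer pace cap."""

    def __init__(self, max_per_window=20, window_seconds=3600.0, clock=time.monotonic):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: FIFO among equal risk
        self._served = defaultdict(deque)  # reviewer -> times items were handed out
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the cap can be tested without sleeping

    def submit(self, request, risk):
        # heapq is a min-heap, so negate risk to pop the highest risk first.
        heapq.heappush(self._heap, (-risk, next(self._counter), request))

    def next_for(self, reviewer):
        """Return the next request, or None if the queue is empty
        or this reviewer has reached the cap for the current window."""
        now = self._clock()
        served = self._served[reviewer]
        # Forget hand-outs that have aged out of the sliding window.
        while served and now - served[0] >= self.window_seconds:
            served.popleft()
        if len(served) >= self.max_per_window or not self._heap:
            return None
        served.append(now)
        return heapq.heappop(self._heap)[2]
```

With this ordering, a flood of low-risk refunds submitted ahead of a single high-risk database delete does not push the delete to the back: the delete is handed out first, and a reviewer who reaches the cap gets nothing further until the window rolls over. A real deployment would also need to decide what happens to high-risk items while every reviewer is capped; this sketch leaves that open.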

Key components

  • Volume flooding — burying high-stakes requests in low-stakes ones
  • Decision fatigue — degraded judgment after sustained review
  • Habituation — reviewers learning to click-approve as default
  • Insider variant — staff intentionally flooding to push something through
  • Mitigation — risk-weighted prioritization, reviewer rate limits, canary requests, escalation breakers
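
The canary and breaker ideas in the last bullet can be sketched together. The Python below assumes one possible reading of "escalation breaker": a switch that stops a reviewer's approvals from taking effect once they have approved too many known-bad canary requests in their recent history. The CanaryMonitor class, its thresholds, and the reset-by-supervisor policy are all hypothetical choices made for illustration.

```python
import random
from collections import deque


class CanaryMonitor:
    """Hypothetical monitor: tracks decisions on canary requests (which
    should always be rejected) and trips a breaker on too many misses."""

    def __init__(self, canary_rate=0.05, window=10, max_misses=2, rng=None):
        self.canary_rate = canary_rate
        self.max_misses = max_misses
        # Most recent canary outcomes; True means the canary was wrongly approved.
        self._recent = deque(maxlen=window)
        self._rng = rng if rng is not None else random.Random()
        self.tripped = False

    def should_inject(self):
        """Randomly decide whether the next item shown should be a canary."""
        return self._rng.random() < self.canary_rate

    def record_canary(self, approved):
        """Record the reviewer's decision on a canary request."""
        self._recent.append(bool(approved))
        if sum(self._recent) > self.max_misses:
            self.tripped = True  # stays tripped until explicitly reset

    def allow_approvals(self):
        """False once the breaker has tripped."""
        return not self.tripped

    def reset(self):
        """Clear history and re-arm, e.g. after a break or supervisor review."""
        self._recent.clear()
        self.tripped = False
```

The review tool would call should_inject() before showing each item, record_canary() after each canary decision, and check allow_approvals() before executing any approved action. Keeping the breaker latched until reset() is a deliberate choice in this sketch: a fatigued reviewer should not be able to clear it simply by continuing to click.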
