Threat Intelligence 2026-06-13 9 min read

Vishing in 2026: Why Voice Phishing Is Now the Top Cloud Breach Vector

Google/Mandiant M-Trends 2026 ranks voice phishing the #2 initial breach vector worldwide — and #1 in cloud environments. AI-driven TOAD kits and deepfake meetings explain the surge, and why awareness training is the only defence that scales.

Source: Arsen, "Why Vishing Has Become the Main Cyber Threat" (29 Apr 2026), based on Google/Mandiant M-Trends 2026

For the first time, the Google/Mandiant M-Trends 2026 report opens not with ransomware or zero-days, but with voice phishing. Vishing is now the second most common way attackers gain initial access overall — and the single most common vector in cloud environments. Firewalls cannot intercept a phone call; the only durable control is a workforce trained to recognise the attack.

Why M-Trends 2026 Leads With Vishing

The M-Trends 2026 analysis is built on more than 500,000 hours of frontline incident response conducted by Mandiant during 2025, combined with Google Threat Intelligence Group telemetry. That a report of this scale opens on voice phishing — ahead of ransomware, exploits, and nation-state APTs — is the headline finding. As the report states: "We are tracking a significant shift toward voice-based social engineering (vishing), which has risen to the number two spot for initial infection vectors."

The shift is structural, not seasonal. Email phishing fell from 14% of intrusions in 2024 to just 6% in 2025, while vishing climbed to 11%. Email security infrastructure has matured — gateways, sandboxing, and link rewriting have made the inbox a harder target. A live human voice, patient and contextual, remains far harder to distrust. In cloud-specific compromises the imbalance is starker still: vishing accounts for 23% of incidents, surpassing stolen credentials (16%), email phishing (15%), and exploits (6%).

The 2025 Initial-Access Rankings

The report draws a sharp conceptual line between email phishing as a "non-interactive technical lure" and vishing as "interactive human engagement". That distinction is operational, because interactive attacks resist automated technical defences in ways static lures never did. Mandiant's breakdown of confirmed 2025 intrusions shows vishing's rise alongside the decline of email phishing and stolen credentials:

Exploits (CVEs) — 32%, stable and #1 for the sixth consecutive year
Voice phishing (vishing) — 11%, a significant increase and now the #2 vector
Prior compromise — ~10%, up from #5 in 2024
Stolen credentials — 9%, down from 16% in 2024
Web compromise — 8%, stable
Email phishing — 6%, down sharply from 14% in 2024
Insider threat — 6%, up from 5% in 2024

Google/Mandiant M-Trends 2026 — voice phishing ranked the #2 initial infection vector of 2025
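The year-on-year movement in the rankings above is easier to compare when expressed both in percentage points and in relative terms. The short sketch below does that arithmetic for the three vectors where the article quotes both a 2024 and a 2025 figure; the dictionary and the `change` helper are illustrative names, not anything taken from the report.

```python
# Shares of intrusions by initial vector, as quoted in the article.
# Only vectors with both a 2024 and a 2025 figure in the text are included.
shares = {
    "email phishing": (14, 6),
    "stolen credentials": (16, 9),
    "insider threat": (5, 6),
}


def change(prev: float, curr: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    points = curr - prev
    relative = (curr - prev) / prev * 100
    return points, relative


for vector, (y2024, y2025) in shares.items():
    points, relative = change(y2024, y2025)
    print(f"{vector}: {y2024}% -> {y2025}% ({points:+.0f} pts, {relative:+.0f}% relative)")
```

For email phishing this works out to a drop of 8 percentage points, or roughly 57% in relative terms, which is why the decline reads as structural rather than as noise.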

Anatomy of an AI-Driven Vishing Attack

Phase 1

Lure Delivery — A notification with only a phone number

The attack begins with a TOAD (Telephone-Oriented Attack Delivery) message: a convincing brand notification — Google, Microsoft, Coinbase, or Binance — citing a locked account, a failed sign-in, or an unfamiliar login location. Crucially it contains no malicious link and no attachment, only a phone number. With nothing for a secure email gateway to detonate, the message sails through technical filtering and lands in the inbox.
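The structure of a TOAD lure can be made concrete with a small sketch. The code below is a deliberately crude heuristic, not a description of how any real email gateway works: it flags a message that names one of the brands mentioned above, carries an urgency cue and a phone number, and contains no URL and no attachment. The `Message` class, the regular expressions, and the keyword lists are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, deliberately simple patterns; real mail parsing needs far more care.
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
BRANDS = ("google", "microsoft", "coinbase", "binance")  # brands named in the article
URGENCY = ("locked", "failed sign-in", "unfamiliar login", "suspended")  # assumed cues


@dataclass
class Message:
    subject: str
    body: str
    attachments: list[str] = field(default_factory=list)


def looks_like_toad(msg: Message) -> bool:
    """Flag a brand notice with a callback number and urgency cue but no link or attachment."""
    text = f"{msg.subject}\n{msg.body}".lower()
    has_phone = PHONE_RE.search(text) is not None
    has_url = URL_RE.search(text) is not None
    mentions_brand = any(brand in text for brand in BRANDS)
    urgent = any(cue in text for cue in URGENCY)
    return has_phone and mentions_brand and urgent and not has_url and not msg.attachments


lure = Message(
    subject="Microsoft account locked",
    body="We detected an unfamiliar login. Call +1 (800) 555-0100 to restore access.",
)
print(looks_like_toad(lure))
```

The point of the sketch is what it leaves out: there is no link to rewrite and no file to sandbox, so the only signals left are text patterns that legitimate support notices also share. The phone pattern would also match other digit runs such as IP addresses or dates, which is one reason a rule like this would be noisy in practice.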

Phase 2

Callback Routing — The victim dials in

When the target calls the embedded number, platforms like ATHR route the call to either a human operator or an AI voice agent running on Asterisk WebRTC infrastructure — the same telephony stack used by legitimate call centres. The victim has now initiated contact, which lowers their guard: in their mind, they called the company, not the other way around.

Phase 3

AI Social Engineering — The agent works a script

ATHR's AI vishing agents follow a ten-step structured methodology: they authenticate the callback, describe a fabricated account irregularity, build urgency, and walk the victim toward surrendering a six-digit verification code — entirely without a human operator. A single operator can run campaigns against multiple brands simultaneously, each call adapting to the victim's responses in real time.

Phase 4

Live Credential Harvesting — Captured mid-call

While the voice interaction continues, ATHR's phishing interfaces capture credentials instantly. Operators watch each target as a live session and redirect them to tailored pages during the call. The MFA code spoken aloud is replayed against the real service before it expires — the account is taken over while the victim is still on the line believing they are being helped.
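Why the relay has to happen live comes down to simple timing. The sketch below assumes the six-digit code is a time-based one-time password with a 30-second step, which is the default in RFC 6238; the source does not say which kind of code ATHR targets, and SMS or email codes usually stay valid for longer. The function names and the `grace_steps` parameter are hypothetical.

```python
TIME_STEP = 30  # seconds; the RFC 6238 default, assumed here and not stated in the source


def seconds_until_expiry(unix_time: int, step: int = TIME_STEP, grace_steps: int = 0) -> int:
    """Seconds for which a code from the current time step is still accepted.

    grace_steps models verifiers that also accept codes from earlier steps.
    """
    elapsed_in_step = unix_time % step
    return (step - elapsed_in_step) + grace_steps * step


def relay_succeeds(read_at: int, relay_delay: int, **kwargs: int) -> bool:
    """True if a code read aloud at read_at is replayed before it stops being accepted."""
    return relay_delay < seconds_until_expiry(read_at, **kwargs)


# A code read out 25 seconds into its step leaves only 5 seconds to replay it.
print(seconds_until_expiry(85))
print(relay_succeeds(85, relay_delay=3))
print(relay_succeeds(85, relay_delay=10))
```

Under these assumptions an attacker who writes the code down and tries it a minute later fails, while one who types it in as the victim speaks succeeds. That is the practical reason the harvesting console and the voice call run side by side.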

Two Real-World Vishing Playbooks

Scenario 1 — The ATHR TOAD Kit Industrialises the Call

In April 2026, Abnormal AI researchers documented ATHR, a cybercrime platform sold on underground forums for $4,000 plus 10% of profits. It is the clearest demonstration to date of vishing being turned into a packaged product. ATHR consolidates the entire attack workflow into a single browser-based console, removing the need for an experienced social engineer and collapsing the cost of running high-volume campaigns:

Lure delivery — integrated mailers generate fraudulent brand notifications with adjustable personalization: lock timeframes, failed-login counts, last-access locations, and IP addresses
Callback routing — embedded numbers route victims to human operators or AI voice agents on Asterisk WebRTC infrastructure
AI-driven social engineering — a ten-step agent authenticates the callback, invents account irregularities, and extracts six-digit codes with no human in the loop
Live credential harvesting — phishing pages capture credentials in real time while operators monitor each victim as an active session
Multi-brand scale — a single operator runs simultaneous campaigns against Google, Microsoft, Coinbase, Binance and more, systematically targeting finance teams, help desks, and IT administrators

Scenario 2 — UNC1069: Fake Meetings, Voice Capture, and Deepfakes

Validin researchers detailed UNC1069 (overlapping with North Korea's Bluenoroff) in April 2026, targeting cryptocurrency, Web3, and financial-services organisations. Attackers pose as venture-capital professionals on LinkedIn and Telegram, often from compromised accounts, then send Calendly links to counterfeit video-conferencing platforms that mimic Zoom, Google Meet, and Microsoft Teams. Mid-call, they claim the victim's mic or camera is broken and push ClickFix-style prompts to run commands and "fix" the issue. The decisive twist: these fake meeting interfaces record the target's audio and video, which are then reused to impersonate them in later operations — including deepfakes of executives. The voice channel serves simultaneously as the attack vector and an intelligence-gathering tool.

Why Vishing Works When Email Phishing Fails

Both playbooks exploit the same structural advantage: trust in real-time human contact. Email phishing asks the target to click a link with uncertain consequences; vishing puts a patient, contextual voice on the line that adapts to every answer. Automated gateways can inspect an attachment — they cannot detect persuasive pretexting delivered at a measured pace in a familiar regional accent. As the research frames it, email phishing relies on "volume and opportunistic delivery", while interactive vishing involves "a live person, or now, an AI, steering the conversation in real-time."

Interactive, not static — a live voice adapts to hesitation and objections in ways a fixed email lure cannot
No payload to detect — TOAD messages carry no link and no attachment, so secure email gateways have nothing to detonate
Victim-initiated contact — the target dials the number, which disarms suspicion and frames the attacker as the trusted party
AI removes the skill barrier — voice agents and cloning let inexperienced operators run convincing, large-scale campaigns
Firewalls are blind to it — no perimeter control inspects a telephone call, so detection must move to the human

KEY TAKEAWAYS

1. Vishing is now the #2 initial infection vector overall and #1 in the cloud: voice phishing accounts for 11% of all 2025 intrusions and 23% of cloud compromises
2. Email phishing is collapsing as vishing rises: it fell from 14% of intrusions in 2024 to just 6% in 2025 as gateways hardened and attackers moved to the phone
3. AI has industrialised the call: kits like ATHR automate the entire conversation for $4,000, removing the need for skilled social engineers
4. Voice is now an intelligence target: actors like UNC1069 record victims in fake meetings to fuel deepfake impersonation later
5. Technical controls cannot stop a phone call: the only control that scales is a workforce trained against realistic voice attacks

How to Defend: Train People to Recognise the Call

Technical tooling stops known malware; it cannot stop a caller claiming to be IT and requesting CFO sign-off on a wire transfer because the executive is unreachable. The only sustainable defence is behavioural reinforcement through authentic simulation — exposing teams to the exact voice dynamics attackers use, in a safe environment, with feedback delivered at the moment of failure rather than in a distant quarterly session. An effective programme should include:

A scenario library covering IT help-desk impersonation, executive fraud, HR callbacks, vendor impersonation, and MFA-bypass pretexts that mirror real attacker approaches
Realistic voice options — standard AI voice, regional-accent customization, and voice-cloned executive impersonation to test resilience against the techniques UNC1069 already deploys
Spoofed caller ID that mirrors internal or known-vendor numbers, because generic numbers test a lower-risk scenario than attackers actually use
Real-time call monitoring and compliance measurement that shows which departments comply, which roles disclose credentials under stress, and which pretexts succeed
Automatic post-call training delivered at the moment of failure, not weeks later
Management dashboards giving CISO-level risk visibility broken down by department, role, and location — and AI-powered simulation for campaigns of 100+ users or for testing voice-cloning and deepfake resilience
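The measurement side of such a programme can be sketched in a few lines. The example below groups simulated call outcomes by department, role, or pretext and reports the share that ended in disclosure. The `CallResult` record, the `failure_rates` helper, and the five sample rows are invented for illustration; they are not real campaign data and do not reflect any particular product's schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallResult:
    department: str
    role: str
    pretext: str
    disclosed: bool  # True if the employee gave up a code or credential


def failure_rates(results: list[CallResult], key: str) -> dict[str, float]:
    """Share of simulated calls that ended in disclosure, grouped by one attribute."""
    totals: dict[str, int] = defaultdict(int)
    failures: dict[str, int] = defaultdict(int)
    for result in results:
        group = getattr(result, key)
        totals[group] += 1
        failures[group] += result.disclosed  # bool counts as 0 or 1
    return {group: failures[group] / totals[group] for group in totals}


# Invented sample outcomes, for illustration only.
results = [
    CallResult("Finance", "Analyst", "executive fraud", True),
    CallResult("Finance", "Controller", "vendor impersonation", False),
    CallResult("IT", "Help desk", "MFA-bypass", True),
    CallResult("IT", "Admin", "MFA-bypass", True),
    CallResult("HR", "Recruiter", "HR callback", False),
]

print(failure_rates(results, "department"))
print(failure_rates(results, "pretext"))
```

Grouping the same records by `role` or `pretext` answers the other two questions in the list above, which is the kind of breakdown a department-level dashboard would need.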

Key Takeaway

M-Trends 2026 confirms what frontline responders have felt for a year: the perimeter has moved to the human voice, and attackers — now armed with AI agents and voice cloning — have followed it there. Hardening the inbox simply pushed adversaries to the phone, where no firewall can follow. Organisations that treat vishing as a training problem, not a tooling problem, and that rehearse their people against realistic, AI-driven voice attacks, are the ones that will not be the next M-Trends statistic.

You cannot patch a phone call. When the attacker is a patient voice — or an AI imitating one your team already trusts — the only control that holds is a workforce that has heard the attack before and knows how to hang up.
