On June 5, 404 Media reported that attackers used Meta's AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses they controlled, and the agent complied. One attacker broke into Obama's inactive White House account and posted pro-Iran messages; others took over accounts with valuable single-word handles, perhaps with the aim of selling them.

AI cybersecurity concerns are not new. Since Anthropic announced in April that its Mythos model was too good at hacking to be made public, commentators, researchers and federal officials have focused on the idea that super-powered AI systems could devastate our IT infrastructure. That's not quite what happened in the Instagram hack: There, the AI was the target rather than the attacker, and the method was far simpler than anything Mythos could have concocted. But as companies hand over more work to AI, these relatively unsophisticated attacks could wreak havoc of their own.

“As AI is used more and more, especially as it is increasingly used to automate our workflows, such as account recovery, I think attackers will be increasingly motivated to attack the AI itself,” says Neil Gong, professor of electrical and computer engineering at Duke University.

Gong and other researchers have been warning about the security vulnerabilities of AI agents for some time. They publish articles and blog posts detailing exploits such as indirect prompt injection, which involves hijacking agents using commands hidden in websites, emails, or other seemingly innocuous data sources. Compared to these techniques, the Meta hack was almost absurdly simple. The only complication the hackers had to overcome was using a VPN that matched the location of the real account owner; then they directly asked the support agent to change the account email address, and it complied.

Meta has not publicly commented on how this vulnerability slipped through the cracks. But given the simplicity of the exploit, Gong says, it should have been easily discovered before the agent was deployed. “It’s really surprising,” he said. “I don’t understand why they didn’t find this simple problem.”

Jessica Ji, a senior research analyst at Georgetown's Center for Security and Emerging Technology, agrees. “It raises questions like: Were there even guardrails in place?” she said. “Has anyone thought about testing this kind of scenario?” She notes that the oversight is particularly striking from a company like Meta, which has extensive expertise in both AI and cybersecurity. Meta did not respond to a request for comment for this article, but on Monday a Meta spokesperson said on X that the vulnerability had been fixed.

As embarrassing as this may be for Meta in particular, it also highlights some fundamental vulnerabilities shared by all AI agents. Unlike traditional software, agents can react flexibly and unexpectedly to new circumstances, which is why they could replace human customer support agents. But AI agents can also be fooled in ways that humans wouldn't be, and because they can take actions in the real world, these mistakes have consequences. “A human would say, ‘Okay, why do you want to change the email address?’ and maybe respond with a security question,” says Somesh Jha, professor of computer science at the University of Wisconsin-Madison. “What happens with these agents is that they are very eager to complete the task. It's almost like an elementary school student who just wants to please the teacher.”

There are ways to mitigate the risks. Companies can use traditional software to create guardrails that ensure agents follow strict rules, such as always requesting answers to security questions before sending sensitive account information to a new email address. And the experts consulted for this article all agree that agents must undergo rigorous red-teaming, a process in which developers do their best to attack a system to discover its vulnerabilities before deployment.
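To make the guardrail idea concrete, here is a minimal sketch in Python of what such a rule could look like. It is an illustration under stated assumptions, not a description of Meta's system: the names `Session`, `AccountStore`, `verify_security_answer`, and `execute_agent_action` are all hypothetical. The point it shows is that the check lives in ordinary code outside the model, so an agent cannot be talked out of it.

```python
import hmac
from dataclasses import dataclass, field

# Hypothetical action names for illustration; not drawn from any real system.
SENSITIVE_ACTIONS = {"change_email"}


@dataclass
class Session:
    account_id: str
    # Set only by deterministic verification code, never by the agent.
    identity_verified: bool = False


@dataclass
class AccountStore:
    emails: dict = field(default_factory=dict)
    # Plaintext answers keep the sketch short; a real system would not store them this way.
    security_answers: dict = field(default_factory=dict)


def verify_security_answer(store: AccountStore, session: Session, answer: str) -> bool:
    """Compare the user's answer to the stored one and record the result on the session."""
    expected = store.security_answers.get(session.account_id)
    ok = expected is not None and hmac.compare_digest(
        answer.strip().lower().encode(), expected.encode()
    )
    session.identity_verified = ok
    return ok


def execute_agent_action(store: AccountStore, session: Session, action: str, params: dict) -> str:
    """Run an action the agent requested, enforcing the rule in plain code."""
    if action in SENSITIVE_ACTIONS and not session.identity_verified:
        # The agent can ask for this as persuasively as it likes; the code still refuses.
        return "denied: identity verification required"
    if action == "change_email":
        store.emails[session.account_id] = params["new_email"]
        return "ok: email updated"
    return f"denied: unknown action {action!r}"
```

In this sketch, a request to change the email address is refused until `verify_security_answer` has succeeded for the session, regardless of what the conversation with the agent contained.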

But there are also opposing forces. Companies want to deploy high-performance agents, and the more power an agent has (and the fewer safeguards it is subject to), the more work it can potentially accomplish. “Security and utility always require a tradeoff,” says Bo Li, a computer science professor at the University of Illinois at Urbana-Champaign. And proper red-teaming can be expensive. Defenders have to spend more resources than attackers because attackers only need to discover one exploit, while defenders try to discover and fix as many as they can. When attackers are after something as valuable as a one-word Instagram handle, they devote resources to finding exploits, so defenders have to spend even more money to protect that prize.

As AI models continue to improve, strengthening their defenses could become easier. Although the probabilistic nature of large language models means that LLM agents will always be vulnerable to some forms of attack, a more sophisticated model might have flagged as suspicious an attempt to change the email address associated with Obama's White House account. And AI systems can be used to red-team agents, just as participants in Anthropic's Glasswing project use Mythos to identify vulnerabilities in their software.

Nonetheless, experts expect the problem of securing AI agents to become even more pressing in the future. As agents become more skilled, companies adopting them may want to give them more power, both to provide more services with fewer humans and to avoid being left behind by competitors. In the rapidly evolving world of AI, the time required to carefully secure at-risk agentic systems may seem like an unconscionable delay.

“Everyone wants to be the first to do something and get things done without scrutiny or red teaming,” says Jha. “I think it’s a very dangerous thing.”

![The Meta hack shows that AI security is about more than Mythos](https://wp.technologyreview.com/wp-content/uploads/2026/06/hand-keys1.jpg?resize=1200,600)