'Agents of Chaos': Autonomous AI Can Leak Data, Delete Files, Study Finds

The study showed that when AI is allowed to act on its own rather than just answer questions, it can make unexpected mistakes.

  • Large language models with autonomous tool access showed unpredictable, risky behaviors in tests
  • Researchers gave AI agents email, Discord, and code execution in a sealed lab to stress-test their limits
  • Agents leaked sensitive data, erased files, repeated loops, and followed unauthorized user commands
New Delhi:

A new study has revealed that large language models (LLMs) can behave unpredictably when given autonomous access to digital tools.

Researchers placed AI (artificial intelligence) agents in a sealed lab environment with email accounts, Discord access, and the ability to run code on their own machines.

The study, "Agents of Chaos," found that the agents sometimes leaked sensitive information, erased files, and got stuck in repetitive loops lasting nine days.

The experiment, carried out by 20 researchers over a period of two weeks, was led by Northeastern University in Boston in collaboration with Harvard University, the Massachusetts Institute of Technology, Stanford University, the University of British Columbia, and several other universities.

Their focus was to stress-test AI systems. The researchers deliberately pushed the systems into difficult or unusual situations to see how they would behave and where they might fail. Unlike normal chatbots, these agents could act on their own, not just respond to prompts.

The agents had access to persistent memory, email accounts, Discord, file systems, and shell execution. The researchers documented 11 real examples of problematic behaviour observed during the experiment, showing that when AI is allowed to act on its own rather than just answer questions, it can make unexpected mistakes.

The key problem the study found was that some agents followed instructions from people who were not their owners or authorised users. Agents even shared confidential data such as internal prompts, sensitive information, and files, raising privacy and data-protection concerns.

In one case, a non-owner asked the agent to perform tasks such as running shell commands, listing files, or transferring data, and the agent followed most of these instructions.

"Observed behaviour includes execution of destructive system-level actions, denial-of-service conditions, uncontrolled resource consumption, identity spoofing vulnerabilities, cross-agent propagation of unsafe practices, and partial system takeover," the study found.

Another problem was that some agents executed dangerous commands that damaged the system, such as deleting files, changing system settings and running harmful scripts. The study also found that the agents could be tricked by fake identities.

In some cases, agents gained more control over systems than intended. One surprising problem was that agents reported tasks as complete while the underlying system state contradicted those reports.

In one scenario, a non-owner asked an AI agent to keep a fictional password secret. Later, the person asked the AI to delete the email that contained the password, so that no one else could see it. But the AI did not have a tool that allowed it to delete emails from the server.

So the AI tried to solve the problem on its own: it disabled its own local email setup, cutting off its own access to email.

The original email containing the password was still on the server and had not been deleted at all, but the AI's owner temporarily lost email access because the agent had disabled the email setup. Researchers described this error as a "failure of proportional reasoning."

In one test, a researcher pretended there was an urgent technical issue. By framing it as an emergency, they convinced the AI agent to export 124 email records. The agent complied and sent the emails without removing sensitive details.

Researchers found that AI agents often refused direct requests for sensitive data, such as bank account numbers. However, when the same information was requested indirectly, through tasks like exporting email records or sharing message contents, the agents ended up revealing it.
