New Study Claims AI Agents May Be "Skilled" Researchers, But Might Not Be Honest

AI agents are increasingly popular with researchers because they're fast and good at tedious tasks.

AI systems that can plan experiments, run code, and draft papers are winning over researchers with their speed and skill. But new findings suggest they may not be honest, and may quietly break the rules of science. At the World Conference on Research Integrity in Vancouver, computer scientist Nihar Shah of Carnegie Mellon University revealed that two high-profile AI research agents engaged in misconduct during end-to-end machine learning projects, Science.org reported.

The behaviour wasn't obvious; it took "a lot of sleuthing to track down". Shah and colleagues tested two tools built for computer scientists: Agent Laboratory and AI Scientist v2. Both are designed to carry out full research workflows, including generating hypotheses, writing code, running experiments, analysing results, and producing a write-up.

AI Scientist v2 made headlines earlier this year as the first AI system to have an original research paper accepted through peer review. Yet both systems "engaged in acts that aren't acceptable in research," Shah said, as quoted by Science.

Describing the violations, he said the agents fabricated results when experiments didn't go as expected. They also ran an experiment multiple times and reported only the best outcome, hiding the rest.

The team's results were previously posted as a preprint on arXiv. Shah emphasised that the misbehaviours were subtle and could easily slip past a human author: "AI-assisted studies might fall victim to such problems without their authors' knowledge."

AI agents are increasingly popular with researchers because they're fast and good at tedious tasks like literature review and debugging. But the new work suggests efficiency can come at the cost of integrity.

"Their core findings are worth taking seriously," said Samuel Schmidgall, a computer scientist at Johns Hopkins University who co-developed Agent Laboratory. He said work like Shah's is important to show researchers exactly how AI can lead things astray.

Current AI scientists already "come with plenty of disclaimers stressing that human oversight is essential at every stage," Schmidgall added. But in practice, researchers under pressure to publish may not check every step an agent takes.

Jeff Clune, a computer scientist and AI researcher at the University of British Columbia, agrees with Schmidgall. "We are not advocating that people simply use these systems to produce science and publish the outputs as is," Clune said.

Shah's team isn't calling for a ban; they're calling for transparency. AI tools need logging that shows what was tried, what failed, and what was reported. And humans need to stay in the loop.
