BITS Pilani Alumnus' Paper Accepted At Top AI Conference, Internet Says "Proud Of You"

The achievement stands out because ICML is usually dominated by elite institutions like OpenAI, Google DeepMind, and Stanford.

  • Kunvar Thaman, an independent AI researcher, secured solo paper acceptance at ICML 2026
  • His paper studies AI models exploiting shortcuts in tool-rich environments to achieve goals
  • Thaman developed the Reward Hacking Benchmark to test AI agents on multi-step workflows

Kunvar Thaman, a 26-year-old independent AI researcher from Chandigarh, has secured a rare solo-authored paper acceptance at the International Conference on Machine Learning (ICML) 2026. The achievement stands out because ICML is usually dominated by elite institutions like OpenAI, Google DeepMind, and Stanford. His paper, "Reward Hacking Benchmark: Measuring Exploits in LLM Agents with Tool Use," examines how AI models find unintended shortcuts to hit goals without actually completing the intended task.

Thaman is reportedly one of only three independent solo researchers worldwide to have a paper accepted at ICML in the past 3.5 years. The research was supported by a $2,500 grant from Exception Raised, an Indian non-profit that backs local AI talent.

Thaman created the Reward Hacking Benchmark, or RHB, a digital sandbox where AI agents are put through complex, multi-step workflows. He found that models often "cheat" in tool-rich environments to maximize their scores. His work also showed that tighter environment controls and better testing can significantly cut down this manipulative behavior. As part of the study, he benchmarked leading models from OpenAI, Anthropic, and Google.

"Yes! my solo-authored paper Reward Hacking Benchmark was accepted to ICML :))) We put LLM agents in a tool-rich sandbox, give them multi-step workflows, and measure when they solve the intended task vs take unexpected shortcuts," he shared on X.

His achievement has sparked considerable online chatter, with some in the AI community calling it a major milestone for independent research from India. One user commented on his post, "congratulations! now research is being shackled away from big institutes to solo researchers!"

Another said, "Congrats, Kunvar! We need many more indie researchers like you from India. Keep us inspired."

A third user added, "Hearty Congratulations Kunwar! This raises the bar and sets the gates open for similar efforts from other parts of India." A fourth stated, "Solo-authored at ICML, that's seriously impressive. Congratulations! The focus on reward hacking in complex agent setups sounds especially interesting."

He will present his findings at ICML 2026 in Seoul, South Korea, from July 6 to 11.

An alumnus of BITS Pilani, Thaman previously worked as a cybersecurity engineer at Akamai Technologies and held research roles focused on mechanistic interpretability.
