- Mrinank Sharma resigned as head of Anthropic's safeguards research team, citing value conflicts
- His resignation followed Anthropic's release of Claude Opus 4.6, an upgraded AI model
- Sharma warned of rising global crises and challenges in aligning AI with core values
Mrinank Sharma, the head of Anthropic's safeguards research team, announced his resignation in an X (formerly Twitter) post on Monday (Feb 9), which quickly garnered attention on social media. Sharma's rather cryptic post, in which he references poets Rilke and William Stafford, was quickly deconstructed by tech experts and commentators, who suggested that compromises on AI safety may have prompted him to resign.
In his letter, Sharma said it was clear to him that it was time to move on, stating that the world is in peril, not just from AI, but from a “whole series of interconnected crises unfolding in this very moment”.
"We appear to be approaching a threshold where our wisdom must grow in equal measure to our capacity to affect the world, lest we face the consequences," wrote Sharma.
Though he did not offer specifics, Sharma stated that constant pressure had meant setting aside what mattered most to him: his values.
“I've repeatedly seen how hard it is to truly let our values govern our actions. I've seen this within myself, within the organisation, where we constantly face pressures to set aside what matters most.”
Sharma also pointed out that one of his final projects was about understanding "how AI assistants make us less human or distort our humanity", with online users speculating that Anthropic's recent push to ship products quickly may have led to compromises on model safety.
His decision came days after Anthropic rolled out Claude Opus 4.6, an upgraded model designed to boost office productivity and coding performance. The AI company has also been in talks to raise a new round of funding that could value Anthropic at $350 billion.
'Safety People Don't Fight Forever'
Reacting to the resignation, one user wrote: "The people building the guardrails and the people building the revenue targets occupy the same org chart, but they optimise for different variables. When the pressure to scale wins enough internal battles, the safety people don't fight forever. They leave and write beautifully worded letters about integrity."
Another added: "This has been the thing, most of the founding techies try to ignore flags, focusing more on marketing/sales/revenue. It could be due to the pressure of investors."
Sharma is not the first notable name in the AI safety domain to depart Anthropic. Harsh Mehta, who worked in research and development, and leading AI scientist Behnam Neyshabur said in X posts last week that they'd left Anthropic to "start something new".