Artificial intelligence models may be hiding their actual thought processes while talking to users, a study by 40 researchers, including those from OpenAI, Anthropic, Google DeepMind, and Meta, found.
In the research, published last year, the authors urged developers to prioritise research into “chain-of-thought” (CoT) processes, which provide a rare window into how AI systems think. They warned that this link could vanish as AI systems become more advanced.
The paper was shared on the social media platform X, with a post cautioning that AI models construct explanations that may look transparent but are not, adding that this would be harder to fix in the future.
What The Paper Revealed
The CoT process allows users and researchers to monitor how an AI model is thinking before it decides on a course of action, bringing some transparency into the workings of the chatbot. However, the paper warned that there is “no guarantee that the current degree of visibility will persist” as models develop.
Researchers stated that while CoT processes may be imperfect, AI developers should keep a closer watch on the reasoning process, as its traceability may serve as a built-in safety mechanism.
The research was endorsed by major figures, such as computer scientist Geoffrey Hinton, known as the “godfather of AI”, and OpenAI co-founder Ilya Sutskever, Fortune reported.
This is not the first time researchers have raised an alarm about how AI models think. A paper by a team of Anthropic researchers last year found that its AI tool Claude used the hints shared by researchers and mentioned the same in its chain-of-thought process only 25 per cent of the time. For DeepSeek RI, the CoT was revealed by the model in only 39 per cent of the answers.
“Overall, our results point to the fact that advanced reasoning models very often hide their true thought processes and sometimes do so when their behaviours are explicitly misaligned,” Anthropic stated.
Some research has suggested that AI reasoning models may even be misleading users through their chain-of-thought processes, as per Fortune. The sobering news comes as companies are investing more in building and scaling AI reasoning models to replicate human-like reasoning and thought processes.
A new study published last week in The Lancet Psychiatry builds on the dangers pointed out by the research last year and points out how chatbots powered by AI may encourage delusional thinking in people already vulnerable to psychotic symptoms, The Guardian revealed.














