- Sakana launched Fugu, an AI system coordinating multiple models via one API for complex tasks
- Fugu Ultra matched or exceeded Anthropic's Fable 5 and Mythos on key engineering and science benchmarks
- Fugu outperformed Claude Fable 5 on coding and Mythos Preview on graduate-level science multiple-choice tests
Sakana, a Japanese AI company, has launched a new AI system called Fugu that does not rely on a single model. Instead, it coordinates multiple AI models through a single API to solve complex tasks. The startup said Fugu Ultra performed on par with Anthropic's Fable 5 and Mythos Preview on key engineering, science, and reasoning benchmarks, and even exceeded Fable 5's performance on certain tasks.
Benchmark charts shared by Sakana show Fugu exceeding Anthropic's Claude Fable 5 on LiveCodeBench, an open-source benchmark that tests coding performance on regularly refreshed software problem-solving tasks (Fugu Ultra: 93.2, Fugu: 92.9, Fable 5: 89.8). The charts also show it beating the prior Claude Mythos Preview model on GPQA-D (Diamond), a test of 198 graduate-level multiple-choice questions in biology, physics, and chemistry (Fugu Ultra: 95.5, Fugu: 95.5, Mythos Preview: 94.6).
Fable 5 and Mythos 5 are Anthropic's most powerful and capable models. They were rolled back just three days after launch, when the US government asked the company to revoke access for all foreigners, citing national security concerns.
Fable 5 is built on Anthropic's underlying model Mythos, which the company previewed in April but withheld from mass release because it was deemed too powerful, amid concerns that bad actors could use it to hack critical infrastructure such as banking systems or to build bioweapons. According to the company, Mythos identified flaws in every major operating system and web browser it tested, and some of these vulnerabilities had reportedly gone undetected for decades. The company therefore launched a controlled program called Project Glasswing, sharing the model with around 50 vetted organisations, including Google, Apple, Amazon, Microsoft, and CrowdStrike, for defensive cybersecurity work.
Anthropic had released a version of Mythos with guardrails to block responses in high-risk areas like cybersecurity and biology. If someone tried to use the latest version, Fable 5, to hack into a critical system or build a bioweapon, the model would automatically fall back to the earlier, less capable Claude Opus 4.8.
According to Vals AI, which tracks AI model performance, Fable 5 ranked as the most capable publicly available AI model on its benchmark tests.
Sakana on Monday launched two versions: Fugu for coding, chat, and other everyday tasks, and Fugu Ultra for more complex work such as AI research, paper reproduction, cybersecurity analysis, and patent investigations.
The company also claimed that its tests showed Fugu models outperformed Google's Gemini 3.1 Pro, OpenAI's GPT-5.5, and Anthropic's Opus 4.8 on tasks such as automated research, mechanical design, Japanese handwriting analysis, one-shot chess, Rubik's Cube solving, and financial time-series prediction.
Tokyo-based Sakana AI was founded in 2023 by Llion Jones, a co-author of Google's foundational 2017 "Attention Is All You Need" paper, and David Ha, the former head of research at Stability AI.