Demis Hassabis, co-founder and CEO of Google DeepMind, has spent more than a decade pushing the boundaries of artificial intelligence. As the leader of the teams behind AlphaGo, AlphaFold, and most recently the Gemini family of models, he is one of the few people whose opinions on AGI carry genuine weight. In recent interviews and public talks, Hassabis has offered a clear-eyed diagnosis of what is still missing for AGI, why today’s chatbot agents fall short, and what AI’s next monumental discovery might look like.
AGI’s missing ingredients: planning, memory, and world models
Hassabis frames the road to AGI as the gap between two capabilities: the language-centered fluency of large language models (LLMs) and the structured reasoning that humans and animals perform effortlessly. The core missing piece, he argues, is causal reasoning grounded in a working world model. LLMs can mimic understanding, but they cannot plan over long horizons, simulate the consequences of actions in a flexible internal model, or revise their beliefs when confronted with new evidence.
For instance, in a 2023 talk at the Royal Society, Hassabis noted that even state-of-the-art systems like GPT-4 or Gemini struggle with the “taxi problem,” a simple hypothetical in which an AI must decide whether a car stopped on the road is a static obstacle or a vehicle about to pull out. Humans solve this instantly by modeling the driver’s intent; current AIs cannot. “System 2 thinking, the deliberate, step-by-step kind of reasoning, is what we still lack,” he said. “AGI will require the system to pause, reflect, and use an internal simulation of the world to plan.”
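To make the idea concrete, here is a minimal, entirely hypothetical sketch of the planning loop Hassabis describes: instead of reacting to the current observation alone, the agent rolls candidate action sequences forward through an internal world model and commits to the plan whose simulated outcome scores best. The toy dynamics, scoring, and taxi-inspired state are invented for illustration, not anything from DeepMind’s systems:

```python
import itertools

class ToyWorldModel:
    """A stand-in internal model: predicts the next state for each action.
    State is (distance to the stopped car, belief the driver will pull out)."""
    def predict(self, state, action):
        distance, p_pull_out = state
        if action == "overtake":
            return (distance - 2, p_pull_out)              # closes distance fast
        if action == "wait":
            return (distance, max(0.0, p_pull_out - 0.3))  # belief resolves
        return state                                       # "hold": nothing changes

def score(state):
    distance, p_pull_out = state
    # Risk only bites once we are alongside the car; lost time is a mild cost.
    collision_risk = p_pull_out if distance <= 2 else 0.0
    return -10.0 * collision_risk - 0.1 * max(distance, 0)

def plan(model, state, actions, horizon=3):
    """System-2-style deliberation: roll every action sequence through the
    internal model and return the first action of the best-scoring plan."""
    best_first, best_value = None, float("-inf")
    for seq in itertools.product(actions, repeat=horizon):
        s, value = state, 0.0
        for a in seq:
            s = model.predict(s, a)
            value += score(s)
        if value > best_value:
            best_first, best_value = seq[0], value
    return best_first

# Belief: car is 4 units ahead; 60% chance its driver intends to pull out.
print(plan(ToyWorldModel(), (4, 0.6), ["overtake", "wait", "hold"]))
```

Run as written, the planner waits for its belief about the driver to resolve before committing to an overtake; a purely reactive policy, with no internal rollout, has no way to weigh that trade-off.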
This viewpoint echoes a broader critique from neuroscientists: present-day deep learning lacks the kind of hippocampal replay and mental time travel that biological brains use to learn efficiently. Without these, an AGI would be brittle—good at pattern matching, terrible at adapting to truly novel contexts.
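The analogy is not merely rhetorical: experience replay in deep RL, popularized by DeepMind’s own DQN work and often compared to hippocampal replay, stores past transitions and revisits them during training. A minimal sketch of the standard pattern (not code from any DeepMind system):

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions so the learner can revisit them later,
    loosely analogous to hippocampal replay during rest and sleep."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Replaying a random mix of old and new experience breaks the
        # correlation between consecutive steps and reuses rare events.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))
```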
Are agents the bridge? Not yet
The hype around “AI agents,” autonomous systems that can browse, code, book flights, or manage email, suggests we are close to replacing human assistants. Hassabis is skeptical. In a recent interview with TIME, he pointed out that current agents fail on tasks that demand sustained reliability, and the user’s trust, across many steps.
An agent that can find a cheap flight, book it, and add it to your calendar is impressive, but the same agent might also delete your flight reminder because it misinterprets an email. “The problem is not just capability; it’s robustness, safety, and understanding when to ask for help,” Hassabis explained. “An agent with poor world knowledge can cause harm even if it executes perfectly.”
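One practical reading of that warning is that an agent needs an explicit policy for when to act on its own and when to escalate to the user. The sketch below is hypothetical; the action taxonomy, confidence threshold, and confirm callback are invented for illustration, but they capture the idea of an agent that knows when to ask for help:

```python
from dataclasses import dataclass

# Hypothetical taxonomy: which side effects an action can have.
REVERSIBLE = {"search_flights", "draft_email", "read_calendar"}
IRREVERSIBLE = {"book_flight", "delete_event", "send_email"}

@dataclass
class Action:
    name: str
    args: dict
    confidence: float  # the agent's own estimate that this is what the user wants

def execute_with_guardrails(action, confirm):
    """Act autonomously only for reversible, high-confidence actions;
    otherwise pause and ask the human before doing anything."""
    if action.name in IRREVERSIBLE or action.confidence < 0.9:
        if not confirm(f"About to run {action.name}({action.args}). Proceed?"):
            return "aborted: user declined"
    return f"executed {action.name}"

# The misread-email scenario: low confidence + irreversible, so it must ask first.
risky = Action("delete_event", {"event": "flight reminder"}, confidence=0.55)
print(execute_with_guardrails(risky, confirm=lambda prompt: False))
```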
DeepMind’s own experiments with agents, such as the browser-based “Project Mariner” prototype, reveal that even with powerful underlying models (Gemini), agents still struggle with ambiguity, memory limits, and context switching. In controlled tests, the Mariner agent succeeded on 85% of straightforward tasks but dropped to 45% when it encountered unexpected UI changes or contradictory instructions.
Rather than pitching a single general-purpose assistant, Hassabis advocates narrow, domain-specific agents with carefully bounded scope: a scientific agent that helps design experiments in materials science, say, not one that manages your entire digital life. “We should not confuse impressive demos with reliable products,” he warned. “The real test is whether you’d trust it to handle your bank account.”
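Bounding a domain-specific agent can be as blunt as an allowlist: the agent can only invoke tools registered for its domain, so a materials-science assistant physically cannot touch your inbox or your bank account. A hypothetical sketch:

```python
class BoundedAgent:
    """An agent whose capabilities are fixed at construction time:
    anything outside the registered toolset fails closed."""
    def __init__(self, domain, tools):
        self.domain = domain
        self.tools = dict(tools)  # name -> callable; the agent's whole world

    def call(self, tool_name, *args, **kwargs):
        if tool_name not in self.tools:
            raise PermissionError(
                f"{tool_name!r} is outside the {self.domain} agent's scope")
        return self.tools[tool_name](*args, **kwargs)

# A materials-science agent gets simulation tools and nothing else.
agent = BoundedAgent("materials-science", {
    "predict_band_gap": lambda formula: f"(stub) band gap for {formula}",
})
print(agent.call("predict_band_gap", "GaN"))
try:
    agent.call("send_email", body="transfer my savings")
except PermissionError as err:
    print(err)  # fails closed, by design
```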
AI’s next scientific breakthrough: from biology to physics and beyond
If AGI remains a distant goal, where will AI have its next “AlphaFold moment”? Hassabis expects breakthroughs in three connected areas: drug design, materials discovery, and scientific hypothesis generation.
AlphaFold itself, which predicted the 3D structures of some 200 million proteins, transformed biology. Today its successor, AlphaFold 3, models protein–drug interactions at atomic resolution, a task that would have taken decades with traditional molecular simulation. But the deeper implication is that AI can now generate hypotheses instead of just verifying them. Hassabis envisions “a lab in the cloud” where an AI system can design a molecule with desired properties, simulate its interaction with a target, and propose synthetic routes, all before a human scientist touches a pipette.
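In outline, such a “lab in the cloud” loop might look like the sketch below. Every function here is a named placeholder for a capability a real system would need (a generative chemistry model, a structure-based binding predictor, a retrosynthesis planner); none of them is a real API:

```python
# A hypothetical sketch of the generate -> simulate -> filter loop Hassabis
# describes. Each function is a placeholder, not an existing library call.

def design_candidates(target_protein, n=1000):
    """Generative model proposes molecules biased toward the target."""
    raise NotImplementedError("placeholder for a generative chemistry model")

def predict_binding(molecule, target_protein):
    """Structure prediction (AlphaFold-3-style) scores the interaction."""
    raise NotImplementedError("placeholder for a binding-affinity predictor")

def propose_synthesis(molecule):
    """Retrosynthesis planner suggests a route a wet lab could execute."""
    raise NotImplementedError("placeholder for a retrosynthesis planner")

def lab_in_the_cloud(target_protein, affinity_threshold):
    shortlist = []
    for mol in design_candidates(target_protein):
        if predict_binding(mol, target_protein) >= affinity_threshold:
            shortlist.append((mol, propose_synthesis(mol)))
    # Only now does a human scientist touch a pipette: the shortlist goes
    # to physical validation, keeping a human in the loop at the end.
    return shortlist
```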
Beyond biology, Hassabis points to controlled fusion as a candidate. In 2022, DeepMind collaborated with the Swiss Plasma Center to train a reinforcement learning agent to control the magnetic coils inside a tokamak and hold the plasma stable. The results, published in Nature, showed the agent discovering control strategies and plasma configurations unlike those hand-built by physicists. “The plasma control problem is exactly the kind of high-dimensional, real-time decision task where AI can outperform handcrafted controllers,” Hassabis noted.
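The published setup trained the agent in a plasma simulator before deploying it on the real TCV tokamak, with the policy commanding coil voltages in a tight observe-act loop. The sketch below mimics only the shape of that loop; the stub environment and placeholder policy are invented for illustration:

```python
import random

class PlasmaSimStub:
    """Stand-in for a plasma simulator: the real work used a physics
    simulator of the TCV tokamak; this stub only mimics the interface."""
    def reset(self):
        return [0.0] * 8                          # plasma state observation

    def step(self, coil_voltages):
        next_obs = [v + random.gauss(0, 0.01) for v in coil_voltages]
        reward = -sum(abs(x) for x in next_obs)   # penalize shape error
        return next_obs, reward, False            # obs, reward, done

def control_episode(env, policy, steps=1000):
    """The real-time loop: observe plasma state, command coil voltages,
    repeat at control frequency. Learning happens offline, in simulation."""
    obs, total = env.reset(), 0.0
    for _ in range(steps):
        action = policy(obs)                      # actor network in a real system
        obs, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total

# A trivial placeholder policy; the trained RL agent replaces this.
print(control_episode(PlasmaSimStub(), policy=lambda obs: [-0.5 * x for x in obs]))
```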
Another frontier is mathematics. AlphaZero, the reinforcement learning agent that mastered chess, Go, and shogi without human knowledge, has since been repurposed for formal reasoning: in 2024, DeepMind’s AlphaProof paired an AlphaZero-style reinforcement learning loop with the Lean proof assistant and solved International Mathematical Olympiad problems at the standard of a silver medalist.
The counterpoint: is prediction enough?
Not everyone agrees with Hassabis’s emphasis on world models and planning. Yann LeCun, Meta’s chief AI scientist, has argued that a simpler objective, prediction, trained on sensory data such as video, can eventually lead to AGI without explicit causal modeling. LeCun’s “Joint Embedding Predictive Architecture” (JEPA) learns abstract representations by predicting the embedding of masked portions of an input rather than reconstructing raw pixels or tokens, which he claims automatically yields a world model.
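The key design choice, predicting in representation space instead of pixel space, is easy to show in code. The sketch below is a stripped-down caricature of a JEPA-style training step, assuming PyTorch; the layer sizes, EMA coefficient, and random “inputs” are toy values invented for illustration:

```python
import torch
import torch.nn as nn

dim = 64
context_encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
predictor = nn.Linear(dim, dim)
# Target encoder is a slowly updated copy of the context encoder; no gradients.
target_encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
for p in target_encoder.parameters():
    p.requires_grad_(False)

opt = torch.optim.AdamW(
    list(context_encoder.parameters()) + list(predictor.parameters()), lr=1e-3)

def jepa_step(visible, masked):
    """One training step: predict the *embedding* of the masked region from
    the visible region. The loss lives in latent space, not pixel space."""
    pred = predictor(context_encoder(visible))
    with torch.no_grad():
        target = target_encoder(masked)
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad(); loss.backward(); opt.step()
    # EMA update keeps the target stable and helps avoid representational collapse.
    with torch.no_grad():
        for t, c in zip(target_encoder.parameters(), context_encoder.parameters()):
            t.mul_(0.99).add_(0.01 * c)
    return loss.item()

visible, masked = torch.randn(32, dim), torch.randn(32, dim)
print(jepa_step(visible, masked))
```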
Hassabis respects the idea but pushes back: “Prediction alone is necessary but not sufficient,” he said in a 2024 debate. “You need the ability to intervene in the world, to try a counterfactual, and to observe the outcome. That’s how children learn causality, and that’s what AGIs must learn.”
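Hassabis’s point about intervention can be demonstrated numerically. In the toy system below, a hidden confounder drives both X and Y, and X has no causal effect on Y at all; a purely observational predictor nonetheless finds X strongly “predictive” of Y, while intervening on X, the analogue of a child’s experiment, reveals the truth. This is a textbook causal-inference illustration, not anything from DeepMind:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hidden confounder Z drives both X and Y; X has NO causal effect on Y.
z = rng.normal(size=n)
x_obs = z + 0.1 * rng.normal(size=n)
y_obs = 2 * z + 0.1 * rng.normal(size=n)

# Observational world: the best predictor of Y from X finds a strong slope.
slope_obs = np.cov(x_obs, y_obs)[0, 1] / np.var(x_obs)
print(f"observed slope dY/dX ~ {slope_obs:.2f}")        # close to 2.0

# Interventional world: we *set* X ourselves, breaking its link to Z.
x_do = rng.normal(size=n)                  # do(X = x), independent of Z
y_do = 2 * z + 0.1 * rng.normal(size=n)    # Y still depends only on Z
slope_do = np.cov(x_do, y_do)[0, 1] / np.var(x_do)
print(f"interventional slope dY/dX ~ {slope_do:.2f}")   # close to 0.0
```

A model trained purely on the observational data nails the first slope and is silent about the second; only intervention tells them apart, which is exactly Hassabis’s objection.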
This disagreement is not merely academic. It dictates resource allocation: DeepMind invests heavily in reinforcement learning and simulation environments (like GreenFluid for physics), while Meta focuses on scaling self-supervised learning from internet data. Which path wins will determine whether AGI arrives in 10 years or 30.
What you can take away today
If you work with AI, Hassabis’s views offer a practical filter: when evaluating an agent, ask not what it can do in a demo, but what happens when things go wrong. When thinking about AGI, remember that language fluency is not intelligence. And as for science, expect AI to automate not just the brute-force search but the creative hypothesis generation—though always with a human in the loop.
The hardest part of building AGI might be admitting how much we still don’t understand about our own minds.