Father of AlphaGo Believes AI Has Taken the Wrong Path
Introduction: A Counter-Current Manifesto from an AI Legend
David Silver, a name that may not appear in tech headlines as often as Sam Altman or Elon Musk, carries enormous weight in artificial intelligence research. As the head of DeepMind's reinforcement learning team and the driving force behind the AlphaGo project, Silver led the team whose system defeated a human world champion at Go, a historic first for AI. Now this pioneer has publicly argued that the current AI development trajectory, centered on large language models (LLMs), may have taken a wrong turn.
What makes this even more striking is that Silver is not merely offering criticism: he has founded a new company, valued in the billions of dollars, that aims to build a new class of AI systems it calls 'Superlearners.'
Core Argument: LLMs Are Not the Right Path to General Intelligence
Silver's central thesis challenges the dominant consensus in today's AI industry. In his view, large language models such as the GPT series perform impressively at text generation, conversation, and similar tasks, but they are fundamentally products of 'imitation learning': they are trained on massive volumes of human text to predict the next token. He argues that this approach has an inherent ceiling, because the model's capabilities are bounded by existing human knowledge.
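To make the phrase "predict the next token" concrete, here is a deliberately tiny sketch. It is a word-level bigram counter, not a neural network, and the three-sentence corpus and the function names are invented for illustration. Real LLMs generalize far beyond table lookups, so this shows only the shape of the training objective, not the capability of actual models.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count which token follows which in the training text."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequent follower seen in training, or None."""
    followers = counts.get(token)
    if not followers:
        return None  # the token never appeared (with a follower) in the data
    return followers.most_common(1)[0][0]

# Hypothetical toy corpus standing in for "human text".
corpus = [
    "black plays the corner",
    "white plays the corner",
    "black plays the side",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))       # most common continuation in the data
print(predict_next(model, "shoulder"))  # unseen token: nothing to imitate
```

In this toy setting the predictor can only echo continuations that humans already wrote, which is the intuition behind the "ceiling" argument. Whether that intuition carries over to large neural models is exactly what the debate is about.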
AlphaGo's history gives Silver ample reason to hold this position. The original AlphaGo did learn from human game records, but the real leap came with its successor, AlphaGo Zero, which discarded human game data entirely. Given only the rules of Go, it learned through self-play and reinforcement learning and ultimately reached a level far beyond human capability.
Silver believes this experience reveals a profound truth: truly powerful intelligence should not merely replicate what humans already know, but should be able to explore autonomously and discover new knowledge. As he sees it, the current LLM trajectory uses ever-larger models and ever more data to approach the boundaries of human knowledge, but can never break through them.
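The idea of learning from self-play alone can be sketched on a much smaller game. The example below is not AlphaGo Zero's method, which combined deep neural networks with Monte Carlo tree search. It swaps in plain tabular Q-learning on a simple take-away game (remove 1 to 3 stones, whoever takes the last stone wins). The game, the constants, and all function names are assumptions chosen for illustration.

```python
import random

MAX_TAKE = 3   # a player may remove 1 to 3 stones per turn
START = 10     # stones on the table at the start of each game

def legal_moves(stones):
    return range(1, min(MAX_TAKE, stones) + 1)

def best_move(q, stones):
    """Greedy move under the current value table."""
    return max(legal_moves(stones), key=lambda a: q[(stones, a)])

def train_self_play(games=5000, alpha=0.5, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    # One shared value table: the agent plays both sides against itself.
    q = {(s, a): 0.0 for s in range(1, START + 1) for a in legal_moves(s)}
    for _ in range(games):
        stones = START
        while stones > 0:
            if rng.random() < epsilon:
                move = rng.choice(list(legal_moves(stones)))  # explore
            else:
                move = best_move(q, stones)                   # exploit
            remaining = stones - move
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent moves next, so its best outcome is our worst.
                target = -max(q[(remaining, a)] for a in legal_moves(remaining))
            q[(stones, move)] += alpha * (target - q[(stones, move)])
            stones = remaining
    return q

q = train_self_play()
for stones in (5, 6, 7, 10):
    print(stones, "stones -> take", best_move(q, stones))
```

No human games are supplied here: the table starts at zero, and the only training signal is who wins. The known winning strategy for this game is to leave the opponent a multiple of four stones, which gives a yardstick for checking what the table has learned.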
The New Company's Vision: Building AI 'Superlearners'
Silver's newly founded company places 'Superlearners' at its conceptual core. While the company has not yet disclosed all technical details, the basic approach can be outlined from available information:
First, reinforcement learning takes priority. Unlike LLMs, which are trained on static datasets, Superlearners will use reinforcement learning as their core driving force, accumulating experience and improving their capabilities through continuous interaction with environments. This approach more closely resembles how humans and animals learn in the real world.
Second, autonomous discovery capability. Superlearners must not only solve known problems but also possess the ability to autonomously explore and discover patterns in unknown domains. Just as AlphaGo Zero discovered Go strategies that humans had never conceived of over thousands of years, Silver hopes the next generation of AI can produce breakthroughs beyond human cognition in fields such as scientific research and drug discovery.
Third, cross-domain generality. Unlike AlphaGo, which could only play Go, the goal for Superlearners is to become truly general learning systems capable of demonstrating powerful adaptive learning abilities across multiple tasks and domains.
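The company has not published its methods, so nothing below describes its actual system. As a generic textbook illustration of "learning through interaction with an environment" rather than from a fixed dataset, here is a minimal epsilon-greedy bandit agent. The payoff probabilities and all names are hypothetical.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Learn action values purely from rewards returned by the environment."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    pulls = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore a random action
        else:
            arm = max(range(n_arms), key=lambda i: estimates[i])  # exploit
        # The environment answers with a reward; the agent never sees reward_probs.
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        pulls[arm] += 1
        # Incremental average of the rewards observed for this action.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls

estimates, pulls = run_bandit([0.2, 0.5, 0.8])
best = max(range(len(estimates)), key=lambda i: estimates[i])
print("estimated payoffs:", [round(e, 2) for e in estimates])
print("preferred action:", best)
```

The agent is never shown examples of good behavior. It acts, observes the reward, and updates its estimates, which is the basic loop that the reinforcement learning approach described above builds on at a vastly larger scale.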
The company has already reached a valuation in the billions of dollars, which places it in the top tier of AI startups and reflects strong investor confidence in Silver's team and its technical direction.
Industry Analysis: A Deep Debate Over AI's Direction
Silver is not a lone voice. In recent years, a growing number of leading researchers have begun questioning the 'Scaling is all you need' LLM trajectory. Turing Award laureate Yann LeCun has repeatedly and publicly criticized current LLMs for lacking genuine world models and reasoning capabilities; Ilya Sutskever, after leaving OpenAI, also hinted that simply scaling up pre-training may be approaching a bottleneck.
However, opposing voices are equally strong. Results from OpenAI, Anthropic, Google, and other companies suggest that LLM capabilities continue to improve as model scale and data volume grow. Progress in reasoning, programming, and multimodal understanding from models such as GPT-4, Claude, and Gemini is plain to see. Supporters argue that LLMs are not simply 'imitating': in their view, the models have already exhibited emergent forms of reasoning and generalization during training.
From a broader perspective, the debate comes down to one question: what is the best path to artificial general intelligence (AGI)? Should we continue scaling along the LLM direction, hoping for more emergent capabilities? Do we need to shift to different paradigms built around reinforcement learning and world models? Or should both approaches be fused into a hybrid path?
Notably, these paths are not necessarily mutually exclusive. In fact, the most cutting-edge AI systems are already beginning to integrate multiple technologies. OpenAI's o1 and o3 series models have incorporated reinforcement learning to enhance reasoning capabilities, which to some extent validates Silver's judgment about the importance of reinforcement learning.
Outlook: A Diversified Future for AI Development
Silver's new venture sends an important signal to the AI field: the industry is moving from a single LLM narrative toward more diversified technological exploration. This is a positive development for the entire AI ecosystem, since competition among parallel technological paths often catalyzes the most groundbreaking innovations.
For China's AI industry, Silver's approach also offers a useful point of reference. While many Chinese companies remain locked in fierce competition over model parameter counts, perhaps more attention should be paid to frontier advances in reinforcement learning, autonomous exploration, and related directions. On the road to AGI, whoever finds the correct paradigm first will gain a true strategic advantage.
Years ago, AlphaGo's masterful moves stunned the world. Now Silver sets out once again, seeking to prove that the future of AI lies not in bigger models, but in smarter ways of learning. This deep transformation in AI's direction may have only just begun.
📌 Source: GogoAI News (www.gogoai.xin)
🔗 Original: https://www.gogoai.xin/article/alphago-creator-david-silver-believes-ai-took-wrong-path