On AI
My views on AI, alignment, and the narrowing capability gap
- posted: 2023-03-24
- updated: 2025-05-09
- status: in progress
- confidence: average
Let me start with a confession: I've been wrong about AI progress. Not catastrophically wrong, but wrong in ways that matter. Five years ago, I believed we had decades before confronting the thorniest questions about artificial intelligence. Now I'm watching those questions materialize in real-time, as language models become increasingly capable reality-generators that millions interact with daily.
Many debates around AI fall into predictable camps: techno-optimists promising utopia versus doomsayers forecasting extinction. Both positions feel like emotional commitments more than rational conclusions. I want to sketch a different perspective – one that acknowledges both the transformative potential of AI and the legitimate concerns about its development trajectory.
First, let's dispense with some common fallacies.
The anthropomorphism fallacy leads us to attribute human-like qualities to AI systems prematurely. When ChatGPT produces a compelling narrative about consciousness, we instinctively perceive a mind behind those words. But what we're witnessing is statistical pattern-matching operating at immense scale[^1]. This doesn't mean consciousness or understanding will never emerge – but we should be careful about projecting our expectations onto systems that function fundamentally differently from human brains.
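To make "statistical pattern-matching" concrete, here is a toy sketch of my own (nothing like a production model): it predicts the next word purely from bigram counts in a tiny corpus. The footnote below describes transformer-based models doing a vastly richer version of the same next-token prediction, over far longer contexts and learned representations.

```python
# Toy illustration: next-token prediction as pattern-matching over counts.
# Large language models do this with billions of learned parameters;
# here we use nothing but raw bigram frequencies.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each token follows each preceding token.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the most frequent continuation observed in the 'training data'."""
    followers = bigram_counts.get(token)
    return followers.most_common(1)[0][0] if followers else "<unk>"

print(predict_next("the"))  # -> 'cat' (ties broken by first occurrence)
print(predict_next("sat"))  # -> 'on'
```

The point of the sketch is only that fluent-looking continuations can fall out of frequency statistics; no claim about minds is needed to explain them.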
Conversely, the mere-pattern-matching fallacy underestimates what statistical learning can achieve. "It's just matching patterns" has become the rallying cry of AI skeptics. But this fundamentally misunderstands how capabilities emerge from large-scale learning. Most human cognition involves pattern recognition too – we've simply elevated certain patterns (those we recognize in ourselves) to the status of "real intelligence" while dismissing others[^2].
What concerns me isn't whether AI will become "conscious" in some philosophical sense, but rather what happens when systems become capable enough to shape the world without necessarily sharing human values or understanding consequences in human terms.
The alignment problem – ensuring AI systems robustly pursue goals aligned with human flourishing – becomes increasingly urgent as capabilities advance. We've already seen troubling examples of hallucination, bias, and manipulation in existing systems. These aren't just technical glitches but manifestations of fundamental challenges in creating systems that reliably understand and respect human values.
Perhaps most concerning is how AI development has accelerated while governance frameworks remain embryonic. We're running a global experiment without adequate safeguards or even agreement on what success looks like. The economic incentives driving AI development don't naturally align with safety considerations[^3]. Profit-seeking behavior rewards rapid deployment over cautious development – a dynamic that makes thoughtful governance both essential and difficult to implement.
That said, there are reasons for measured optimism.
Leading AI labs have increasingly prioritized safety research, with significant resources dedicated to alignment work. Techniques like constitutional AI and RLHF represent meaningful progress in steering systems toward human values. The field is gradually developing better evaluation methods and safety benchmarks.
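For readers unfamiliar with RLHF, the heart of its first stage is easy to state: train a reward model to score human-preferred responses above rejected ones, then fine-tune the language model against that reward. The sketch below shows the standard Bradley-Terry style preference loss; it is my own minimal illustration under those textbook assumptions, and it omits everything about how real reward models and policies are parameterized or optimized.

```python
# Minimal sketch of the preference-modeling objective behind RLHF.
# reward_chosen / reward_rejected stand in for a learned reward model's
# scores on a human-labeled pair of responses.
import numpy as np

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood, under a Bradley-Terry model, that the
    human-preferred response outscores the rejected one."""
    margin = reward_chosen - reward_rejected
    return float(-np.log(1.0 / (1.0 + np.exp(-margin))))

# The reward model is trained to push this loss down on labeled pairs;
# the language model is then fine-tuned to score well under that reward.
print(preference_loss(2.0, 0.5))  # ~0.20: scores agree with the human label
print(preference_loss(0.5, 2.0))  # ~1.70: scores disagree, large penalty
```

What the sketch makes visible is that "steering toward human values" here means optimizing against a learned proxy for human judgment, which is exactly why evaluation and robustness work matter so much.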
We're also seeing unprecedented collaboration between industry, academia, and government on AI governance. The pace of policy development, while still lagging technical progress, has accelerated dramatically. International coordination, though fragmented, is evolving faster than for previous technological revolutions.
Most importantly, there's growing recognition that AI safety isn't separate from AI capability – it's an integral component. Systems that misunderstand human intent, generate harmful content, or behave unpredictably aren't just unsafe; they're fundamentally inadequate at their intended functions.
Where does this leave us? I believe we're entering a critical decade where key technical and governance decisions will shape AI's long-term trajectory. We need several developments in parallel:
- Technical research that makes AI systems more transparent, interpretable, and aligned with human values
- Robust evaluation frameworks that stress-test systems before deployment
- Governance structures that balance innovation with appropriate safeguards
- Broader societal conversation about what we want from AI development
The path forward involves neither unbridled technological acceleration nor innovation-stifling regulation. It requires thoughtful navigation between these extremes – promoting beneficial applications while establishing guardrails against worst outcomes.
This isn't merely a technical challenge but a civilizational one. How we develop AI will reveal much about our collective wisdom, foresight, and capacity for cooperation. The stakes justify serious investment in getting this right, even if existential risk scenarios remain speculative.
I remain cautiously optimistic that we can develop AI in ways that augment human capability, expand opportunity, and address global challenges. But this outcome isn't inevitable – it requires deliberate effort to ensure technical progress advances human flourishing rather than undermining it.
As individuals, we can contribute by thinking clearly about these issues, supporting responsible development practices, and engaging in the broader conversation about AI governance. Our collective choices in this critical period will reverberate for generations.
[^1]: This distinction becomes clearer when examining how large language models actually function. They predict the next token based on statistical patterns in their training data, without any internal model of "meaning" as humans understand it. See Bender & Koller's "Climbing towards NLU" for a detailed examination.

[^2]: Hofstadter explores this cognitive bias extensively in his work on intelligence and pattern recognition. We tend to dignify our own pattern-matching with terms like "intuition" or "understanding" while dismissing similar capabilities in machines.

[^3]: The economic dynamics driving AI development create what game theorists call a "race to the bottom" – competitive pressure incentivizes cutting corners on safety. See Askell et al. for analysis of coordination problems in AI development.