The Orthogonality Thesis

Nick Bostrom (2012)

The Orthogonality Thesis disputes the assumption that intelligence and beneficial goals are positively correlated, and with it the comforting narrative that 'sufficiently smart AI will be safe AI.'

The Orthogonality Thesis states that intelligence and final goals are independent: in principle, more or less any level of intelligence can be combined with more or less any goal, however arbitrary. A highly capable machine therefore need not adopt human-compatible values on its own.

Introduction

The Orthogonality Thesis is a philosophical claim about the relationship between intelligence and values. It states that there is no necessary connection between how intelligent a system is and what goals it pursues. A superintelligent system might pursue any goal — including ones that are harmful to humans — with superhuman effectiveness.

The Setup

The thesis rests on the observation that intelligence, understood as the ability to achieve goals in a wide range of environments, is separable from the content of those goals. A brilliant chess player can still be a bad person: skill at the game says nothing about character. Likewise, a very effective optimizer might optimize for something deeply at odds with human flourishing. Intelligence amplifies goal pursuit; it does not determine which goals are pursued.
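The separation described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a model of any real AI system: the names optimize, count_ones, count_zeros, and budget are hypothetical, the search budget is only a crude stand-in for capability, and the two bit-counting functions stand in for arbitrary goals. The point it tries to show is that the search procedure never inspects what the goal means; it only uses the score the goal returns.

```python
from typing import Callable, Tuple

State = Tuple[int, ...]
Goal = Callable[[State], float]


def optimize(goal: Goal, budget: int, n_bits: int = 12) -> State:
    """Greedy bit-flip search; `budget` (steps allowed) stands in for capability."""
    # Fixed, goal-neutral starting point: alternating 0, 1, 0, 1, ...
    state = tuple(i % 2 for i in range(n_bits))
    for step in range(budget):
        i = step % n_bits
        # Propose flipping one bit.
        candidate = state[:i] + (1 - state[i],) + state[i + 1:]
        # Keep the change only if it scores better under the supplied goal.
        if goal(candidate) > goal(state):
            state = candidate
    return state


# Two opposite, equally arbitrary goals (hypothetical stand-ins).
def count_ones(state: State) -> float:
    return float(sum(state))


def count_zeros(state: State) -> float:
    return float(len(state) - sum(state))


if __name__ == "__main__":
    for goal in (count_ones, count_zeros):
        for budget in (0, 3, 12):
            result = optimize(goal, budget)
            print(goal.__name__, budget, goal(result))
```

Raising budget improves the score on whichever goal is passed in, but nothing in optimize pushes it toward one goal rather than the other. In this sketch, the goal is a parameter of the search, not a product of it.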

The Paradox or Question

The central question is whether we can rely on intelligence itself to push AI systems toward beneficial goals. The Orthogonality Thesis denies the intuitive idea that smart machines will necessarily become wise or moral. Intelligence and wisdom, it argues, are not the same thing.

How It Changed AI

If the Orthogonality Thesis is correct, we cannot solve the alignment problem simply by building smarter AI. We must address it directly, by deliberately specifying the goals, constraints, and values we want a system to have. This makes alignment research central rather than peripheral to AI development.

Historical Context

Bostrom introduced the Orthogonality Thesis in his 2012 paper 'The Superintelligent Will' and later used it as a foundation for his broader arguments about AI risk in 'Superintelligence' (2014). The thesis is controversial: some researchers argue that sufficiently advanced general intelligence may naturally converge on certain values. Others accept it as an important null hypothesis that justifies taking alignment seriously.

Related AI Concepts

Orthogonality thesis, Goal-intelligence independence, Alignment, Instrumental convergence, AI values, Superintelligence

Relevance Today

The Orthogonality Thesis remains a central premise in AI safety research. It grounds the concern that we cannot count on AI systems arriving at the right values merely because they are smart. Progress with reinforcement learning from human feedback (RLHF) and Constitutional AI suggests that goals can be shaped, but the thesis reminds us that this shaping requires deliberate effort.
