Existential Catastrophe

Model Role: Ultimate Outcome
Primary Drivers: Misalignment Potential, Misuse Potential
Risk Character: Tail risk, irreversible

Existential Catastrophe measures the probability and potential severity of catastrophic AI-related events. This is about the tail risks—the scenarios we most urgently want to avoid because they could cause irreversible harm at civilizational scale.

Unlike Transition Smoothness (which concerns the journey) or Steady State Quality (which concerns the destination), Existential Catastrophe is about whether we avoid catastrophe at all. A world with high catastrophic risk might still navigate a smooth transition to a good steady state, or it might never make it there.


| Dimension | Description | Key Parameters |
| --- | --- | --- |
| Loss of Control | AI systems pursuing goals misaligned with humanity; inability to correct or shut down advanced systems | Alignment Robustness, Human Oversight Quality |
| Misuse Catastrophe | Deliberate weaponization of AI for mass harm: bioweapons, autonomous weapons, critical infrastructure attacks | Biological Threat Exposure, Cyber Threat Exposure |
| Accident at Scale | Unintended large-scale harms from deployed systems; cascading failures across interconnected AI | Safety-Capability Gap, Safety Culture Strength |
| Lock-in Risk | Irreversible commitment to bad values, goals, or power structures | AI Control Concentration, Institutional Quality |
| Concentration Catastrophe | Single actor gains decisive AI advantage and uses it harmfully | AI Control Concentration, Racing Intensity |

What Contributes to Existential Catastrophe

[Diagram: contributions to Existential Catastrophe]

  • AI Takeover (90): Misaligned takeover is a direct path to extinction/catastrophe
  • Human-Caused Catastrophe (70): Some human-caused catastrophes could be existential
  • Long-term Lock-in (40): Lock-in could prevent or cause extinction depending on what's locked
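
Read literally, the numbers above act as contribution weights from each scenario to this outcome. As a rough sketch only (the weighted-average form, the function name, and the 0-1 severity levels are assumptions, not the model behind the diagram), such weights might combine like this:

```python
# Illustrative sketch: combining diagram-style contribution weights into a
# single 0-100 outcome score. The weighted average below is an assumption,
# not the actual model behind the diagram.

def combine_contributions(contributions: dict[str, tuple[float, float]]) -> float:
    """contributions maps a contributor name to (weight, level), where
    weight is the 0-100 edge weight from the diagram and level is a
    hypothetical 0-1 estimate of how severe that contributor looks."""
    total_weight = sum(w for w, _ in contributions.values())
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(w * level for w, level in contributions.values())
    return 100 * weighted_sum / total_weight


# Example with the weights shown above and made-up severity levels.
print(combine_contributions({
    "AI Takeover": (90, 0.2),
    "Human-Caused Catastrophe": (70, 0.3),
    "Long-term Lock-in": (40, 0.1),
}))  # -> 21.5
```
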
| Aggregate | Relationship | Mechanism |
| --- | --- | --- |
| Misalignment Potential | ↓↓↓ Decreases risk | Aligned, interpretable, overseen systems are less likely to cause catastrophe |
| Misuse Potential | ↑↑↑ Increases risk | Higher bio/cyber exposure, concentration, and racing all elevate existential catastrophe |
| Civilizational Competence | ↓↓ Decreases risk | Effective governance can slow racing, enforce safety standards, coordinate responses |

| Parameter | Effect | Strength |
| --- | --- | --- |
| Alignment Robustness | ↓ Reduces | ↓↓↓ Critical |
| Safety-Capability Gap | ↑ Increases | ↑↑↑ Critical |
| Racing Intensity | ↑ Increases | ↑↑↑ Strong |
| Human Oversight Quality | ↓ Reduces | ↓↓ Strong |
| Interpretability Coverage | ↓ Reduces | ↓↓ Strong |
| AI Control Concentration | ↑/↓ Depends | ↑↑ Context-dependent |
| Biological Threat Exposure | ↑ Increases | ↑↑ Direct |
| Cyber Threat Exposure | ↑ Increases | ↑↑ Direct |
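
As a rough illustration of how these directions and strengths could feed a single risk index, here is a minimal sketch. The signs and weights simply transcribe the table; the linear aggregation, the 0-1 parameter levels, and the names are assumptions rather than the project's actual formula.

```python
# Illustrative sketch: folding the parameter table's directions and strengths
# into one 0-1 risk index. Signs and weights transcribe the table; the linear
# aggregation itself is an assumption, not the project's actual model.

# (sign, weight): sign +1 means a higher parameter value raises risk, -1 means
# it lowers risk, 0 means context-dependent; weight counts the table's arrows.
PARAMETER_EFFECTS = {
    "Alignment Robustness":       (-1, 3),  # Critical
    "Safety-Capability Gap":      (+1, 3),  # Critical
    "Racing Intensity":           (+1, 3),  # Strong
    "Human Oversight Quality":    (-1, 2),  # Strong
    "Interpretability Coverage":  (-1, 2),  # Strong
    "AI Control Concentration":   (0, 2),   # Context-dependent
    "Biological Threat Exposure": (+1, 2),  # Direct
    "Cyber Threat Exposure":      (+1, 2),  # Direct
}


def risk_index(levels: dict[str, float]) -> float:
    """Map parameter levels (each 0-1, higher = more of that parameter)
    to a 0-1 risk index via a signed, weighted average."""
    score = 0.0
    max_score = 0.0
    for name, (sign, weight) in PARAMETER_EFFECTS.items():
        level = levels.get(name, 0.5)  # treat missing parameters as neutral
        if sign > 0:        # risk-increasing: contributes its level directly
            score += weight * level
        elif sign < 0:      # risk-reducing: contributes what is missing of it
            score += weight * (1 - level)
        else:               # context-dependent: skipped in this simple sketch
            continue
        max_score += weight
    return score / max_score if max_score else 0.0


# Example: strong alignment and oversight, middling racing, defaults elsewhere.
print(round(risk_index({
    "Alignment Robustness": 0.8,
    "Human Oversight Quality": 0.7,
    "Racing Intensity": 0.5,
}), 2))  # -> 0.42
```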

Existential catastrophe is the most time-sensitive outcome dimension:

  • Irreversibility: Many catastrophic scenarios cannot be undone
  • Path dependence: An existential catastrophe can foreclose good steady states entirely
  • Limited recovery: Unlike transition disruption, catastrophe may preclude recovery
  • Urgency: Near-term capability advances increase near-term existential risk

This is why much AI safety work focuses on existential catastrophe reduction—it’s the outcome where failure is most permanent.