Browse by Tag
Entities in our knowledge base are tagged by topic. Click on any tag to see all related entities.
Tag Cloud
governance (45), ai-safety (23), x-risk (23), interpretability (17), international (13), regulation (13), safety (13),
ai-transition-model (12), alignment (12), deception (12), deepfakes (12), epistemic (12), evaluations (12), coordination (11), rlhf (11), disinformation (10),
capabilities (9), debate (9), scalable-oversight (9), ai-control (8), compute-governance (8), institutions (8), red-teaming (8), scenario (8), structural-risks (8), trust (8),
autonomy (7), corrigibility (7), factor (7), forecasting (7), game-theory (7), inner-alignment (7), policy (7), risk-factor (7), scaling (7), superintelligence (7),
automation (6), constitutional-ai (6), cybersecurity (6), responsible-scaling (6), structural (6), systems-thinking (6), timeline (6), verification (6),
agent-foundations (5), authoritarianism (5), competition (5), decision-making (5), democracy (5), diffusion (5), existential-risk (5), frontier-ai (5), human-ai-interaction (5), instrumental-convergence (5), irreversibility (5), lock-in (5), long-term (5), manipulation (5), misuse (5), probability (5), racing-dynamics (5), resilience (5), robustness (5), situational-awareness (5),
agi (4), ai-timelines (4), argument (4), authentication (4), compute-thresholds (4), dangerous-capabilities (4), deep-learning (4), feedback-loops (4), field-building (4), inequality (4), monitoring (4), prioritization (4), risk-assessment (4), scheming (4), security (4), strategic-deception (4), surveillance (4), technical (4),
ai-misuse (3), autonomous-weapons (3), benchmarks (3), biosecurity (3), bioweapons (3), claude (3), cloud-computing (3), decision-theory (3), decomposition (3), defense (3), deployment (3), dual-use (3), effective-altruism (3), geopolitics (3), gpt (3), human-factors (3), incentives (3), information-environment (3)
(652 tags in total; only the most-used are shown above.)
All Tags (Alphabetical)
governance (45), ai-safety (23), x-risk (23), interpretability (17), international (13), regulation (13), safety (13),
ai-transition-model (12), alignment (12), deception (12), deepfakes (12), epistemic (12), evaluations (12), coordination (11), rlhf (11), disinformation (10),
capabilities (9), debate (9), scalable-oversight (9), ai-control (8), compute-governance (8), institutions (8), red-teaming (8), scenario (8), structural-risks (8), trust (8),
autonomy (7), corrigibility (7), factor (7), forecasting (7), game-theory (7), inner-alignment (7), policy (7), risk-factor (7), scaling (7), superintelligence (7),
automation (6), constitutional-ai (6), cybersecurity (6), responsible-scaling (6), structural (6), systems-thinking (6), timeline (6), verification (6),
agent-foundations (5), authoritarianism (5), competition (5), decision-making (5), democracy (5), diffusion (5), existential-risk (5), frontier-ai (5), human-ai-interaction (5), instrumental-convergence (5), irreversibility (5), lock-in (5), long-term (5), manipulation (5), misuse (5), probability (5), racing-dynamics (5), resilience (5), robustness (5), situational-awareness (5),
agi (4), ai-timelines (4), argument (4), authentication (4), compute-thresholds (4), dangerous-capabilities (4), deep-learning (4), feedback-loops (4), field-building (4), inequality (4), monitoring (4), prioritization (4), risk-assessment (4), scheming (4), security (4), strategic-deception (4), surveillance (4), technical (4),
ai-misuse (3), autonomous-weapons (3), benchmarks (3), biosecurity (3), bioweapons (3), claude (3), cloud-computing (3), decision-theory (3), decomposition (3), defense (3), deployment (3), dual-use (3), effective-altruism (3), geopolitics (3), gpt (3), human-factors (3), incentives (3), information-environment (3), intelligence-explosion (3), nick-bostrom (3), open-source (3), orthogonality-thesis (3), path-dependence (3), polarization (3), power-seeking (3), public-opinion (3), research (3), reward-modeling (3), risk-interactions (3), self-preservation (3), shutdown-problem (3), strategy (3), thresholds (3),
academic-ai-safety (2), adversarial-robustness (2), adversarial-testing (2), agentic (2), ai-ethics (2), ai-safety-summits (2), algorithmic-accountability (2), alignment-research (2), alphafold (2), alphago (2), anthropic (2), antitrust (2), arms-race (2), automation-bias (2), autonomous-replication (2), bletchley-declaration (2), capability (2), capability-evaluation (2), capability-generalization (2), cascade (2), cascades (2), catastrophe (2), cbrn (2), cev (2), chatgpt (2), cognitive-bias (2), collective-action (2), collective-intelligence (2), computer-use (2), concentration (2), concrete-problems (2), control-problem (2), critical-infrastructure (2), digital-evidence (2), digital-rights (2), distribution-shift (2), echo-chambers (2), economics (2), eleutherai (2), eliezer-yudkowsky (2), elk (2), epistemics (2), equilibrium (2), export-controls (2), foundation-models (2), funding (2), generalization (2), goal-stability (2), gpt-4 (2), human-agency (2), human-compatible-ai (2), human-feedback (2), identity (2), ij-good (2), information-infrastructure (2), information-warfare (2), infrastructure (2), instrumental-goals (2), inverse-reinforcement-learning (2), know-your-customer (2), labor (2), labs (2), learned-optimization (2), lesswrong (2), liability (2), longtermism (2), market-concentration (2), market-dynamics (2), mechanism-design (2), media (2), media-literacy (2), mental-health (2), mesa-optimization (2), miri (2), misalignment (2), ml-safety (2), model-organisms (2), network-effects (2), networks (2), openai (2), optimal-policies (2), optimistic (2), out-of-distribution (2), outcome (2), outer-alignment (2)
(652 tags in total; the remaining tags are not shown here.)