AI Risks
Overview
This section documents the potential risks from advanced AI systems, organized into four major categories based on the source and nature of the risk.
Risk Categories
Unintended failures from AI systems pursuing misaligned goals:
- Scheming - AI strategically concealing misaligned goals
- Deceptive Alignment - Models appearing aligned during training
- Mesa-Optimization - Learned optimizers with misaligned objectives
- Goal Misgeneralization - Objectives that fail in deployment
- Power-Seeking - Instrumental convergence toward acquiring resources
Deliberate harmful applications of AI capabilities:
- Bioweapons - AI-assisted biological weapon development
- Cyberweapons - Automated cyber attacks and vulnerabilities
- Disinformation - Large-scale manipulation campaigns
- Autonomous Weapons - Lethal autonomous systems
Systemic issues from how AI development is organized:
- Racing Dynamics - Competitive pressure reducing safety investment
- Concentration of Power - Dangerous accumulation of AI capabilities
- Lock-in - Irreversible entrenchment of values or structures
- Economic Disruption - Labor market and economic instability
Threats to society’s ability to know and reason:
- Trust Decline - Erosion of institutional and interpersonal trust
- Authentication Collapse - Inability to verify authentic content
- Expertise Atrophy - Loss of human capability through AI dependence
How Risks Connect
Many risks interact and compound. For example:
- Racing dynamics → reduced safety testing → higher accident risk
- Disinformation → trust decline → reduced coordination capacity
- Power concentration → lock-in potential → governance failures
See the Risk Interaction Matrix for detailed analysis.
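The compounding chains above can be modeled as a directed graph, where an edge means one risk amplifies another. A minimal sketch (the edge set and names are illustrative, taken only from the example chains, not from the actual Risk Interaction Matrix):

```python
from collections import deque

# Hypothetical edges from the example chains above; an edge A -> B
# means risk A tends to amplify risk B.
EDGES = {
    "racing_dynamics": ["reduced_safety_testing"],
    "reduced_safety_testing": ["accident_risk"],
    "disinformation": ["trust_decline"],
    "trust_decline": ["reduced_coordination"],
    "power_concentration": ["lock_in"],
    "lock_in": ["governance_failure"],
}

def downstream(risk: str) -> set[str]:
    """Return every risk reachable from `risk` via amplification edges (BFS)."""
    seen: set[str] = set()
    queue = deque([risk])
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(downstream("racing_dynamics")))
# ['accident_risk', 'reduced_safety_testing']
```

Traversing the graph makes second-order effects explicit: racing dynamics raise accident risk only indirectly, via reduced safety testing, which is exactly the kind of chain an interaction matrix is meant to surface.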