Key Cruxes
Overview
Cruxes are key uncertainties where different beliefs lead to substantially different conclusions about AI safety priorities. Identifying and tracking cruxes helps clarify what evidence would be most valuable and where reasonable people disagree.
Crux Categories
Uncertainties about unintended AI failures:
- Will advanced AI systems develop misaligned goals?
- Can we detect deceptive alignment before deployment?
- How likely is mesa-optimization in large models?
- Will AI systems seek power instrumentally?
Uncertainties about deliberate harmful use:
- How much do AI capabilities “uplift” bioweapon development?
- Will autonomous weapons proliferate faster than defenses?
- Can AI-generated disinformation be reliably detected?
Uncertainties about systemic dynamics:
- Will racing dynamics dominate lab behavior?
- Is value lock-in a serious concern on realistic timelines?
- How concentrated will AI capabilities become?
Uncertainties about knowledge and truth:
- Will AI accelerate or undermine human epistemic capacity?
- Can authentication systems keep pace with generative AI?
- Will expertise atrophy be reversible?
Uncertainties about intervention effectiveness:
- Is alignment research on track to succeed?
- Can governance keep pace with capability development?
- Will international coordination be achievable?
Using Cruxes
Cruxes help:
- Prioritize research - Focus on resolving highest-value uncertainties
- Bridge disagreements - Identify where people actually differ
- Track progress - Monitor how uncertainties resolve over time
- Inform forecasting - Structure predictions around key variables
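The functions above can be made concrete by representing each crux as a structured record: a question, its category, and a subjective probability range capturing expert disagreement. The sketch below is a minimal illustration, not an established tool; the `Crux` class, field names, and the example probability ranges are assumptions chosen for demonstration (the deceptive-alignment range loosely reflects the 5-90% spread of expert estimates noted elsewhere on this site). Sorting by range width is one simple way to surface the highest-value uncertainties for research prioritization.

```python
from dataclasses import dataclass

@dataclass
class Crux:
    """A key uncertainty whose resolution would shift AI safety priorities."""
    question: str
    category: str
    p_low: float   # lower bound of subjective probability estimates
    p_high: float  # upper bound of subjective probability estimates

    def spread(self) -> float:
        """Width of the estimate range; wider spreads mark larger disagreement."""
        return self.p_high - self.p_low

# Hypothetical entries for illustration only.
cruxes = [
    Crux("Can deceptive alignment be detected before deployment?",
         "unintended failures", 0.05, 0.90),
    Crux("Will racing dynamics dominate lab behavior?",
         "systemic dynamics", 0.40, 0.80),
]

# Prioritize research: widest disagreement first.
for crux in sorted(cruxes, key=Crux.spread, reverse=True):
    print(f"{crux.spread():.2f}  {crux.question}")
```

Tracking progress then amounts to re-estimating the bounds as evidence arrives and watching the spreads narrow over time.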