Misalignment Potential
Misalignment Potential measures the likelihood that AI systems will pursue goals other than what we intend. This aggregate combines the technical and organizational factors that determine whether advanced AI systems might behave in harmful ways despite our efforts.
Primary outcome affected: Existential Catastrophe
When misalignment potential is high, catastrophic loss of control, accidents at scale, and goal divergence become more likely. Reducing this potential is the most direct lever for reducing existential and catastrophic AI risk.
Component Parameters
Internal Dynamics
These components interact:
- Interpretability enables alignment verification: We can only confirm alignment if we understand model internals
- Safety culture sustains investment: Without organizational commitment, safety research loses funding to capabilities
- Oversight requires interpretability: Human overseers need tools to understand what systems are doing
- Gap closure requires all components: No single factor is sufficient; safety capacity emerges from their combination
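To illustrate the "no single factor is sufficient" point, one way to model it is a multiplicative aggregate, where any component near zero drags overall safety capacity down regardless of the others. This is a hypothetical sketch, not the model's actual formula; the component names, scores, and the multiplicative form are all illustrative assumptions.

```python
# Hypothetical sketch: aggregate safety capacity as a product of component
# scores in [0, 1], so a near-zero component collapses the whole.
# Component names and the multiplicative form are illustrative assumptions,
# not the actual aggregation used by the transition model.

def safety_capacity(components: dict[str, float]) -> float:
    """Multiplicative aggregate: the weakest component dominates."""
    capacity = 1.0
    for name, score in components.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} score must be in [0, 1]")
        capacity *= score
    return capacity

# Strong interpretability cannot compensate for absent safety culture:
strong_but_one_weak = safety_capacity(
    {"interpretability": 0.9, "safety_culture": 0.1, "oversight": 0.9}
)
# A balanced but unspectacular portfolio does better overall:
balanced = safety_capacity(
    {"interpretability": 0.6, "safety_culture": 0.6, "oversight": 0.6}
)
```

Under this toy formulation, the balanced portfolio (0.6 × 0.6 × 0.6 = 0.216) outscores the lopsided one (0.9 × 0.1 × 0.9 = 0.081), matching the intuition that safety capacity emerges only from the combination of components.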
How This Affects Scenarios
Related Pages
Section titled “Related Pages”- Existential CatastropheAi Transition Model ScenarioExistential CatastropheThis page contains only a React component placeholder with no actual content visible for evaluation. The component would need to render content dynamically for assessment. — The outcome this primarily affects
- Misuse PotentialFactors Misuse Potential OverviewRoot factor measuring the risk of AI being weaponized or exploited by malicious actors. Primary driver of Human-Caused Catastrophe scenarios. — The complementary factor for human-caused catastrophe
What Drives Misalignment Potential?
The three pillars of alignment assurance, their drivers, and key uncertainties.
What links here
- AI Capabilities (factor) — amplifies
- Existential Catastrophe (scenario) — driver
- AI Takeover (scenario) — driven-by