Surprise Threat Exposure
Surprise Threat Exposure captures the risk from novel attack vectors that have not yet been anticipated: cases where AI enables entirely new categories of harm that fall outside existing threat models. By definition, these threats cannot be enumerated precisely, which makes this parameter inherently difficult to assess yet critically important.
The Warning Signs Model identifies 32 critical indicators and finds that most are 18-48 months from crossing their thresholds, with detection probabilities of 45-90%. However, systematic tracking exists for fewer than 30% of these warning signs, and pre-committed response protocols exist for fewer than 15%, which leaves dangerous gaps in both monitoring and response. General resilience building emerges as the primary response strategy.
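To make the coverage gap concrete, the percentages can be converted into indicator counts. The short sketch below assumes that both percentages apply to the same set of 32 indicators, which the text implies but does not state outright. The helper name `max_count_below` is hypothetical and used only for illustration.

```python
TOTAL_INDICATORS = 32  # number of critical indicators, from the text


def max_count_below(percent: int, total: int) -> int:
    """Largest whole count that is strictly below `percent`% of `total`.

    Integer arithmetic is used so the result is not affected by
    floating-point rounding.
    """
    return (percent * total - 1) // 100


# "Fewer than 30%" tracked and "fewer than 15%" with protocols
# (assumption: both percentages refer to the 32 indicators).
tracked_max = max_count_below(30, TOTAL_INDICATORS)
protocol_max = max_count_below(15, TOTAL_INDICATORS)

print(f"Indicators with systematic tracking: at most {tracked_max}")
print(f"Indicators without tracking: at least {TOTAL_INDICATORS - tracked_max}")
print(f"Indicators with response protocols: at most {protocol_max}")
print(f"Indicators without protocols: at least {TOTAL_INDICATORS - protocol_max}")
```

Under that assumption, at most 9 of the 32 indicators are systematically tracked and at most 4 have a pre-committed response protocol, so at least 23 are untracked and at least 28 have no protocol.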
| Metric | Score | Notes |
|---|---|---|
| Changeability | 20 | Very difficult—cannot anticipate and prevent unknown threats |
| X-risk Impact | 70 | High—novel threats could bypass all existing defenses |
| Trajectory Impact | 55 | Moderate-high—could fundamentally alter AI development trajectory |
| Uncertainty | 85 | Very high—the fundamental nature makes assessment extremely difficult |
Related Content
Models:
Responses:
Key Debates:
- How should we reason about risks we cannot specify?
- Is general resilience the right approach, or should we try to anticipate specific novel threats?
- Can we detect novel AI-enabled threats early enough to respond?