Misuse Risk Cruxes
Overview
Misuse risk cruxes are the fundamental uncertainties that shape how policymakers, researchers, and organizations prioritize AI safety responses. These 13 cruxes determine whether AI provides meaningful “uplift” to malicious actors (expert opinion splits roughly 30-45% expecting significant uplift vs. 35-45% expecting only modest uplift), whether AI will favor offensive or defensive capabilities across security domains, and how effective various mitigation strategies can be.
Current evidence remains mixed across domains. The RAND biological uplift study (January 2024) tested 15 red teams with and without LLM access, finding no statistically significant difference in bioweapon attack plan viability. However, RAND’s subsequent Global Risk Index for AI-enabled Biological Tools (2024) evaluated 57 state-of-the-art tools and indexed 13 as “Red” (action required), with one tool reaching the highest level of critical misuse-relevant capabilities. Meanwhile, CNAS analyses and Georgetown CSET research emphasize that rapid capability improvements require ongoing reassessment.
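To make the RAND finding concrete, the sketch below runs a standard two-sample test of the kind used to compare red teams with and without LLM access. All numbers (scores, group sizes, the 0.05 threshold) are invented for illustration; this is not RAND’s data or exact method.

```python
# Hypothetical illustration of a red-team uplift comparison.
# Scores and group sizes are invented; they are NOT the RAND study's data.
from scipy import stats

# Viability scores (0-10) for attack plans drafted with vs. without LLM access.
with_llm = [4.1, 3.8, 5.0, 4.4, 3.9, 4.7, 4.2]
without_llm = [3.9, 4.0, 4.6, 4.3, 3.7, 4.5, 4.1]

# Mann-Whitney U suits small samples with no normality assumption.
stat, p_value = stats.mannwhitneyu(with_llm, without_llm, alternative="two-sided")
print(f"U={stat:.1f}, p={p_value:.3f}")
if p_value >= 0.05:
    print("Fail to reject the null: no detectable uplift at alpha=0.05")
```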
In cybersecurity, OpenAI’s threat assessment (December 2025) notes that AI cyber capabilities improved from 27% to 76% on capture-the-flag benchmarks between August and November 2025, with 50% of critical infrastructure organizations reporting AI-powered attacks in the past year. The volume of deepfake files grew from roughly 500,000 in 2023 to a projected 8 million in 2025, with businesses losing an average of $100,000 per deepfake-related fraud incident.
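For scale, the growth rate implied by those deepfake figures follows directly from the two endpoints (treating both as point estimates, which is an assumption of this back-of-envelope check):

```python
# Implied compound annual growth: 500k files (2023) to a projected 8M (2025).
start, end, years = 500_000, 8_000_000, 2
g = (end / start) ** (1 / years)
print(f"~{g:.0f}x per year (about {g - 1:.0%} annual growth)")  # ~4x, ~300%
```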
The stakes are substantial: if AI provides significant capability uplift to malicious actors, urgent restrictions on model access and compute governance become critical. If defenses can keep pace with offensive capabilities, investment priorities shift toward detection and response systems rather than prevention.
Risk Assessment Framework
| Risk Category | Severity Assessment | Timeline | Current Trend | Key Uncertainty |
|---|---|---|---|---|
| Bioweapons Uplift | High (if real) | 2-5 years | Mixed evidence | Wet-lab bottlenecks vs information barriers |
| Cyber Capability Enhancement | Medium-High | 1-3 years | Gradual increase | Commodity vs sophisticated attack gap |
| Autonomous Weapons | High | Ongoing | Accelerating | International cooperation effectiveness |
| Mass Disinformation | Medium-High | Current | Detection losing ground | Authentication adoption rates |
| Surveillance Authoritarianism | Medium | Ongoing | Expanding deployment | Democratic resilience factors |
| Chemical Weapons | Medium | 3-7 years | Early evidence | Synthesis barrier strength |
| Infrastructure Disruption | High | 1-4 years | Escalating complexity | Critical system vulnerabilities |
Source: Synthesis of expert assessments from CNAS, RAND Corporation, Georgetown CSET, and AI safety research organizations
Quantified Evidence Summary (2024-2025)
| Domain | Key Metric | Value | Source | Year |
|---|---|---|---|---|
| Bioweapons | Red teams with/without LLM access | No statistically significant difference | RAND Red-Team Study | 2024 |
| Bioweapons | AI bio-tools indexed as “Red” (high-risk) | 13 of 57 evaluated | RAND Global Risk Index | 2024 |
| Bioweapons | OpenAI o3 virology ranking | 94th percentile among expert virologists | Virology Capabilities Test | 2025 |
| Cyber | CTF benchmark improvement (GPT-5 to 5.1) | 27% to 76% | OpenAI Threat Assessment | 2025 |
| Cyber | Critical infrastructure AI attacks | 50% faced attack in past year | Microsoft Digital Defense Report | 2025 |
| Deepfakes | Content volume growth | 500K (2023) to 8M (2025) | Deepstrike Research | 2025 |
| Deepfakes | Avg. business loss per incident | ~$100,000 | Deloitte Financial Services | 2024 |
| Deepfakes | Fraud incidents involving deepfakes | >6% of all fraud | European Parliament Research | 2025 |
| Deepfakes | Human detection accuracy (video) | 24.5% | Academic studies | 2024 |
| Deepfakes | Tool detection accuracy | ~75% | UNESCO Report | 2024 |
| Disinformation | Political deepfakes documented | 82 cases in 38 countries | Academic research | 2024 |
| Fraud | Projected GenAI fraud losses (US) | $12.3B (2023) to $40B (2027) | Deloitte Forecast | 2024 |
Capability and Uplift Cruxes
How much do AI systems lower barriers for dangerous capabilities?
Whether AI provides meaningful 'uplift' for malicious actors beyond what's already available through internet search, scientific literature, and existing tools.
Key Positions
Would Update On
- Rigorous red-team studies with real capability measurement
- Evidence of AI-enabled attacks in the wild
- Studies comparing AI-assisted vs non-AI-assisted malicious actors
- Domain-specific uplift assessments (bio, cyber, chemical)
Key Evidence on AI Capability Uplift
| Domain | Evidence For Uplift | Evidence Against Uplift | Quantified Finding | Current Assessment |
|---|---|---|---|---|
| Bioweapons | Kevin Esvelt warnings; OpenAI o3 at 94th percentile virology; 13/57 bio-tools at “Red” risk level | RAND study: no statistically significant difference in attack plan viability with/without LLMs | Wet-lab skills remain bottleneck; information uplift contested | Contested; monitoring escalating |
| Cyberweapons | CTF scores improved 27% to 76% (Aug-Nov 2025); 50% of critical infra faced AI attacks | High-impact attacks still require sophisticated skills and physical access | Microsoft 2025: nation-states using AI for lateral movement, vuln discovery | Moderate-to-significant uplift demonstrated |
| Chemical Weapons | Literature synthesis, reaction optimization | Physical synthesis and materials access remain bottleneck | Limited empirical studies; lower priority than bio | Limited evidence; lower concern |
| Disinformation | 8M deepfakes projected (2025); 1,740% fraud increase (N. America); voice phishing up 442% | Detection tools at ~75% accuracy; authentication standards emerging | Human detection only 24.5% for video deepfakes | Significant uplift clearly demonstrated |
| Surveillance | Enhanced facial recognition, behavioral analysis; PLA using AI for 10,000 scenarios in 48 seconds | Privacy protection tech advancing; democratic resilience | Freedom House: expanding global deployment | Clear uplift for monitoring |
Does AI meaningfully increase bioweapons risk?
Whether AI-assisted bioweapons development poses significantly higher risk than traditional paths to bioweapons.
Key Positions
Would Update On
- Evidence of AI being used in bio attacks
- Comprehensive wet-lab bottleneck analysis
- Improvement in AI Biological Design Tools
- DNA synthesis screening effectiveness data
Does AI meaningfully increase cyber attack capability?
Whether AI significantly enhances offensive cyber capabilities for individual attackers or small groups.
Key Positions
Would Update On
- AI-generated exploits being used in the wild
- Evidence on AI use in state-sponsored cyber operations
- AI vulnerability discovery capabilities
- Red team assessments of AI cyber capabilities
Offense vs Defense Balance
Cyber Domain Assessment
| Capability | Offensive Potential | Defensive Potential | Current Balance | Trend | Evidence |
|---|---|---|---|---|---|
| Vulnerability Discovery | High - CTF scores 27%->76% (3 months) | Medium - AI-assisted patching | Favors offense | Accelerating | OpenAI 2025 |
| Social Engineering | Very High - voice phishing up 442% | Low - human factor remains | Strongly favors offense | Widening gap | 49% of businesses report deepfake fraud |
| Incident Response | Low | High - automated threat hunting | Favors defense | Strengthening | $1B+ annual AI cybersecurity investment |
| Malware Development | Medium - autonomous malware adapting in real-time | High - behavioral detection | Roughly balanced | Evolving | Microsoft 2025 DDR |
| Attribution | Medium - obfuscation tools | High - pattern analysis | Favors defense | Improving | State actors experimenting (CN, RU, IR, NK) |
The cyber landscape is evolving rapidly. According to Microsoft’s 2025 Digital Defense Report, adversaries are increasingly using generative AI for scaling social engineering, automating lateral movement, discovering vulnerabilities, and evading security controls. Chinese, Russian, Iranian, and North Korean cyber actors are already integrating AI to enhance their operations.
Source: CyberSeek workforce data, MITRE ATT&CK framework, and OpenAI threat assessment
Will AI favor offense or defense in security domains?
Whether AI will primarily benefit attackers or defenders across security domains (cyber, bio, physical).
Key Positions
Would Update On
- Evidence from AI deployment in cybersecurity
- Domain-specific offense/defense analysis
- Historical analysis of technology and offense/defense balance
- Real-world outcomes of AI-enabled attacks vs defenses
Can AI-powered detection match AI-powered disinformation generation?
Whether AI systems for detecting synthetic content and disinformation can keep pace with AI generation capabilities.
Key Positions
Would Update On
- Advances in deepfake detection that generalize
- Real-world detection accuracy over time
- Theoretical analysis of detection vs generation
- Adversarial testing results
Deepfake and Disinformation Metrics (2024-2025)
| Metric | Value | Trend | Source |
|---|---|---|---|
| Deepfake video growth | 550% increase (2019-2024); 95,820 videos (2023) | Accelerating | Deepstrike 2025 |
| Projected synthetic content | 90% of online content by 2026 (Europol estimate) | Projection | European Parliament |
| Human detection accuracy (video) | 24.5% | Asymmetrically low | Academic studies |
| Human detection accuracy (images) | 62% | Moderate | Academic studies |
| Tool detection accuracy | ~75% | Arms race dynamic | UNESCO |
| Adults confident in own detection ability | 9% | Public awareness gap | Surveys |
| Political deepfakes documented | 82 cases across 38 countries (mid-2023 to mid-2024) | Increasing | Academic research |
| North America fraud increase | 1,740% | Dramatic acceleration | Industry reports |
| Voice phishing increase | 442% (late 2024) | Driven by voice cloning | ZeroThreat |
The detection gap is widening: while deepfake generation has become dramatically easier, human ability to detect synthetic content remains critically low. Only 0.1% of participants across modalities could reliably spot fakes in mixed tests, according to UNESCO research. This asymmetry strongly supports investing in provenance-based authentication systems like C2PA rather than relying on detection alone.
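Base rates make the case even stronger than the raw accuracy numbers suggest. A quick Bayes calculation (the 2% prevalence is an illustrative assumption) shows that even a symmetric 75%-accurate detector flags mostly authentic content:

```python
# P(deepfake | flagged) for a detector with 75% sensitivity and specificity,
# assuming an illustrative 2% prevalence of synthetic content in the feed.
sens, spec, prev = 0.75, 0.75, 0.02
p_flagged = sens * prev + (1 - spec) * (1 - prev)
posterior = sens * prev / p_flagged
print(f"P(deepfake | flagged) = {posterior:.1%}")  # ~5.8%: most flags are false alarms
```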
Mitigation Effectiveness
Model Restriction Approaches
| Restriction Type | Implementation Difficulty | Circumvention Difficulty | Effectiveness Assessment | Current Deployment |
|---|---|---|---|---|
| Training-time Safety | Medium | High | Moderate - affects base capabilities | Constitutional AI |
| Output Filtering | Low | Low | Low - easily bypassed | Most commercial APIs |
| Fine-tuning Prevention | High | Medium | High - but open-weight releases bypass it | Limited implementation |
| Access Controls | Medium | Medium | Moderate - depends on enforcement | OpenAI terms |
| Weight Security | High | High | Very High - if enforceable | Early development |
Source: Analysis of current AI lab practices and jailbreak research
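To see why output filtering rates “Low” on circumvention difficulty above, consider its simplest form: a post-hoc pattern match. The sketch below (hypothetical blocklist and function names; production systems use learned classifiers but face the same paraphrase problem) blocks an exact phrase while a trivial rewording passes.

```python
# Minimal post-hoc output filter; blocklist and names are hypothetical.
BLOCKLIST = ["step-by-step synthesis route"]

def filter_output(text: str) -> str:
    """Return the text unless it contains a blocked phrase."""
    if any(phrase in text.lower() for phrase in BLOCKLIST):
        return "[blocked]"
    return text

print(filter_output("Here is a step-by-step synthesis route ..."))  # [blocked]
print(filter_output("Here is a stepwise route for synthesis ..."))  # passes unchanged
```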
Can AI model restrictions meaningfully reduce misuse?
Whether training-time safety measures, output filters, and terms of service can prevent determined misuse of AI systems.
Key Positions
Would Update On
- Evidence on jailbreak prevalence and sophistication
- Success of restriction improvements
- Open model availability and capability trends
- Evidence of restrictions preventing real attacks
Should powerful AI models be open-sourced?
Whether the benefits of open AI (research, democratization, competition) outweigh misuse risks.
Key Positions
Would Update On
- Evidence of open model misuse in serious attacks
- Research enabling from open models vs closed
- Capability comparisons: open vs closed frontier
- Security of closed model weights
Can compute governance effectively limit dangerous AI development?
Whether controlling access to AI training compute can prevent dangerous capabilities from reaching bad actors.
Key Positions
Would Update On
- Effectiveness of chip export controls
- Development of compute monitoring technologies
- Algorithmic efficiency gains reducing compute requirements
- International coordination on compute governance
Will content authentication standards achieve adoption?
Whether provenance standards like C2PA will be adopted widely enough to create a trusted content ecosystem (a minimal sketch of what credential verification involves follows the list below).
Key Positions
Would Update On
- Major platform (Meta, TikTok, X) full adoption
- Camera manufacturer widespread integration
- Evidence users value/check credentials
- Authentication system compromises or gaming
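For concreteness on what “checking credentials” means, here is a minimal provenance check in the spirit of C2PA: hash the content, bind the hash into a signed manifest, and verify both on receipt. The real standard uses certificate-backed asymmetric signatures and a much richer manifest format; the HMAC and plain dict here are stand-ins for illustration only.

```python
import hashlib
import hmac

SIGNING_KEY = b"issuer-secret"  # stand-in for an issuer's private signing key

def make_manifest(content: bytes) -> dict:
    """Bind a content hash into a signed manifest (HMAC stands in for real signatures)."""
    digest = hashlib.sha256(content).hexdigest()
    tag = hmac.new(SIGNING_KEY, digest.encode(), "sha256").hexdigest()
    return {"content_sha256": digest, "signature": tag}

def verify(content: bytes, manifest: dict) -> bool:
    """Check that the content matches the manifest and the signature is valid."""
    digest = hashlib.sha256(content).hexdigest()
    expected = hmac.new(SIGNING_KEY, digest.encode(), "sha256").hexdigest()
    return (digest == manifest["content_sha256"]
            and hmac.compare_digest(expected, manifest["signature"]))

image = b"...image bytes..."
manifest = make_manifest(image)
print(verify(image, manifest))            # True: provenance intact
print(verify(image + b"edit", manifest))  # False: content altered after signing
```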
Actor and Intent Analysis
Threat Actor Capabilities
| Actor Type | AI Access Level | Sophistication | Primary Threat Vector | Risk Assessment | Deterability |
|---|---|---|---|---|---|
| Nation-States | High | Very High | Cyber, surveillance, weapons | Highest capability | High - diplomatic consequences |
| Terror Groups | Medium | Medium | Mass casualty, propaganda | Moderate capability | Low - ideological motivation |
| Criminals | High | Medium | Fraud, ransomware | High volume | Medium - profit motive |
| Lone Actors | High | Variable | Depends on AI uplift | Most unpredictable | Very Low - no clear target for retaliation |
| Corporate Espionage | High | High | IP theft, competitive intelligence | Moderate-High | Medium - business interests |
Source: FBI Cyber Division threat assessments and CSIS Critical Questions
Who are the most concerning actors for AI misuse?
Whether nation-states, terrorist groups, or lone actors pose the greatest AI misuse risk.
Key Positions
Would Update On
- Evidence of AI use in attacks by different actor types
- Capability requirements for AI-enabled attacks
- Analysis of actor motivations and AI access
- Historical patterns of technology-enabled terrorism
Are autonomous weapons inevitable?
Whether military adoption of AI for lethal autonomous weapons systems will happen regardless of international efforts to restrict them.
Key Positions
Would Update On
- Progress or failure of UN autonomous weapons negotiations
- Major powers' autonomous weapons deployment decisions
- Technical feasibility of meaningful restrictions
- Incidents involving autonomous weapons
International Autonomous Weapons Governance Status (2024-2025)
| Development | Status | Key Actors | Implications |
|---|---|---|---|
| UN General Assembly Resolution | Passed Dec 2024 (166-3; Russia, North Korea, Belarus opposed) | UN member states | Strong international momentum; not legally binding |
| CCW Group of Governmental Experts | 10 days of sessions (Mar 3-7, Sep 1-5, 2025) | High Contracting Parties | Rolling text from Nov 2024 outlines regulatory measures |
| Treaty Goal | Target completion by end of 2026 | UN Sec-Gen Guterres, ICRC President Spoljaric | Ambitious timeline; window narrowing |
| US Position | Governance framework via DoD 2020 Ethical Principles; no ban | US DoD | Responsible, traceable, governable AI within human command |
| China Position | Ban on “unacceptable” LAWS (lethal, fully autonomous, impossible to terminate, indiscriminate, self-evolving) | China delegation | Partial ban approach; “acceptable” LAWS permitted |
| Existing Systems | Phalanx CIWS (1970s), Iron Dome, Trophy, sentry guns (S. Korea, Israel) | Various militaries | Precedent of autonomous targeting for decades |
According to Congressional Research Service analysis, the U.S. does not prohibit LAWS development or employment, and some senior defense leaders have stated the U.S. may be compelled to develop such systems. An ASIL Insights analysis notes growing momentum toward a new international treaty, though concerns remain about the rapidly narrowing window for effective regulation.
Impact and Scale Assessment
Mass Casualty Attack Scenarios
| Attack Vector | AI Contribution | Casualty Potential | Probability (10 years) | Key Bottlenecks | Historical Precedents |
|---|---|---|---|---|---|
| Bioweapons | Pathogen design, synthesis guidance | Very High (>10k) | 5-15% | Wet-lab skills, materials access | Aum Shinrikyo (failed), state programs |
| Cyberweapons | Infrastructure targeting, coordination | High (>1k) | 15-25% | Physical access, critical systems | Stuxnet, Ukraine grid attacks |
| Chemical Weapons | Synthesis optimization | Medium (>100) | 10-20% | Materials access, deployment | Tokyo subway, Syria |
| Conventional | Target selection, coordination | Medium (>100) | 20-30% | Physical access, materials | Oklahoma City, 9/11 |
| Nuclear | Security system exploitation | Extreme (>100k) | 1-3% | Fissile material access | None successful (non-state) |
Probability estimates based on Global Terrorism Database analysis and expert elicitation
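One way to read the 10-year figures above: a cumulative probability P over a decade implies a constant annual probability of 1 - (1-P)^(1/10), under the simplifying assumption that years are independent and the hazard is flat.

```python
# Convert a 10-year cumulative probability to the implied constant annual
# probability, assuming independent years with a flat hazard rate.
def annual_prob(p_10yr: float, years: int = 10) -> float:
    return 1 - (1 - p_10yr) ** (1 / years)

for label, p in [("Cyberweapons, midpoint 20%", 0.20), ("Bioweapons, midpoint 10%", 0.10)]:
    print(f"{label}: ~{annual_prob(p):.1%} per year")  # ~2.2% and ~1.0%
```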
How likely is AI-enabled mass casualty attack in next 10 years?
Whether AI will enable attacks causing over 1,000 deaths within the next decade.
Key Positions
Would Update On
- AI-enabled attacks occurring (or not occurring)
- Capability assessments over time
- Evidence on attacker intentions and AI access
- Defensive capability improvements
Will AI-enabled surveillance strengthen or weaken authoritarian regimes?
Whether AI surveillance and control tools will make authoritarian regimes more stable and durable.
Key Positions
Would Update On
- Evidence on AI surveillance effects on regime stability
- Protests/revolutions succeeding despite AI surveillance
- Comparative studies of surveillance and regime type
- AI tools enabling opposition movements
Current State & Trajectory
Near-term Developments (2025-2027)
| Development Area | Current Status (Dec 2025) | Expected Trajectory | Key Factors |
|---|---|---|---|
| Model Capabilities | GPT-5 level; o3 at 94th percentile virology; CTF 76% | Human-level in multiple specialized domains | Scaling laws, algorithmic improvements |
| Defense Investment | $2B+ annual cybersecurity AI; 3-5x growth occurring | Major enterprise adoption | 50% of critical infra already attacked |
| Regulatory Response | EU AI Act in force; LAWS treaty negotiations | Treaty target by 2026; federal US legislation likely | Political pressure, incident triggers |
| Open Source Models | Llama 3, DeepSeek-R1 (Jan 2025) | Continued but contested growth | Cost breakthroughs, safety concerns |
| Compute Governance | Export controls tightening; monitoring emerging | International coordination increasing | US-China dynamics, evasion attempts |
| Deepfake Response | 8M projected files; C2PA adoption growing | Provenance-based authentication scaling | Platform adoption critical |
| AI Misuse Detection | OpenAI, Microsoft publishing threat reports | Real-time monitoring becoming standard | Provider cooperation essential |
Medium-term Projections (2026-2030)
- Capability Thresholds: Models approaching human performance in specialized domains like biochemistry and cybersecurity
- Defensive Maturity: AI-powered detection and response systems become standard across critical infrastructure
- Governance Infrastructure: Compute monitoring systems deployed, international agreements on autonomous weapons
- Attack Sophistication: First sophisticated AI-enabled attacks likely demonstrated, shifting threat perceptions significantly
Long-term Uncertainty (2030+)
Key trajectories that remain highly uncertain:
| Trend | Optimistic Scenario | Pessimistic Scenario | Key Determinants |
|---|---|---|---|
| Capability Diffusion | Controlled through governance | Widespread proliferation | International cooperation success |
| Offense-Defense Balance | Defense keeps pace | Offense advantage widens | R&D investment allocation |
| Authentication Adoption | Universal verification | Fragmented ecosystem | Platform cooperation |
| International Cooperation | Effective regimes emerge | Fragmentation and competition | Geopolitical stability |
Key Uncertainties & Expert Disagreements
Technical Uncertainties
| Uncertainty | Range of Views | Current Evidence | Resolution Timeline |
|---|---|---|---|
| LLM biological uplift | No uplift (RAND 2024) vs. concerning (CSET, Esvelt) | Mixed; wet-lab bottleneck may dominate | 2-5 years as capabilities improve |
| AI cyber capability ceiling | Commodity attacks only vs. sophisticated intrusions | CTF benchmarks improving rapidly (27%->76%) | 1-3 years; being resolved now |
| Deepfake detection viability | Arms race favoring offense vs. provenance solutions | Human detection at 24.5%; tools at 75% | 2-4 years; depends on C2PA adoption |
| Open model misuse potential | Democratization benefits vs. misuse risks | DeepSeek-R1 cost breakthrough; no catastrophic misuse yet | Ongoing; each release re-evaluated |
Policy Uncertainties
| Uncertainty | Range of Views | Current Evidence | Resolution Timeline |
|---|---|---|---|
| Compute governance effectiveness | Strong chokepoint vs. easily circumvented | Export controls having effect; evasion ongoing | 3-5 years as enforcement matures |
| LAWS treaty feasibility | Treaty achievable by 2026 vs. inevitable proliferation | UN resolution 166-3; CCW negotiations ongoing | 2026 target deadline |
| Model restriction value | Meaningful reduction vs. security theater | Jailbreaks common; open models exist | Ongoing empirical question |
| Authentication adoption | Universal adoption vs. fragmented ecosystem | C2PA growing; major platforms uncommitted | 3-5 years for critical mass |
Expert Disagreement Summary
The AI safety and security community remains divided on several fundamental questions. According to Georgetown CSET’s assessment framework, these disagreements stem from genuine uncertainty about rapidly evolving capabilities, differing risk tolerances, and varying assumptions about attacker sophistication and motivation.
Key areas of active debate include:
- Bioweapons uplift magnitude: RAND’s 2024 red-team study found no significant uplift, but their Global Risk Index identified 13 high-risk biological AI tools. OpenAI’s o3 model scoring at the 94th percentile among virologists suggests capabilities are advancing.
- Offense-defense balance: OpenAI’s threat assessment acknowledges planning for models reaching “High” cyber capability levels that could develop zero-day exploits or assist with complex intrusions. Meanwhile, defensive AI investment is growing rapidly.
- Regulatory approach: The U.S. DoD favors governance frameworks over bans for LAWS, while 166 UN member states voted for a resolution calling for action. China distinguishes “acceptable” from “unacceptable” autonomous weapons.
Key Sources and References
Primary Research Sources
| Source | Organization | Key Publications | Focus Area |
|---|---|---|---|
| RAND Corporation | Independent research | Biological Red-Team Study (2024); Global Risk Index (2024) | Bioweapons, defense |
| Georgetown CSET | University research center | Malicious Use Assessment Framework; Mechanisms of AI Harm (2025) | Policy, misuse assessment |
| OpenAI | AI lab | Cyber Resilience Report (2025); Threat Assessment | Cyber, capabilities |
| Microsoft | Technology company | Digital Defense Report (2025) | Cyber threats, state actors |
| CNAS | Think tank | AI and National Security Reports | Military, policy |
International Governance Sources
| Source | Focus | Key Documents |
|---|---|---|
| UN CCW GGE on LAWS | Autonomous weapons | Rolling text (Nov 2024); 2025 session schedules |
| ICRC | International humanitarian law | Autonomous Weapons Position Papers |
| Congressional Research Service | US policy | LAWS Policy Primer |
| ASIL | International law | Treaty Momentum Analysis (2025) |
Deepfake and Disinformation Sources
| Source | Focus | Key Findings |
|---|---|---|
| Deepstrike Research | Statistics | 8M deepfakes projected (2025); 550% growth (2019-2024) |
| UNESCO | Detection | 24.5% human detection accuracy; 0.1% reliable identification |
| European Parliament | Policy | Europol 90% synthetic content projection by 2026 |
| C2PA Coalition | Provenance | Content authenticity standards |
| Deloitte Financial Services | Financial impact | $12.3B to $40B fraud projection (2023-2027) |