
Bioweapons Risk

Importance: 82
Category: Misuse Risk
Severity: Catastrophic
Likelihood: Medium
Timeframe: 2027
Maturity: Growing
Type: Misuse
Key Concern: Lowering barriers to development

AI systems could accelerate biological weapons development by helping with pathogen design, synthesis planning, or acquisition of dangerous knowledge. The concern isn’t that AI creates entirely new risks, but that it lowers barriers—making capabilities previously requiring rare expertise more accessible to bad actors.

This is considered one of the most severe near-term AI risks because biological weapons can cause mass casualties and AI-assisted bioweapons could be developed by smaller groups than traditional state programs required. Unlike many other AI risks that depend on future, more capable systems, this risk applies to models available today.

The key debate centers on whether AI provides meaningful “uplift”—whether it genuinely helps beyond what’s already accessible through scientific literature and internet searches, or whether wet-lab skills remain the true bottleneck. Current evidence is mixed: the RAND Corporation’s 2024 study found no statistically significant AI uplift for attack planning, while Microsoft research showed AI-designed toxins evading 75%+ of DNA synthesis screening.

However, 2025 has marked a significant shift in official assessments. OpenAI now expects its next-generation models to reach “high-risk classification” for biological capabilities—meaning they could provide “meaningful counterfactual assistance to novice actors.” Anthropic activated ASL-3 (AI Safety Level 3) protections for Claude Opus 4 specifically due to biological and chemical weapon concerns. The National Academies’ March 2025 report “The Age of AI in the Life Sciences” found that while current biological design tools cannot yet design self-replicating pathogens, monitoring and mitigation are urgently needed.

| Dimension | Assessment | Notes |
|---|---|---|
| Severity | High to Catastrophic | Biological weapons can cause mass casualties; worst-case scenarios involve engineered pandemics |
| Likelihood | Uncertain | Current evidence is mixed on AI uplift; capabilities are rapidly improving |
| Timeline | Near-term | Unlike many AI risks, this concern applies to current systems |
| Trend | Increasing | Each model generation shows more biological knowledge; screening gaps persist |
| Window | Temporary | AI may eventually favor defense (surveillance, vaccines, countermeasures); risk elevated during transition period |
| Response | Mechanism | Effectiveness |
|---|---|---|
| Responsible Scaling Policies (RSPs) | Internal biosecurity evaluations before deployment | Medium |
| Compute Governance | Limits access to training resources for dangerous models | Medium |
| US AI Chip Export Controls | Restricts AI chip exports to adversary nations | Low-Medium |
| AI Safety Institutes (AISIs) | Government evaluation of biosecurity risks | Medium |
| Voluntary AI Safety Commitments | Lab pledges on dangerous capability evaluation | Low |

How dangerous is AI-assisted bioweapons development? Expert assessments vary substantially, from those who consider it an imminent catastrophic threat to those who view it as overhyped. Understanding both sides of this debate—and the key uncertainties that drive disagreement—is essential for calibrating policy responses.

Attempting to quantify the total risk from AI-assisted bioweapons requires estimating both the probability of an attack and its potential consequences. Estimates vary widely:

| Estimate Type | Range | Source/Basis | Key Assumptions |
|---|---|---|---|
| Annual probability of catastrophic AI-assisted bio attack | 0.01% - 0.5% | Expert elicitation, attack chain analysis | “Catastrophic” = 10,000+ casualties |
| Cumulative probability through 2040 | 0.1% - 8% | Timeline projections | Depends heavily on AI capability trajectory |
| Expected casualties if attack occurs | 10,000 - 10M+ | Historical/scenario analysis | Varies by pathogen, deployment method, response |
| Expected value of harm per year | $1B - $500B | Probability × consequence estimates | Extremely uncertain |

The Bioweapons Attack Chain Model estimates compound attack probability at 0.02% - 3.6% depending on assumptions, with substantial uncertainty at each step. The wide range reflects genuine disagreement about key parameters.
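To make the arithmetic behind these figures explicit, here is a minimal sketch that multiplies the probability and consequence ranges from the table above into an expected annual harm. The dollar value per casualty is an assumed illustrative parameter, not a figure from the estimates cited.

```python
# Illustrative expected-value calculation using the ranges quoted above.
# The value-per-casualty figure is an assumption for illustration only.
annual_attack_prob = (0.0001, 0.005)          # 0.01% - 0.5% per year
casualties_if_attack = (10_000, 10_000_000)   # expected casualties given an attack
value_per_casualty = 10_000_000               # ~$10M per statistical life (assumed)

low = annual_attack_prob[0] * casualties_if_attack[0] * value_per_casualty
high = annual_attack_prob[1] * casualties_if_attack[1] * value_per_casualty

print(f"Expected annual harm: ${low / 1e6:.0f}M to ${high / 1e9:.0f}B")
# -> roughly $10M to $500B per year; the enormous spread is the point:
#    the estimate is dominated by uncertainty in each input.
```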

Existential risk context: In The Precipice, Oxford philosopher Toby Ord estimates the chance of existential catastrophe from engineered pandemics at 1 in 30 by 2100—second only to AI among anthropogenic risks. While not all engineered pandemics would be AI-assisted, this frames the potential severity. Ord notes that it “now seems within the reach of near-term biological advances to create pandemics that would kill greater than 50% of the population—not just in a particular area, but globally.”

Industry concerns: In July 2023, Anthropic CEO Dario Amodei stated that within two to three years, there was a “substantial risk” that AI tools would “greatly widen the range of actors with the technical capability to conduct a large-scale biological attack.” The CNAS report notes this could “expose the United States to catastrophic threats far exceeding the impact of COVID-19.”

Those who consider AI-bioweapons a severe threat emphasize several points:

AI makes dangerous biological knowledge more accessible to those who couldn’t previously obtain it. While scientific literature contains detailed protocols, navigating it requires expertise. AI systems can synthesize, explain, and contextualize this information for non-experts, potentially expanding the pool of capable actors.

The equalizer effect: The most concerning scenario isn’t AI helping expert virologists (who already have the knowledge), but AI helping moderately skilled individuals bridge knowledge gaps that previously required years of training or team collaboration.

Microsoft’s 2024 research revealed that AI-designed toxins evaded over 75% of commercial DNA synthesis screening tools. This is qualitatively different from knowledge provision—it represents AI helping attackers circumvent existing defenses.

DNA synthesis screening is a cornerstone of current biosecurity. If AI can reliably design functional variants that evade detection, the entire screening paradigm may become obsolete faster than new defenses can be developed. This creates an asymmetric threat where even modest AI capabilities could undermine years of defensive investment.

AI capabilities are improving rapidly. Even if current models provide limited uplift, the trend is concerning:

| Capability | GPT-4 (2023) | Claude 3.5/GPT-4o (2024) | Claude Opus 4/o3 (2025) | Trend |
|---|---|---|---|---|
| Biology knowledge | High | Very High | Expert-level | Rapidly increasing |
| Synthesis planning | Moderate | Moderate-High | High | Increasing |
| Evading guardrails | Moderate | Low-Moderate | Low (frontier models) | Variable by model |
| Integration with tools | Limited | Growing | Substantial | Accelerating |

2025 milestone: OpenAI’s April 2025 o3 model ranked in the 94th percentile among expert human virologists on the Virology Capabilities Test. This is the first time an AI model has demonstrated expert-level performance on biological troubleshooting scenarios.

The argument is that we should prepare for future capabilities, not just current ones. By the time AI demonstrably provides high uplift, it may be too late to establish governance.

AI alone may provide limited uplift, but the combination of multiple technologies could be transformative:

  • LLMs + protein design tools: AlphaFold and similar tools enable novel protein engineering; LLMs help identify targets and plan applications
  • AI + lab automation: Automated systems could eventually execute protocols with minimal human intervention
  • AI + decreasing synthesis costs: DNA synthesis costs continue falling; AI could help design sequences optimized for cheap synthesis

Each technology alone may be manageable, but their combination could create emergent risks that exceed any individual contribution.

Even if the median expectation is manageable, the worst-case scenarios are severe enough to warrant serious attention:

  • Engineered pandemic: A pathogen designed for transmissibility, lethality, and immune evasion could potentially cause millions of deaths
  • Multiple simultaneous attacks: AI could enable coordination of attacks across multiple locations
  • Degradation of trust in biology: Widespread bioterrorism could undermine beneficial biological research and public health

From a risk management perspective, low-probability/high-consequence events may deserve more weight than their expected value alone suggests.

History suggests we systematically underestimate technology-enabled threats:

  • Nuclear weapons were developed faster than many expected
  • COVID-19 demonstrated how disruptive novel pathogens can be
  • AI capabilities have repeatedly exceeded forecasts

Skepticism about AI-bioweapons risk may itself be the risky position.

Multiple emerging technologies are simultaneously reducing the skill requirements for biological research:

  • Cloud laboratories automate complex procedures and allow remote execution
  • Benchtop DNA synthesizers are approaching gene-length capabilities
  • AI assistants bridge knowledge gaps and provide troubleshooting guidance
  • Protocol automation reduces the need for tacit laboratory knowledge

Each of these alone might be manageable, but together they suggest a trajectory toward dramatically lowered barriers. The RAND study may capture a snapshot where these technologies haven’t yet converged—but convergence appears likely within the decade.

Biological attacks have inherent asymmetric characteristics that favor attackers:

  • Attribution lag: Days to weeks may pass before an attack is recognized as intentional
  • Preparation asymmetry: Attackers can prepare countermeasures for themselves; defenders must protect everyone
  • Innovation asymmetry: Attackers need to succeed once; defenders must anticipate all possible attack vectors
  • Psychological impact: Even unsuccessful or small-scale attacks could cause massive economic and social disruption

AI amplifies these asymmetries by potentially enabling novel attack vectors that existing defenses haven’t anticipated.

Even if frontier labs implement strong biosecurity measures, the proliferation of open-source models undermines containment:

  • No centralized control: Once weights are released, restrictions cannot be enforced
  • Fine-tuning vulnerability: Safety training can be removed with relatively modest compute
  • Capability improvements: Open models are approaching frontier capabilities with 6-12 month lags
  • Global availability: Actors in any jurisdiction can access open models

The CNAS report recommends considering a “licensing regime for biological design tools with potentially catastrophic capabilities”—but this is not currently implemented.

The DeepSeek warning: In February 2025, Anthropic CEO Dario Amodei reported that testing of China’s DeepSeek model revealed it was “the worst of basically any model we’d ever tested” for biosecurity—generating information critical to producing bioweapons “that can’t be found on Google or can’t be easily found in textbooks” with “absolutely no blocks whatsoever.” While Amodei did not consider DeepSeek “literally dangerous” yet, the incident highlighted how open-source models from different jurisdictions may not implement equivalent safety measures.

Those who consider AI-bioweapons risk overstated emphasize different considerations:

The RAND Corporation’s 2024 study is the most rigorous empirical assessment of AI uplift to date. Twelve teams of three researchers each spent 80 hours developing bioweapon attack plans—half using AI, half using only the internet. Expert evaluators found no statistically significant difference in plan viability.

This finding directly challenges claims that AI meaningfully assists biological attacks. If AI-assisted and non-AI teams perform equally, the AI “threat” may be largely illusory.

| Group | Information Quality | Plan Viability | Novelty | Statistical Significance |
|---|---|---|---|---|
| AI-assisted | High | Moderate | Low | n/a |
| Internet-only | High | Moderate | Low | n/a |
| Difference | Minimal | Minimal | None | Not significant |

Implications: Dangerous biological information is already widely accessible through legitimate scientific literature. AI may be redundant with existing sources rather than providing novel dangerous capabilities.

Knowledge is not capability. Even with complete theoretical understanding, executing biological synthesis requires:

  • Tacit knowledge that transfers poorly through text (how to handle contamination, optimize growth conditions, troubleshoot failures)
  • Specialized equipment that is expensive, regulated, and hard to obtain
  • Months of practice to develop reliable technique
  • Physical safety procedures that untrained individuals typically violate

The Soviet Biopreparat program employed thousands of scientists for decades to develop reliable bioweapons. Aum Shinrikyo, despite substantial resources and scientific personnel, failed in their bioweapons attempts. The knowledge bottleneck may be much less important than the capability bottleneck.

AI cannot transfer tacit knowledge. Reading about sterile technique is different from maintaining it. AI can explain protocols but cannot teach hands-on skills.

Frontier AI models include safety measures that reduce dangerous information provision:

  • Refusals for explicitly harmful requests
  • Content filtering
  • Constitutional AI and RLHF training
  • Continuous red-teaming and patching

While not perfect, these measures raise barriers. Jailbreaking techniques exist but require effort, sophistication, and often produce degraded responses. The marginal attacker may be more likely to use open internet resources than to navigate AI guardrails.

Scientific literature already contains dangerous information. Textbooks explain pathogen biology. The internet hosts synthesis protocols. Dark web forums discuss dangerous techniques.

The marginal information contribution of AI may be minimal when the baseline is “everything is already out there.” AI’s value proposition is synthesis and accessibility, but motivated individuals were already able to find this information through traditional means.

AI capabilities benefit defense as much as offense, and defensive applications are more scalable:

| Application | Offense Contribution | Defense Contribution | Net Balance |
|---|---|---|---|
| Pathogen detection | Marginal | Substantial | Defense |
| Vaccine development | Marginal | Transformative | Strong defense |
| Synthesis planning | Moderate | Minimal | Offense |
| Countermeasure design | Marginal | Substantial | Defense |
| Surveillance | None | Substantial | Strong defense |
| Treatment optimization | None | Substantial | Strong defense |

Metagenomic surveillance, mRNA vaccine platforms, and AI-assisted drug discovery are advancing rapidly. These defensive technologies may ultimately make biological attacks less effective rather than more dangerous.

The transition period concern: Even those who believe defense wins long-term often worry about a near-term window where offense temporarily gains advantages before defenses mature.

Biological attacks, especially sophisticated ones, leave traces that can enable attribution:

  • Genomic sequencing of pathogens
  • Epidemiological tracking
  • Intelligence on precursor purchases
  • Surveillance of likely actors

State actors face retaliation risks. Non-state actors face intense investigative focus. The certainty of attribution for significant attacks provides deterrent effect that pure capability analysis misses.

Despite decades of concern, catastrophic bioterrorism has not occurred:

  • The 2001 anthrax attacks killed 5 people—tragic, but not catastrophic
  • No terrorist group has successfully deployed a mass-casualty biological weapon
  • State bioweapons programs have not been used since WWII

This could reflect genuine difficulty rather than mere luck. The absence of catastrophic bioterrorism despite motivation and attempts suggests the barriers are higher than often assumed.

Catastrophic biological attacks require a specific combination of capability and motivation that is rare:

Who would want to cause a pandemic?

  • State actors: Have capabilities but face deterrence (attribution, retaliation risk, blowback to own population)
  • Terrorist groups: Most seek specific political goals; mass extinction doesn’t serve typical objectives
  • Lone actors: May have motivation but face significant capability barriers
  • Apocalyptic cults: Rare and typically incompetent (Aum Shinrikyo failed despite resources)

The overlap between “capable” and “wants maximum casualties” may be smaller than feared. Most capable actors (states, organized groups) have reasons not to deploy biological weapons; most actors who lack such reasons (doomsday cults, nihilistic lone actors) lack capability.

AI changes this calculus only if: It enables actors who previously lacked capability while retaining dangerous motivation—the “uplift for the unhinged” scenario. The RAND study suggests this isn’t happening yet.

Fundamental biological facts may favor defense over the long run:

  • Pathogens are detectable: All biological agents produce detectable signals (RNA, proteins, metabolic products)
  • Immune systems adapt: Evolution has produced robust immune defenses; vaccines enhance these
  • Countermeasures are general: mRNA platforms, broad-spectrum antivirals, and environmental controls work against many agents
  • Medical capacity scales: Unlike nuclear attacks, biological attacks unfold over time, allowing response

The “defense wins” scenario: Robust metagenomic surveillance detects outbreaks early; mRNA vaccines are developed in weeks; far-UVC limits airborne transmission; medical countermeasures limit casualties. In this world, even a successful synthesis and deployment might cause localized harm but not catastrophe.

Skeptics of this view note: Defense advantages assume functional institutions and may take years to fully deploy. The transition period—before defenses mature—may be the danger zone.

Exaggerating AI-bioweapons risk has potential costs:

  • Resource misallocation: Focusing on AI-specific interventions may divert resources from more effective biosecurity investments
  • Dual-use research chill: Overreaction could harm legitimate biological research
  • AI development restrictions: Excessive caution about biological capabilities could impede beneficial AI applications in medicine
  • Crying wolf: If claims of imminent AI-enabled bioweapons prove false, future warnings may be dismissed

Some critics argue the biosecurity community has incentives to emphasize threats to justify funding, and that healthy skepticism is appropriate.


Much of the disagreement about AI-bioweapons risk reduces to a small number of factual questions where reasonable people disagree:

Crux 1: Does AI Provide Meaningful Uplift?


If uplift is low (less than 1.5x): Focus resources on traditional biosecurity rather than AI-specific interventions. The threat is real but not qualitatively changed by AI.

If uplift is high (greater than 2x): Urgent need for AI-specific guardrails, compute governance, and model restrictions. The threat landscape has fundamentally shifted.

| Evidence | Favors Low Uplift | Favors High Uplift |
|---|---|---|
| RAND study | Strong | |
| Screening evasion research | | Strong |
| Model capability trends | | Moderate |
| Expert elicitation | Mixed | Mixed |
| Current assessment | Favored (65%) | 35% |

Crux 2: Is the Knowledge Bottleneck or Capability Bottleneck More Important?


If knowledge is the bottleneck: AI providing information is directly dangerous.

If capability is the bottleneck: AI providing information is mostly redundant with existing sources; wet lab skills remain rate-limiting.

| Evidence | Favors Knowledge Bottleneck | Favors Capability Bottleneck |
|---|---|---|
| Historical bioterrorism failures | | Strong |
| State program difficulty | | Strong |
| Information abundance online | | Moderate |
| AI capability trends | Moderate | |
| Current assessment | 35% | Favored (65%) |

Crux 3: Will Defense or Offense Win Long-Term?


If defense wins: AI-bioweapons is a transitional problem that self-corrects as defensive applications mature.

If offense wins: AI permanently shifts the advantage to attackers, requiring sustained containment efforts.

If it’s a window: The near-term favors offense, but defense catches up—the question is whether catastrophic attacks occur during the transition.

| Scenario | Probability | Implications |
|---|---|---|
| Permanent offense advantage | 15% | Maximum concern; sustained containment needed |
| Permanent defense advantage | 40% | Eventually self-correcting; manage transition |
| Temporary window (5-10 years) | 35% | Near-term urgency, medium-term resolution |
| Unclear/context-dependent | 10% | Need robust strategies for multiple scenarios |

Crux 4: How Quickly Are Capabilities Advancing?


If capabilities are saturating: Current systems represent near-peak dangerous capabilities; governance can catch up.

If capabilities continue scaling: Future systems will be substantially more dangerous; governance is racing against time.

The AI-Bioweapons Timeline Model projects capability thresholds, with synthesis assistance potentially arriving 2027-2032 and novel agent design 2030-2040.

Crux 5: How Effective Are Guardrails and Countermeasures?


If guardrails work well: The marginal risk from AI models is small; responsible development practices suffice.

If guardrails fail: Open-source proliferation and jailbreaking make model-level interventions largely ineffective.

| Factor | Favors Guardrails | Favors Guardrail Failure |
|---|---|---|
| Frontier model safety measures | Moderate | |
| Open-source model proliferation | | Strong |
| Jailbreaking research | | Moderate |
| Fine-tuning vulnerability | | Moderate |
| Current assessment | Partially effective (40%) | Limited effectiveness (60%) |

The open-source challenge: Even if frontier labs implement strong safeguards, open-source models may not. As capable open models proliferate, guardrails become optional, fine-tuning can remove remaining restrictions, and dangerous capabilities become permanently accessible.

Crux 6: Can DNA Synthesis Screening Keep Pace?


DNA synthesis screening is the primary defense against engineered pathogens, but Microsoft’s research revealed significant gaps.

If screening adapts: AI-designed evasion is a temporary problem; screening improvements restore the chokepoint.

If screening falls behind: The primary technical barrier erodes; other defenses must compensate.

Key questions:

  • Can screening adapt to AI-designed evasive sequences?
  • What happens as benchtop synthesis equipment becomes cheaper and more accessible?
  • Can screening extend to cover novel synthesis methods and cloud laboratories?

The Framework for Nucleic Acid Synthesis Screening (April 2024) represents a policy response, but only applies to federally funded programs.


Studies have shown language models can provide information relevant to bioweapon development, though the significance is contested.

The RAND Corporation study (“The Operational Risks of AI in Large-Scale Biological Attacks”) is the most rigorous empirical assessment of AI uplift to date. Researchers Christopher Mouton, Caleb Lucas, and Ella Guest recruited 15 groups of three people to act as red team “bad guys.”

Twelve teams were given 80 hours each over seven weeks to develop bioweapon attack plans based on one of four scenarios—including a “fringe doomsday cult intent on global catastrophe” and a “private military company seeking to aid an adversary’s conventional military operation.” For each scenario, one team had access to an LLM chatbot, another had a different chatbot, and control teams used only internet resources.

Expert judges (biologists and security specialists) evaluated the resulting plans for biological and operational feasibility. The result: no statistically significant difference in plan viability between AI-assisted and non-AI groups.

Key methodology details:

  • Participants had some technical background (science graduates)
  • Testing focused on planning, not actual synthesis
  • Used 2023-era models; capabilities have advanced since
  • Sample size was relatively small (n=12 teams completing the study)
  • LLMs did not generate explicit weaponization instructions, but provided “guidance and context in critical areas such as agent selection, delivery methods, and operational planning”

Limitations acknowledged by researchers: The study tested planning capability, not execution. It used participants with technical backgrounds, so may underestimate uplift for complete novices. And AI capabilities continue advancing.
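One reason to read the null result cautiously is statistical power: with roughly six teams per arm, even a large true difference could easily fail to reach significance. A rough sketch, assuming a two-sample t-test design (the study’s actual statistical analysis may have differed):

```python
# Rough power calculation for a two-sample comparison with ~6 teams per arm.
# The t-test framing is an assumption; RAND's actual analysis may differ.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.5, 1.0, 1.5):  # Cohen's d: medium, large, very large effects
    power = analysis.power(effect_size=d, nobs1=6, alpha=0.05,
                           ratio=1.0, alternative='two-sided')
    print(f"effect size d={d:.1f}: power ≈ {power:.2f}")
# Approximate output: d=0.5 -> ~0.1, d=1.0 -> ~0.35, d=1.5 -> ~0.65,
# i.e. even a genuinely large uplift could plausibly go undetected.
```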

Implications: The wet-lab bottleneck may be more significant than the knowledge bottleneck. Knowing how to make something is different from being able to make it.

Microsoft researchers conducted a red-team exercise testing biosecurity in the protein engineering pipeline. They found that DNA screening software—used by synthesis companies to flag dangerous sequences—missed over 75% of AI-designed potential toxins. One tool flagged only 23% of sequences. After the research was published, screening systems improved to catch 72% on average.

Key details:

  • Tested multiple commercial screening tools
  • AI designed functional variants that differed sufficiently from known threats to evade pattern matching
  • Improvement after publication shows screening can adapt—but also shows it wasn’t keeping pace

Implications: Even if current LLMs provide limited knowledge uplift, AI protein design tools may create harder-to-detect threats. The screening ecosystem has significant gaps that AI can exploit.

Anthropic hired Gryphon Scientific to spend more than 150 hours red-teaming Claude’s ability to provide harmful biological information. They created a rubric of several dozen questions probing critical knowledge gaps along the entire technical pathway to biological weapon development.

The findings were concerning. Rocco Casagrande, Gryphon’s managing director, stated he was “personally surprised and dismayed by how capable current LLMs were at providing critical information related to biological weapons.” He told Semafor: “These things are developing extremely, extremely fast, they’re a lot more capable than I thought they would be when it comes to science.”

Key findings:

  • One team member with a postdoctoral fellowship studying a pandemic-capable virus found LLMs could provide “post-doc level knowledge to troubleshoot commonly encountered problems” when working with that virus
  • For low-skill users, LLMs could suggest which viruses to acquire
  • Although LLMs often hallucinate, they answered almost all questions accurately at least sometimes, and answered some critical questions nearly always accurately
  • Gryphon workshops with 20+ biosecurity experts identified concerning misuse scenarios including “how to collapse an ecosystem” and “reconstruct information redacted from sensitive scientific documents”

Despite the concerning findings, Casagrande believes “concerted action could ensure safety is built into the most advanced models.”

AI labs have conducted extensive internal evaluations testing whether their models could provide “uplift” to potential bioweapon developers. These evaluations are becoming more sophisticated and more alarming.

Anthropic’s approach: Anthropic’s Responsible Scaling Policy (RSP) defines AI Safety Levels (ASL) modeled after biosafety level (BSL) standards. They conduct at least 10 different biorisk evaluations for each major model release. In early 2025, Anthropic sent a letter to the White House “urging immediate action on AI security after its testing revealed alarming improvements in Claude 3.7 Sonnet’s ability to assist with aspects of bioweapons development.”

OpenAI’s framework: OpenAI’s Preparedness Framework categorizes biological and chemical capabilities as “Tracked Categories” requiring ongoing evaluation. They define two thresholds:

  • High capability: Could “provide meaningful counterfactual assistance to ‘novice’ actors (anyone with a basic relevant technical background) that enables them to create known biological or chemical threats”
  • Critical capability: Could “introduce unprecedented new pathways to severe harm”

OpenAI states their most advanced models “aren’t yet capable enough to pose severe risks” in biosecurity—but expects upcoming models may reach “high” capability level.

US/UK AI Safety Institute joint evaluation (2024): The first joint government-led model evaluation tested Claude 3.5 Sonnet across biological capabilities, cyber capabilities, software development, and safeguard efficacy. Elizabeth Kelly, AISI director, called it “the most comprehensive government-led safety evaluation of an advanced AI model to date.”

MIT researcher Kevin Esvelt conducted an informal but striking demonstration. He asked students to use ChatGPT or other LLMs to create a dangerous virus. After only one hour, the class identified:

  • Four potential pandemic pathogens
  • How to generate them from synthetic DNA
  • Names of DNA synthesis companies unlikely to screen orders
  • Detailed protocols and troubleshooting guidance

As Esvelt put it regarding AI’s ability to circumvent DNA screening defenses: “We’ve built a Maginot Line of defense, and AI just walked around it.”

This demonstration, while not a rigorous study, illustrates how quickly accessible LLMs can be for malicious purposes—even for those without prior expertise.

CNAS Report: AI and Biological National Security Risks (2024)


The Center for a New American Security report by Bill Drexel and Caleb Withers provides a comprehensive analysis of the evolving AI-biosecurity landscape.

Key concerns identified:

  • AI could enable bioterrorism, create unprecedented superviruses, and develop novel targeted bioweapons
  • AI’s potential to “optimize bioweapons for targeted effects, such as pathogens tailored to specific genetic groups or geographies, could significantly shift states’ incentives to use biological weapons”
  • If realized, such threats could “expose the United States to catastrophic threats far exceeding the impact of COVID-19”

Key recommendations:

  • Strengthen screening mechanisms for cloud labs and genetic synthesis providers
  • Conduct rigorous assessments of foundation models’ biological capabilities throughout the bioweapons lifecycle
  • Invest in technical safety mechanisms to curb threats posed by foundation models
  • Consider a licensing regime for biological design tools with potentially catastrophic capabilities

The report emphasizes that while AI-enabled biological catastrophes are “far from inevitable,” current biological safeguards already need significant updates.

2025 has seen a significant shift in how AI labs and governments assess biological risks. Several developments stand out:

OpenAI’s High-Risk Classification (June 2025)


OpenAI Head of Safety Systems Johannes Heidecke announced that the company expects upcoming models—particularly successors to the o3 reasoning model—to trigger “high-risk classification” under its Preparedness Framework. This means they could provide “meaningful counterfactual assistance to novice actors” in creating known biological threats.

Key points from OpenAI’s approach:

  • Classified ChatGPT Agent as having “High capability in the biological domain”
  • Discovered that creating bioweapons would require weeks or months of sustained AI interaction, not single conversations
  • Implemented a traffic-light system: red-level content (direct bioweapon assistance) is immediately blocked; yellow-level content (dual-use information) requires careful handling

Anthropic became the first lab to activate its highest safety tier (ASL-3) specifically for biological concerns when releasing Claude Opus 4. Their internal evaluations found they “could no longer confidently rule out the ability of our most advanced model to uplift people with basic STEM backgrounds” attempting to develop CBRN weapons.

Anthropic’s testing revealed:

  • Participants with access to Claude Opus 4 developed bioweapon acquisition plans with “substantially fewer critical failures” than internet-only controls
  • Claude went from underperforming world-class virologists to “comfortably exceeding that baseline” on virology troubleshooting within a year
  • The company sent a letter to the White House urging immediate action after observing “alarming improvements” in Claude 3.7 Sonnet’s biological capabilities

The National Academies of Sciences, Engineering, and Medicine published “The Age of AI in the Life Sciences: Benefits and Biosecurity Considerations,” directed by Executive Order 14110. Key findings:

  • AI-enabled biological tools can improve biosecurity through enhanced surveillance and faster countermeasure development
  • Current biological design tools can design simpler structures (molecules) but cannot yet design self-replicating pathogens
  • A “distinct lack of empirical data” exists for evaluating biosecurity risks of AI-enabled biological tools
  • Recommended continued investment alongside monitoring for potential risks

The Center for Strategic and International Studies published “Opportunities to Strengthen U.S. Biosecurity from AI-Enabled Bioterrorism,” warning that current U.S. biosecurity measures are “ill-equipped to meet these challenges.” The report noted that critical safeguards in biological design tools are “already circumventable post-deployment.”

| Source | Finding | Implications |
|---|---|---|
| National Academies (2025) | BDTs cannot yet design self-replicating pathogens | Current tools limited; monitoring needed |
| CSIS Report (2025) | Current biosecurity measures inadequate | Policy urgently needs updating |
| OpenAI Preparedness (2025) | Next-gen models will hit “high-risk” | Frontier labs anticipate near-term uplift |
| Anthropic ASL-3 (2025) | Cannot rule out CBRN uplift for novices | First activation of highest safety tier |
| DeepSeek testing (2025) | Open-source models lack equivalent safeguards | Proliferation concern validated |
| CNAS Report (2024) | AI-bio integration is emerging risk | Supports compound capability concern |

AI could assist at multiple stages of bioweapon development:

A successful biological attack requires success across multiple stages, each with independent failure modes:

| Stage | AI Contribution | Traditional Difficulty | AI Changes What |
|---|---|---|---|
| Motivation | None | Present | |
| Information access | High | Moderate | Reduces search time |
| Knowledge uplift | Low-Moderate | High | Bridges expertise gaps |
| Lab access | None | High | |
| Synthesis | None (currently) | Very High | Future: could guide procedures |
| Deployment | Low | High | Could optimize dispersal |
| Evading countermeasures | Moderate | Variable | Could design novel variants |

See Bioweapons Attack Chain Model for detailed probability estimates at each stage.
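The compounding logic of the attack chain is simple multiplication: every stage must succeed, so the overall probability is the product of the per-stage probabilities. A minimal sketch with hypothetical stage values (these are illustrative numbers, not the attack chain model’s actual parameters):

```python
# Compound attack-chain probability: all stages must succeed.
# Stage probabilities below are hypothetical illustrations only.
defender_favorable = {
    "motivated actor attempts": 0.10,
    "acquires usable knowledge": 0.30,
    "obtains lab access": 0.20,
    "synthesis succeeds": 0.10,
    "deployment succeeds": 0.30,
}
attacker_favorable = {
    "motivated actor attempts": 0.30,
    "acquires usable knowledge": 0.80,
    "obtains lab access": 0.50,
    "synthesis succeeds": 0.40,
    "deployment succeeds": 0.60,
}

def compound(stages: dict[str, float]) -> float:
    total = 1.0
    for p in stages.values():
        total *= p
    return total

print(f"defender-favorable assumptions: {compound(defender_favorable):.3%}")  # ~0.018%
print(f"attacker-favorable assumptions: {compound(attacker_favorable):.2%}")  # ~2.9%
```

Because the stages multiply, changing any single stage probability shifts the compound estimate multiplicatively, which is why published estimates span roughly two orders of magnitude.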

Target identification — AI might help identify dangerous modifications to known pathogens or find novel biological agents. Large language models trained on scientific literature have extensive knowledge of pathogen biology.

Synthesis planning — AI could help determine how to create dangerous biological materials. Protein design tools can generate novel sequences, and LLMs can explain synthesis routes.

Knowledge bridging — Most concerningly, AI might help bridge knowledge gaps. Historically, bioweapons development required rare combinations of expertise. AI could help a motivated individual or small group compensate for missing knowledge, potentially replacing what previously required teams of specialists.

Evasion optimization — AI could help design pathogens or synthesis routes that evade detection by screening tools, surveillance systems, or medical countermeasures.


Biological threats exist on a spectrum. State programs have historically been the main concern, but the barrier to entry may be dropping. The COVID-19 pandemic demonstrated how much damage pathogens can cause and highlighted gaps in biosecurity infrastructure.

Multiple nations have maintained offensive biological weapons programs despite the Biological Weapons Convention (BWC):

| Program | Era | Scale | Outcome |
|---|---|---|---|
| US | 1943-1969 | Large | Unilaterally terminated by Nixon |
| Soviet Union | 1928-1992 | Massive (30,000-40,000 staff) | Collapsed with USSR; concern about residual capabilities and scientist emigration |
| Japan (Unit 731) | 1937-1945 | Large | Defeated in WWII; perpetrators granted immunity by US in exchange for data |
| Iraq | 1980s-1990s | Moderate | Dismantled after Gulf War; revealed extensive program |
| South Africa | 1981-1993 | Moderate | Dismantled post-apartheid; included ethnic targeting research |

These programs required vast resources, thousands of scientists, and state-level infrastructure. The concern is that AI could reduce these requirements.

Current compliance concerns: The 2024 State Department report raised BWC compliance concerns about China, Russia, Iran, and North Korea. Verification remains impossible because the BWC has no formal verification regime.

The Soviet Biopreparat Program: A Case Study


The Soviet Union operated the world’s largest, longest, and most sophisticated biological weapons program—in direct violation of the BWC it had signed in 1972. Understanding this program illuminates both the scale of resources historically required and the ongoing legacy concerns.

Scale and organization:

  • Biopreparat was created in April 1974 as a civilian cover organization
  • Employed 30,000-40,000 personnel across 40-50 research facilities
  • Included five major military-focused research institutes, numerous design facilities, three pilot plants, and five dual-use production plants
  • Annual production capacity for weaponized smallpox alone: 90-100 tons

Agents developed:

  • Weaponized smallpox (continued even after WHO declared eradication)
  • Anthrax (“Strain 836” created as enhanced “battle strain”)
  • Plague, Q fever, tularemia, glanders, Marburg hemorrhagic fever
  • All agents designed for aerosol dispersal via ballistic or cruise missiles

The Sverdlovsk incident (1979): Accidental release of anthrax spores from a military facility killed at least 66 people (true number unknown—KGB destroyed records). The Soviet government blamed contaminated meat until Boris Yeltsin admitted the truth in 1992.

Key defectors who revealed the program:

  • Vladimir Pasechnik (1989): First high-level defector to the UK; his testimony enabled Thatcher and Bush to pressure Gorbachev
  • Ken Alibek (Kanatjan Alibekov, 1992): First deputy director of Biopreparat; created Russia’s first tularemia bomb and enhanced anthrax strains; provided US government with detailed accounting after emigration

Legacy concerns:

  • Some facilities and scientists absorbed into public health institutions
  • US programs attempted to redirect former weapons scientists to peaceful research
  • In late 1997, US expanded efforts after detecting “intensified attempts by Iran and other countries of proliferation concern to acquire biological weapons expertise and materials from former Soviet institutes”

Lesson for AI risk: Even with massive state resources, Biopreparat required decades and thousands of scientists to develop reliable weapons. This suggests the wet-lab barrier is formidable—but also that determined state actors with existing infrastructure could integrate AI assistance more easily than non-state actors starting from scratch.

The historical record of non-state biological attacks reveals consistent technical failures despite significant motivation and resources:

1984 Oregon Salmonella Attack (Rajneeshees)

  • Religious commune deliberately contaminated salad bars with Salmonella typhimurium
  • 751 cases of food poisoning, 45 hospitalizations, no deaths
  • Remains the largest bioterrorist attack in U.S. history
  • Used readily available pathogen requiring no sophisticated technology
  • Key insight: Demonstrated that biological attacks don’t require advanced technology, but also that impact was limited without sophisticated delivery

Aum Shinrikyo (1990s)

  • Japanese cult with $1 billion in assets, hundreds of members, PhD scientists
  • Attempted anthrax, botulinum toxin, and other biological agents—all failed
  • Anthrax sprayer deployed in Tokyo produced no casualties (used vaccine strain by mistake)
  • Eventually succeeded with sarin chemical attack (13 dead, thousands injured)
  • Key insight: Even well-funded, technically sophisticated groups with scientific personnel have failed at biological weapons. The wet-lab barrier is real.

2001 Anthrax Letters (Amerithrax)

  • Letters containing anthrax spores killed 5 people, infected 17 others
  • Perpetrator (Bruce Ivins) was a senior scientist at USAMRIID with decades of anthrax experience and legitimate access to spores
  • Required no acquisition of knowledge—perpetrator was a world expert
  • Key insight: Insider threat, not information access, enabled this attack. AI wouldn’t have helped—the perpetrator already knew everything.

Why has catastrophic bioterrorism not occurred?

| Factor | Explanation |
|---|---|
| Technical difficulty | Synthesis, production, and weaponization require tacit knowledge |
| Pathogen handling | Dangerous to the attacker; requires safety infrastructure |
| Delivery challenges | Aerosol dispersion is technically demanding |
| Attribution risk | Genomic analysis increasingly enables source identification |
| Goal mismatch | Most terrorist groups want publicity, not mass extinction |
| Limited access | Dangerous pathogens are controlled; acquisition is difficult |

This historical record could indicate either genuine difficulty (the barriers are high) or luck (we’ve been fortunate). The precautionary argument is that AI could systematically lower multiple barriers simultaneously, changing the calculus even if each individual barrier remains partially intact.

DNA synthesis companies already screen orders for dangerous sequences, but screening isn’t comprehensive:

| Defense Layer | Coverage | Effectiveness | AI Vulnerability |
|---|---|---|---|
| DNA synthesis screening | Major companies | 40-70% (pre-2024); improving | High (evasion design) |
| BSL facility access control | High containment | High | Low |
| Pathogen inventory tracking | Research labs | Moderate | Low |
| Export controls (equipment) | Dual-use items | Moderate | Low |
| Disease surveillance | Advanced countries | Moderate-High | Moderate |
| Medical countermeasures | Known pathogens | Moderate | Moderate (novel agents) |

DNA Synthesis Screening: The Critical Chokepoint


DNA synthesis screening is considered the key “chokepoint” in the AI-assisted bioweapons pipeline—if dangerous sequences can be intercepted before synthesis, attacks become much harder. However, significant gaps remain:

Current limitations:

  • Participation in the International Gene Synthesis Consortium (IGSC) is voluntary—not all companies are members
  • Regulations are inconsistent between countries
  • Screening relies on matching against databases of known dangerous sequences—novel variants can evade detection
  • High false positive rates require expensive human review
  • Benchtop DNA synthesizers are emerging that could bypass commercial screening entirely

Post-Microsoft patch status: After Microsoft’s research revealed 75%+ evasion rates, a software patch was deployed to synthesis companies worldwide. The fix now catches approximately 97% of threats—but experts warn “the fix is incomplete” and gaps remain.
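To see why database-matching screening is brittle against redesigned sequences, here is a deliberately simplified k-mer matching sketch. The sequences and matching rule are invented for illustration only; real screening tools use homology search, curated threat databases, and human review.

```python
# Toy illustration of exact-match screening missing a recoded variant.
# The "threat" sequence here is an arbitrary made-up 42-nt example and the
# variant is a synonymous recoding of it; real screeners are far more
# sophisticated, but the same evasion pressure applies.

def kmers(seq: str, k: int = 12) -> set[str]:
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def flagged(order: str, threat_db: list[str], k: int = 12, threshold: int = 3) -> bool:
    """Flag an order if it shares at least `threshold` k-mers with any database entry."""
    order_kmers = kmers(order, k)
    return any(len(order_kmers & kmers(entry, k)) >= threshold for entry in threat_db)

database_entry = "ATGGCTAGCAAGGAGCTGTTCACCGGGGTGGTGCCCATCCTG"
recoded_variant = "ATGGCAAGTAAAGAACTCTTTACTGGAGTCGTACCAATTCTT"  # same protein, different DNA

print(flagged(database_entry, [database_entry]))   # True: exact orders are caught
print(flagged(recoded_variant, [database_entry]))  # False: too few matching 12-mers to flag
```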

Policy response: In April 2024, the White House OSTP released a Framework for Nucleic Acid Synthesis Screening, requiring federally funded programs to screen customers and orders, keep records, and report suspicious orders. NIST is partnering with stakeholders to improve screening standards and mitigate AI-specific risks.

SecureDNA: A Swiss foundation providing free, privacy-preserving DNA synthesis screening that already exceeds 2026 regulatory requirements. SecureDNA screens even sequences shorter than the 50 base pair threshold, using a “random adversarial threshold” algorithm designed to be more robust against AI-designed evasion.

Nucleic Acid Observatory (NAO): A collaboration between SecureBio and MIT pioneering pathogen-agnostic early warning through deep metagenomic sequencing. Unlike traditional surveillance that looks for known pathogens, NAO aims to detect new and unknown pathogens through wastewater and pooled nasal swab sampling.

SecureBio’s “Delay, Detect, Defend” strategy: Kevin Esvelt’s SecureBio organization works on multiple defensive layers:

  • Delay: Synthesis screening and access controls
  • Detect: Early warning systems like the NAO
  • Defend: Societal resilience through germicidal UV light, pandemic-proof PPE stockpiles, and rapid countermeasure development

Several emerging technologies could compound AI-enabled biosecurity risks by removing barriers that currently limit attack feasibility:

A new generation of desktop DNA synthesis devices may enable users to print DNA in their own laboratories, potentially bypassing commercial screening entirely.

Current products:

  • Kilobaser: Personal DNA/RNA synthesizer, 27x33x33 cm, produces oligos in 30-50 minutes with 2.5 min/base turnaround
  • DNA Script SYNTAX System: Enzymatic DNA synthesis (water-based, avoiding harsh chemicals), 96 parallel oligos up to 120 nucleotides
  • Evonetix Evaleo: Gene-length DNA synthesis on silicon chips, claiming 10x faster than current technologies
  • BioXp (Telesis Bio): Commercial benchtop synthetic biology workstation automating pipetting, mixing, thermal cycling, purification, and storage

Current limitations:

  • Most benchtop devices limited to sequences under 120 base pairs—insufficient for most dangerous applications
  • Not yet viable alternatives to centralized DNA providers for gene-length sequences
  • Quality control and yield often inferior to commercial synthesis

Biosecurity implications:

  • NTI analysis notes “three converging technological trends—enzymatic synthesis, hardware automation, and increased demand from computational tools—are likely to drive rapid advancement in benchtop capabilities over the next decade”
  • Manufacturers should implement rigorous sequence screening for each fragment produced
  • Governments should provide clear regulations for manufacturers to incorporate screening
  • Once capabilities exceed current limits, benchtop devices could become a significant biosecurity gap

Cloud laboratories are heavily automated, centralized research facilities where scientists run experiments remotely from computers. They present unique biosecurity challenges:

How cloud labs lower barriers:

  • Reduce technical skill requirements by automating complex procedures
  • Enable “one-stop-shop” research that could expand the pool of capable actors
  • Allow experiments to be performed remotely, potentially bypassing ethical constraints in traditional academic settings
  • Researchers retain full control over experimental design without physical presence

Current governance gaps:

  • No public data on cloud lab operations, workflows, customer numbers, or locations worldwide
  • No standardized approaches for customer screening shared between organizations
  • Cybersecurity laws don’t account for unique vulnerabilities of biological data and lab automation systems
  • Biosafety regulations typically neglect digital threats like remote manipulation of synthesis machines

Proposed solutions (RAND):

  • Create a Cloud Lab Security Consortium (CLSC) similar to IGSC for DNA synthesis
  • Minimum security standards: customer screening, controlled substance access, experiment screening, secured networks
  • Human-in-the-loop controls when AI systems place synthesis orders for sequences of concern

Beyond LLMs, specialized biological design tools present distinct risks:

AlphaFold and protein structure prediction:

  • Revolutionary tool for predicting protein structure from genetic sequence (90%+ accuracy)
  • Could enable optimization of existing hazards: increasing toxicity, improving immune evasion, enhancing transmissibility
  • Could potentially enable design of completely novel toxins targeting human proteins
  • Google DeepMind engaged 50+ domain experts in biosecurity assessment for AlphaFold 3
  • Implements experimental refusal mechanisms to block misuse—but biological design often resides in dual-use space

Other BDT concerns:

  • Machine learning for prediction of host range, transmissibility, and virulence
  • Generative models for novel agent design
  • Tools that help design sequences evading DNA screening (as demonstrated in Microsoft research)

Dual-use nature: Unlike LLM guardrails, where harmful requests are often clearly distinguishable, biological design tool queries are frequently dual-use. The same protein optimization that could enhance a therapeutic could theoretically enhance a toxin. This makes technical controls more difficult than for text-based LLMs.

Policy recommendations (UNICRI):

  • Prerelease evaluation requirements for advanced biological models regardless of funding source
  • Prioritize mitigating risks of pathogens capable of causing major epidemics
  • Preserve researcher autonomy while implementing targeted controls on highest-risk capabilities

AI-enabled bioweapons risk exists within a broader context of biosecurity challenges, including ongoing debates about research oversight and international governance gaps.

Gain-of-Function and Enhanced Pandemic Pathogen Research


Gain-of-function (GoF) research—experiments that enhance pathogen transmissibility, virulence, or host range—has become intensely controversial, with implications for AI-biosecurity debates:

Recent policy developments:

Congressional activity:

  • House approved a ban on federal funding for GoF research modifying risky pathogens
  • Scientific groups warn vaguely worded provisions could unintentionally halt flu vaccine development and other beneficial research
  • Risky Research Review Act (S. 854, H.R. 1864) would establish a life sciences research security board

Key limitation: Both the 2014 DURC Policy and 2024 PEPP Policy only apply to government-funded research. Extending coverage to privately funded research would require new regulations or legislation. AI labs developing biological design tools with private funding currently face no equivalent oversight requirements.

Relevance to AI risk: The GoF debate previews challenges AI governance will face:

  • Distinguishing beneficial from dangerous research is difficult
  • Oversight mechanisms are primarily voluntary and apply only to government-funded work
  • International coordination is lacking
  • Technical definitions (“gain of function,” “enhanced pandemic potential”) are contested

The Biological Weapons Convention: Structural Weaknesses


The Biological Weapons Convention (BWC), signed in 1972, prohibits development, production, and stockpiling of biological weapons. It has 187 states parties—but significant structural weaknesses:

No verification regime:

  • Unlike chemical and nuclear weapons agreements, the BWC contains no formal verification provisions
  • Attempts to develop a verification protocol failed in 2001 after years of negotiation
  • Governments have not discussed verification within the treaty framework for over two decades

Minimal institutional support:

  • The BWC has only four staff members
  • Budget is smaller than that of an average McDonald’s restaurant (per Toby Ord)
  • Compare to: IAEA has 2,500+ staff; OPCW has 500+ staff

Recent developments:

  • December 2022: States Parties established a Working Group on strengthening the Convention
  • 2024: Fourth and fifth Working Group sessions held (August, December)
  • December 2024: Fifth session “ended with a regrettable conclusion in which a single States Party undermined the noteworthy progress achieved”—setback reported by Council on Strategic Risks
  • Working Group has only seven days through end of 2025 for verification discussion

Practical limitations:

  • No politically palatable, technologically feasible, and financially sustainable system can guarantee detection of all biological weapons
  • Rapid advances in biotechnology create new verification challenges
  • AI capabilities could make verification even more difficult by enabling novel agent design

What’s possible: While perfect verification is unachievable, the Bulletin of the Atomic Scientists argues that “measures in combination could generate considerably greater confidence in compliance by BWC states parties.”


Defensive Technologies and Pandemic Preparedness


The same technological advances that could enable attacks also offer powerful defensive capabilities. Many experts believe defense will ultimately win the offense-defense balance—the question is whether we’re in a dangerous transition period.

The COVID-19 pandemic demonstrated the transformative potential of mRNA vaccines for rapid response:

Speed advantages:

  • Traditional vaccines require time-consuming manufacturing with live pathogens
  • mRNA vaccines can be designed in days once genetic sequence is known
  • COVID-19 mRNA vaccines received FDA EUA in under one year—unprecedented speed
  • CEPI’s “100 Days Mission” aims to develop safe, effective vaccines against new threats in just 100 days

Manufacturing advantages:

  • Cell-free manufacture enables accelerated, scalable production
  • Standardizable processes require minimal facility adaptations between products
  • Smaller manufacturing footprints than traditional vaccines
  • Same facility can produce multiple vaccine products

Safety profile:

  • mRNA does not enter cell nucleus—cannot integrate into cellular genome
  • Can be administered repeatedly (no anti-vector immunity like with viral vectors)
  • Avoids live pathogen handling in manufacturing

Pandemic preparedness implications:

  • Platform is “pathogen-agnostic”—same technology works against any target with known sequence
  • BARDA and CEPI supporting development of 50+ vaccine candidates against high-risk pathogens
  • Next-generation “trans-amplifying” mRNA vaccines under development could provide stronger immune responses

For AI-bioweapons specifically: Rapid vaccine development could limit the damage from engineered pathogens if they’re detected early. However, novel agents designed to evade detection or existing countermeasures would still pose severe risks during the response window.

Traditional disease surveillance looks for known pathogens. Metagenomic sequencing offers pathogen-agnostic detection:

How it works:

  • Deep sequencing of all genetic material in samples (wastewater, nasal swabs, etc.)
  • Computational analysis identifies viral, bacterial, and other sequences
  • Can detect novel or unexpected pathogens that wouldn’t be caught by targeted testing

Current research:

  • Nucleic Acid Observatory: Sequencing wastewater from major US airports and treatment plants
  • Recent dataset: 13.1 terabases from 20 wastewater samples at LA Hyperion plant (serving 4 million residents)
  • Lancet Microbe publication establishing sensitivity models for W-MGS detection

Sensitivity and cost tradeoffs:

  • Untargeted shotgun sequencing less sensitive than targeted methods for known pathogens
  • Hybridization capture panels greatly increase sensitivity for viruses in the panel but may reduce sensitivity to unknown pathogens
  • Large variation in viral detection based on sewershed hydrology and laboratory protocols
  • Sensitivity of 1 infected person among 257-2,250 for certain bacterial pathogens

For AI-bioweapons specifically: Metagenomic surveillance could provide early warning for engineered pathogens that evade targeted detection. However, sensitivity limits mean outbreaks may need to reach significant scale before detection occurs.
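A rough back-of-envelope calculation illustrates what those sensitivity limits mean for warning time. It assumes the 1-in-257 to 1-in-2,250 threshold quoted above (reported for certain bacterial pathogens), simple exponential spread from a single case, and an assumed doubling time; all parameters are illustrative.

```python
# Back-of-envelope: days until a growing outbreak crosses the wastewater
# detection threshold. All parameters below are assumptions for illustration.
import math

sewershed_population = 4_000_000   # e.g. the LA Hyperion service area noted above
doubling_time_days = 3.0           # assumed epidemic doubling time

for people_per_detectable_case in (257, 2_250):  # sensitivity range quoted above
    infections_needed = sewershed_population / people_per_detectable_case
    days = doubling_time_days * math.log2(infections_needed)  # doublings from 1 case
    print(f"1 in {people_per_detectable_case}: ~{infections_needed:,.0f} infections, ~{days:.0f} days")
# 1 in 257   -> ~15,600 infections, ~42 days
# 1 in 2,250 -> ~1,800 infections,  ~32 days
# Even with good sensitivity, thousands of infections may accrue before detection.
```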

Far-UVC (200-235 nm wavelength) is emerging as a potentially transformative technology for airborne pathogen inactivation in occupied spaces:

Why it’s different from conventional UV:

  • Conventional germicidal UV-C (254 nm) harms human skin and eyes—limited to upper-room use or unoccupied spaces
  • Far-UVC (typically 222 nm) is absorbed in the outer dead layer of skin and tear layer of eyes—cannot penetrate to living tissue
  • Enables direct disinfection of breathing zone while people are present

Efficacy:

  • Very low dose (2 mJ/cm²) of 222-nm light inactivates >95% of airborne H1N1 virus
  • Single far-UVC fixture delivers 33-66 equivalent air changes per hour for pathogen removal
  • Tested effective against tuberculosis, SARS-CoV-2, influenza, murine norovirus (99.8% reduction)
  • 2025 review: “high ability” to kill pathogens with “high level of safety”

Applications for pandemic preparedness:

  • Installation in hospitals, schools, airports, public transit could dramatically reduce airborne transmission
  • Blueprint Biosecurity funding research teams to evaluate deployment in real-world scenarios
  • Open Philanthropy issued RFI on far-UVC evaluation
  • NIST collaborating with industry on standards development

Remaining questions:

  • Long-term exposure effects require further research
  • Real-world efficacy in varied building environments not fully characterized
  • Cost and feasibility of widespread deployment

For AI-bioweapons specifically: Far-UVC could provide a layer of defense against aerosol-dispersed biological agents in public spaces. Even if attackers successfully synthesize and deploy pathogens, widespread far-UVC installation could limit transmission and buy time for medical response.
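The equivalent air changes per hour (eACH) figure above translates into clearance rates via the standard well-mixed-room model, in which airborne pathogen concentration decays exponentially at the combined rate of ventilation plus inactivation. A minimal sketch follows; the room ventilation rate and exposure time are illustrative assumptions, and the 40 eACH value is simply a mid-range pick from the 33-66 span cited above.

```python
import math

def remaining_fraction(hours, ventilation_ach, far_uvc_each=0.0):
    """Fraction of an initial airborne pathogen load still suspended after `hours`
    in a well-mixed room with no ongoing source (first-order decay model)."""
    total_removal_rate = ventilation_ach + far_uvc_each  # combined, per hour
    return math.exp(-total_removal_rate * hours)

# After 15 minutes in a room with 3 ACH of mechanical ventilation:
print(remaining_fraction(0.25, ventilation_ach=3))                   # ~0.47 remains
print(remaining_fraction(0.25, ventilation_ach=3, far_uvc_each=40))  # ~2e-5 remains
```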


Refusals and filtering — Training models to refuse bioweapon-related requests and filtering dangerous outputs. These measures are imperfect: models can be jailbroken or fine-tuned to remove safeguards, and open-source models may lack restrictions entirely.

Effectiveness assessment:

  • Reduces casual misuse
  • Raises barrier for unsophisticated actors
  • Does not prevent determined actors with technical skills
  • Cannot address open-source model proliferation

Evaluations before deployment — Testing models for biosecurity risks during development, as part of responsible scaling policies. Useful but relies on labs’ good faith and competence.

Compute governance — Limiting who can train powerful models reduces the availability of capable models to bad actors. Information security around model weights becomes important if models can provide meaningful uplift.

Biological capability thresholds — Anthropic’s RSP and similar frameworks establish biological capability as a key threshold for enhanced safety measures. This creates systematic evaluation requirements.

Open-source restrictions — Limiting the release of model weights for systems with significant biological knowledge. Controversial due to benefits of open research.

Broader biosecurity measures may matter more than AI-specific interventions:

| Intervention | Cost | Risk Reduction | Priority |
| --- | --- | --- | --- |
| DNA synthesis screening | ~$100M/year | 5-15% | High |
| Metagenomic surveillance | ~$500M/year | 15-25% | Very High |
| BSL facility security | ~$200M/year | 5-10% | High |
| Pandemic response stockpiles | ~$2B/year | 10-20% | Medium-High |
| International verification | ~$300M/year | 3-8% | Medium |
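A quick way to compare these line items is cost per percentage point of risk reduction. The sketch below just divides each annual cost by the midpoint of its risk-reduction range from the table above; it adds no information beyond that arithmetic, and the results carry the same uncertainty as the underlying estimates.

```python
# Annual cost (USD) and risk-reduction range (%) as listed in the table above.
interventions = {
    "DNA synthesis screening":      (100e6, (5, 15)),
    "Metagenomic surveillance":     (500e6, (15, 25)),
    "BSL facility security":        (200e6, (5, 10)),
    "Pandemic response stockpiles": (2e9,   (10, 20)),
    "International verification":   (300e6, (3, 8)),
}

for name, (annual_cost, (low, high)) in interventions.items():
    midpoint = (low + high) / 2              # percentage points of risk reduction
    cost_per_point = annual_cost / midpoint  # USD per percentage point, per year
    print(f"{name}: ~${cost_per_point / 1e6:.0f}M per point of risk reduction")
```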

DNA synthesis screening — Flagging dangerous sequences before synthesis. The primary defense but has significant gaps that AI can exploit.

Laboratory access controls — Restricting who can work with dangerous pathogens. Effective for legitimate facilities; doesn’t address improvised labs.

Disease surveillance — Early detection of outbreaks. Benefits from AI advances and may be where AI provides greatest defensive value.

Medical countermeasures — Rapid vaccine and treatment development. mRNA platforms demonstrated during COVID-19 show how quickly responses can be developed.


| Date | Event |
| --- | --- |
| 1972 | Biological Weapons Convention signed (now 187 states parties) |
| 1984 | Rajneeshee salmonella attack sickens 751 people, the largest bioterrorist attack in US history |
| 1995 | Aum Shinrikyo attempts bioweapons (anthrax, botulinum), fails; uses sarin instead |
| 2001 | Anthrax letters kill 5, infect 17; perpetrator was an insider with legitimate access |
| 2020 | Toby Ord publishes The Precipice, estimating 1/30 existential risk from engineered pandemics |
| 2020-21 | COVID-19 demonstrates pandemic potential; exposes biosecurity gaps |
| 2022 | Collaborations Pharmaceuticals researchers show an AI drug-discovery model can design thousands of novel toxic molecules in hours |
| 2023 (Jul) | Dario Amodei warns of “substantial risk” AI will enable bioattacks within 2-3 years |
| 2023 (Oct) | Executive Order 14110 directs National Academies to study AI biosecurity |
| 2023 (Nov) | Gryphon Scientific red-team finds Claude provides “post-doc level” biological knowledge |
| 2024 (Jan) | RAND red-team study finds no significant AI uplift for bioweapon planning |
| 2024 (Apr) | White House OSTP releases Framework for Nucleic Acid Synthesis Screening |
| 2024 (May) | Microsoft research reveals 75%+ of AI-designed toxins evade DNA screening |
| 2024 (Aug) | CNAS publishes report on AI and biological national security risks |
| 2024 (Aug) | US AI Safety Institute signs agreements with Anthropic and OpenAI for biosecurity evaluation |
| 2024 (Nov) | US/UK AI Safety Institutes conduct first joint model evaluation (Claude 3.5 Sonnet) |
| 2024 (Dec) | Anthropic RSP includes 10+ biological capability evaluations per model |
| 2025 (Jan) | Anthropic sends letter to White House citing “alarming improvements” in Claude 3.7 Sonnet |
| 2025 (Feb) | Anthropic CEO reports DeepSeek was “the worst” model tested for biosecurity safeguards |
| 2025 (Mar) | National Academies publishes “The Age of AI in the Life Sciences” report |
| 2025 (Apr) | OpenAI’s o3 model ranks 94th percentile among expert virologists on capability test |
| 2025 (May) | Anthropic activates ASL-3 protections for Claude Opus 4 due to CBRN concerns |
| 2025 (Jun) | OpenAI announces next-gen models will hit “high-risk” biological classification |
| 2025 (Jul) | OpenAI hosts biodefense summit with government researchers and NGOs |
| 2025 (Jul) | Trump administration’s AI Action Plan identifies biosecurity as priority |
| 2025 (Aug) | CSIS publishes “Opportunities to Strengthen U.S. Biosecurity from AI-Enabled Bioterrorism” |
| 2025 (Oct) | Microsoft publishes Science paper; screening patch deployed globally (97% effective) |

Expert opinion on AI-bioweapons risk is divided, with prominent voices on both sides:

Kevin Esvelt (MIT): One of the most vocal experts on AI-biosecurity risks. Esvelt emphasizes that if you ask a chatbot how to cause a pandemic, “it will suggest the 1918 influenza virus. It will even tell you where to find the gene sequences online and where to purchase the genetic components.” He co-founded SecureDNA and SecureBio to address these risks.

Dario Amodei (Anthropic CEO): In July 2023, stated there was a “substantial risk” that within 2-3 years, AI would “greatly widen the range of actors with the technical capability to conduct a large-scale biological attack.” In February 2025, reported that DeepSeek was “the worst” model tested for biosecurity, generating information “that can’t be found on Google or easily found in textbooks.”

Johannes Heidecke (OpenAI Head of Safety Systems): In June 2025, announced OpenAI expects upcoming models to hit “high-risk classification” for biological capabilities. Emphasized that “99% or even one in 100,000 performance is [not] sufficient” for testing accuracy.

Rocco Casagrande (Gryphon Scientific): After red-teaming Claude, said he was “personally surprised and dismayed by how capable current LLMs were” and that “these things are developing extremely, extremely fast.”

Toby Ord (Oxford): Estimates engineered pandemic risk at 1 in 30 by 2100—second highest anthropogenic existential risk after AI itself.

Georgia Adamson and Gregory C. Allen (CSIS): Their August 2025 report warns current U.S. biosecurity measures are “ill-equipped” to meet AI-enabled challenges, with biological design tool (BDT) safeguards “already circumventable post-deployment.”

Bill Drexel and Caleb Withers (CNAS): Their August 2024 report warns AI could enable “catastrophic threats far exceeding the impact of COVID-19.”

RAND researchers (Mouton, Lucas, Guest): Their 2024 study found “no statistically significant difference” between AI-assisted and non-AI groups in bioweapon planning capability. This is the strongest empirical evidence against immediate AI uplift concerns.

Some biosecurity practitioners: Emphasize that the wet lab bottleneck—tacit knowledge, equipment access, technique—remains the primary barrier, and AI cannot transfer hands-on skills.

Information abundance argument: Dangerous information is already accessible through scientific literature and the internet. AI may provide convenience but not fundamentally new capabilities.

The debate often reduces to different assessments of:

| Question | Higher Concern View | Lower Concern View |
| --- | --- | --- |
| Current uplift | 2025 lab evaluations show expert-level capabilities | RAND 2024 study is most rigorous empirical evidence |
| Future trajectory | OpenAI/Anthropic expect “high-risk” soon | May plateau; defenses improving |
| Key bottleneck | Knowledge gap narrowing fast | Wet lab skills remain rate-limiting |
| Guardrail effectiveness | DeepSeek shows open-source gaps | Frontier labs implementing robust safeguards |
| Risk tolerance | ASL-3 activation signals real concern | Base rates suggest low probability |

2025 shift: The debate has evolved significantly. Both major frontier labs now officially acknowledge their next-generation models pose elevated biological risks. The question is shifting from “does AI provide uplift?” to “how much uplift, and can mitigations keep pace?”

Notably: Even those who downplay current uplift often acknowledge that future models may pose greater risks, and that defensive investments are worthwhile regardless.



Analytical Models

The following analytical models provide structured frameworks for understanding this risk:

| Model | Type | Nov | Rig | Act | Cmp |
| --- | --- | --- | --- | --- | --- |
| Bioweapons Attack Chain Model | Probability Decomposition | 3 | 4 | 5 | 4 |
| AI Uplift Assessment Model | Comparative Analysis | 4 | 4 | 4 | 5 |
| AI-Bioweapons Timeline Model | Timeline Projection | 4 | 4 | 5 | 4 |

Bioweapons Attack Chain Model: Decomposes bioweapons attacks into seven sequential steps with independent failure modes. DNA synthesis screening offers 5-15% risk reduction for $7-20M, with estimates carrying 2-5x uncertainty at each step.

AI Uplift Assessment Model: Estimates AI’s marginal contribution to bioweapons risk over time, projecting uplift increasing from 1.3-2.5x (2024) to 3-5x by 2030, with biosecurity evasion capabilities posing the greatest concern because they could undermine existing defenses before triggering a policy response.

AI-Bioweapons Timeline Model: Projects when AI crosses capability thresholds for bioweapons, estimating that the knowledge-democratization threshold has already been crossed, synthesis assistance arrives 2027-2032, and novel agent design emerges 2030-2040.
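For intuition about how a probability-decomposition model of this kind works: compound attack probability is the product of per-step success probabilities, and AI uplift can be represented as multipliers on the steps it affects. The step names, probabilities, and multipliers below are placeholders for illustration, not the actual parameters of the models listed above.

```python
# Placeholder per-step success probabilities (illustrative only).
baseline_steps = {
    "form intent":         0.9,
    "acquire knowledge":   0.3,
    "obtain materials":    0.2,
    "synthesize agent":    0.1,
    "scale production":    0.2,
    "evade detection":     0.5,
    "deliver effectively": 0.3,
}

def compound_probability(steps, uplift=None):
    """Product of per-step probabilities, with optional AI-uplift multipliers
    (dict of step name -> multiplier); each uplifted step is capped at 1.0."""
    p = 1.0
    for step, prob in steps.items():
        multiplier = (uplift or {}).get(step, 1.0)
        p *= min(prob * multiplier, 1.0)
    return p

print(compound_probability(baseline_steps))
# Hypothetical uplift concentrated on knowledge- and design-related steps:
print(compound_probability(baseline_steps,
                           uplift={"acquire knowledge": 2.0, "synthesize agent": 1.5}))
```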

Bioweapons risk affects the AI Transition Model primarily through Misuse Potential:

| Parameter | Impact |
| --- | --- |
| Biological Threat Exposure | Direct parameter—AI uplift for bioweapon development |
| AI Control Concentration | Powerful AI in few hands increases misuse risk |

The bioweapons pathway can lead to Human-Caused Catastrophe—catastrophic outcomes from humans misusing AI capabilities, distinct from AI misalignment.