AI Safety Institutes (AISIs)
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Tractability | Medium | UK AISI grew from 0 to 100+ staff in 18 months; US AISI reached 280+ consortium members |
| Effectiveness | Uncertain | Completed joint pre-deployment evaluations of Claude 3.5 Sonnet and OpenAI o1, but advisory-only authority limits impact |
| Scale Match | Low | Institutes have dozens-to-hundreds of staff vs. thousands at frontier labs; $10M-$66M budgets vs. billions in lab investment |
| Independence | Medium-Low | Voluntary access agreements create dependency; regulatory capture concerns documented in academic literature |
| International Coordination | Growing | 11-nation network established May 2024; first San Francisco meeting November 2024 |
| Political Durability | Uncertain | UK renamed to “AI Security Institute” (Feb 2025); US renamed to “Center for AI Standards and Innovation” (June 2025) |
| Timeline Relevance | Moderate | Evaluation cycles of weeks-to-months may lag deployment decisions as AI development accelerates |
Overview
AI Safety Institutes (AISIs) represent a fundamental shift in how governments approach AI oversight, establishing dedicated technical institutions to evaluate advanced AI systems, conduct safety research, and inform policy decisions. These government-affiliated organizations emerged as a response to the widening gap between rapidly advancing AI capabilities and regulatory capacity, aiming to build in-house technical expertise that can meaningfully assess frontier AI systems.
The AISI model gained momentum following the November 2023 Bletchley Park AI Safety Summit, where the UK announced the first major institute. Within months, the United States established its own institute, followed by Japan and Singapore, with over a dozen additional countries announcing plans or expressing interest. This rapid international adoption reflects a growing consensus that traditional regulatory approaches are inadequate for governing transformative AI technologies.
At their core, AISIs address a critical information asymmetry problem. AI labs possess deep technical knowledge about their systems’ capabilities and limitations, while government regulators often lack the specialized expertise to independently assess these claims. AISIs attempt to bridge this gap by recruiting top AI talent, securing pre-deployment access to frontier models, and developing rigorous evaluation methodologies. However, their effectiveness remains constrained by structural limitations around independence, enforcement authority, and resource constraints relative to the labs they oversee.
Why They Exist
Traditional regulatory frameworks face fundamental challenges when applied to advanced AI systems. Regulatory agencies typically rely on industry self-reporting, external consultants, or academic research to understand new technologies. For AI, this approach proves inadequate due to several factors: the extreme technical complexity of modern AI systems requires deep machine learning expertise to properly evaluate; capabilities evolve on timescales of months rather than years, far faster than traditional policy development cycles; meaningful safety assessment requires direct access to model weights, training processes, and internal evaluations that labs consider proprietary; and the potential risks from advanced AI systems—from bioweapons assistance to autonomous cyber operations—demand urgent, technically-informed oversight.
AISIs emerged as an institutional innovation designed to address these challenges. By housing technical experts within government structures, they aim to develop independent evaluation capabilities, establish ongoing relationships with AI labs to secure model access, create standardized methodologies for assessing AI risks and capabilities, and translate technical findings into policy recommendations that can inform regulatory decisions.
The model reflects lessons learned from other high-stakes technical domains. Nuclear safety regulation succeeded partly because agencies like the Nuclear Regulatory Commission developed deep in-house technical expertise. Similarly, financial regulation became more effective when agencies hired quantitative experts who could understand complex derivatives and trading strategies. AISIs represent an attempt to apply this pattern to AI governance.
Current Assessment
AISIs show significant promise as governance infrastructure but face critical limitations that may constrain their long-term effectiveness. On the positive side, they have demonstrated rapid institutional development, with the UK institute growing from concept to 50+ staff within a year. They have secured meaningful access to frontier models from major labs including OpenAI, Anthropic, Google DeepMind, and Meta—a significant achievement given these companies’ general reluctance to share proprietary information. The institutes have begun developing sophisticated evaluation frameworks and have established international coordination mechanisms that could scale globally.
However, several structural challenges raise questions about their ultimate impact. Most AISIs operate in advisory roles without enforcement authority, making their influence dependent on voluntary industry cooperation rather than regulatory power. They remain dramatically smaller than the labs they oversee, with dozens of staff evaluating systems developed by teams of thousands. Their independence faces pressure from both industry relationships and political oversight, potentially compromising their ability to deliver critical assessments. Perhaps most fundamentally, the timeline mismatch between evaluation cycles and deployment decisions may render their work strategically irrelevant if labs continue to advance capabilities faster than evaluators can assess them.
Risks Addressed
| Risk Category | How AISIs Address It | Mechanism | Effectiveness |
|---|---|---|---|
| Bioweapons | Pre-deployment evaluation of biological knowledge capabilities | Testing for synthesis planning, pathogen design assistance | Medium - evaluations completed but advisory-only |
| Cyberweapons | Testing for offensive cyber capabilities | Vulnerability discovery and exploitation assessment | Medium - TRAINS taskforce focuses on national security |
| Racing dynamics | Providing independent capability assessment | Creates incentive for labs to demonstrate safety | Low - no enforcement to slow deployment |
| Deceptive alignment | Safeguard efficacy testing | Red-teaming for jailbreaks and refusal consistency | Uncertain - detection methods still developing |
| Misuse by malicious actors | Informing policy on model access controls | Capability evaluation informs release decisions | Medium - depends on lab cooperation |
The Global Landscape
Institute Comparison
| Institute | Est. Date | Staff Size | Annual Budget | Key Focus | Pre-deployment Access |
|---|---|---|---|---|---|
| UK AISI↗ | Nov 2023 | 100+ technical staff | $66M (plus $1.5B compute access) | Model evaluation, Inspect framework | OpenAI, Anthropic, Google DeepMind, Meta |
| US AISI↗ | Feb 2024 | 280+ consortium members | $10M initial | Standards, national security testing | OpenAI, Anthropic (MOUs signed Aug 2024) |
| Japan AISI↗ | Feb 2024 | Cross-agency structure | Undisclosed | Evaluation methodology | Coordination with NIST |
| Singapore | Planned 2024 | TBD | TBD | Southeast Asia coordination | TBD |
| EU/France/Germany | In development | TBD | TBD | EU-wide coordination | TBD |
United Kingdom AI Safety Institute
The most developed AISI globally, with 100+ staff and pre-deployment access to major frontier models. See UK AI Safety Institute for full details.
United States AI Safety Institute
NIST-based institute with 280+ consortium members and MOUs with OpenAI and Anthropic. See US AI Safety Institute for full details.
International Network Development
Beyond the UK and US institutes, the AISI model is spreading internationally. Japan established its AI Safety Institute↗ in February 2024 as a cross-government effort involving the Cabinet Office, the Ministry of Economy, Trade and Industry, and multiple research institutions, with Director Akiko Murakami leading evaluation methodology development. Singapore announced plans for its own institute to serve as a hub for AI development in Southeast Asia.
At the May 2024 Seoul AI Safety Summit↗, world leaders from Australia, Canada, the EU, France, Germany, Italy, Japan, Korea, Singapore, the UK, and the US signed the Seoul Statement of Intent↗, establishing the International Network of AI Safety Institutes. U.S. Secretary of Commerce Gina Raimondo formally launched the network, which aims to “accelerate the advancement of the science of AI safety” through coordinated research, resource sharing, and co-developing AI model evaluations.
The network held its first in-person meeting↗ on November 20-21, 2024 in San Francisco, bringing together technical AI experts from nine countries and the European Union. Participating institutes agreed to pursue complementarity and interoperability, develop best practices, and exchange evaluation methodologies.
However, international coordination faces significant challenges. Different countries have varying national security concerns, regulatory approaches, and relationships with AI labs. The CSIS analysis↗ notes that the network “remains heavily weighted toward higher-income countries in the West, limiting its impact.” Information sharing is constrained by classification requirements and competitive concerns, and the effectiveness of coordination depends on sustained political commitment that may be vulnerable to leadership changes (as seen in US rebranding).
Operational Methodology
Evaluation Frameworks
AISIs have developed methodologies for evaluating AI systems across multiple dimensions of safety and capability. The joint UK-US evaluation of Claude 3.5 Sonnet and OpenAI o1 tested models across four domains, providing a template for pre-deployment assessment:
| Evaluation Domain | What It Tests | Key Benchmarks Used | Findings from Joint Evaluations |
|---|---|---|---|
| Biological capabilities | Assistance with pathogen design, synthesis planning | Custom biosecurity scenarios | Models compared against reference baselines |
| Cyber capabilities | Offensive security assistance, vulnerability exploitation | HarmBench↗ framework | Tested autonomous operation in security contexts |
| Software/AI development | Autonomous coding, recursive improvement potential | Agentic coding tasks | Assessed scaffolding and tool use capabilities |
| Safeguard efficacy | Jailbreak resistance, refusal consistency | Red-teaming with diverse prompts | Measured safeguard robustness across attack vectors |
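To make the structure of such an assessment concrete, the sketch below shows one way a multi-domain evaluation harness could be organized in code. It is a minimal illustration assuming a generic text-in/text-out model interface; the names (`EvalDomain`, `refusal_scorer`, `evaluate`) are invented for this example, and real institute evaluations rely on large, mostly non-public test suites and trained graders rather than keyword matching.

```python
# Illustrative sketch of a multi-domain pre-deployment evaluation harness.
# All names and prompts are hypothetical; this is not any institute's tooling.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalDomain:
    name: str
    prompts: list[str]
    # Returns True when the model response counts as "safe" for this domain
    # (e.g., it refuses a harmful request).
    is_safe: Callable[[str, str], bool]

def refusal_scorer(prompt: str, response: str) -> bool:
    # Crude keyword-based refusal check; real evaluations use trained
    # classifiers and human review rather than string matching.
    markers = ("i can't help", "i cannot help", "i won't assist")
    return any(m in response.lower() for m in markers)

def evaluate(model: Callable[[str], str], domains: list[EvalDomain]) -> dict[str, float]:
    """Return the fraction of safe responses per evaluation domain."""
    results = {}
    for domain in domains:
        safe = sum(domain.is_safe(p, model(p)) for p in domain.prompts)
        results[domain.name] = safe / len(domain.prompts)
    return results

if __name__ == "__main__":
    # Stand-in model so the sketch runs offline; swap in a real API call.
    def dummy_model(prompt: str) -> str:
        return "I can't help with that request."

    domains = [
        EvalDomain(
            name="safeguard_efficacy",
            prompts=["<red-team prompt 1>", "<red-team prompt 2>"],
            is_safe=refusal_scorer,
        ),
    ]
    print(evaluate(dummy_model, domains))  # {'safeguard_efficacy': 1.0}
```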
The 2024 FLI AI Safety Index↗ convened seven independent experts to evaluate six leading AI companies. The review found that “although there is a lot of activity at AI companies that goes under the heading of ‘safety,’ it is not yet very effective.” Anthropic received recognition for allowing third-party pre-deployment evaluations by the UK and US AI Safety Institutes, setting a benchmark for industry best practices.
Key benchmarks developed for dangerous capability assessment include the Weapons of Mass Destruction Proxy Benchmark (WMDP)↗, a dataset of 3,668 multiple-choice questions measuring hazardous knowledge in biosecurity, cybersecurity, and chemical security. Stanford’s AIR-Bench 2024 provides 5,694 tests spanning 314 granular risk categories aligned with government regulations.
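For multiple-choice benchmarks such as WMDP, the underlying measurement is simple accuracy over a question bank, with the caveat that for dangerous-capability proxies a higher score signals more hazardous knowledge rather than better performance. The sketch below assumes a common item format (question text, a list of choices, an integer answer key); the helper names and format are illustrative and not drawn from the benchmark's official tooling.

```python
# Sketch of scoring a model on a WMDP-style multiple-choice question bank.
# The item format (question, choices, integer answer index) is an assumption
# made for this example, not the benchmark's canonical schema.
from typing import Callable

LETTERS = "ABCD"

def format_item(question: str, choices: list[str]) -> str:
    options = "\n".join(f"{LETTERS[i]}. {c}" for i, c in enumerate(choices))
    return f"{question}\n{options}\nAnswer with a single letter."

def accuracy(model: Callable[[str], str], items: list[dict]) -> float:
    """Fraction of items where the model's chosen letter matches the key."""
    correct = 0
    for item in items:
        reply = model(format_item(item["question"], item["choices"])).strip().upper()
        predicted = next((ch for ch in reply if ch in LETTERS), None)
        correct += predicted == LETTERS[item["answer"]]
    return correct / len(items)

if __name__ == "__main__":
    items = [{"question": "<benchmark question>",
              "choices": ["option w", "option x", "option y", "option z"],
              "answer": 2}]
    print(accuracy(lambda prompt: "C", items))  # stand-in model answering "C" -> 1.0
```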
Capability assessment presents particular challenges because it requires evaluators to anticipate potentially novel abilities before they manifest. The FLI analysis↗ notes that “naive elicitation strategies cause significant underreporting of risk profiles, potentially missing dangerous capabilities that sophisticated actors could unlock.” State-of-the-art elicitation techniques—adapting test-time compute, scaffolding, tools, and fine-tuning—are essential but resource-intensive.
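The sketch below illustrates the simplest of these techniques, spending extra test-time compute by sampling several attempts per task and keeping the best one, so that a single refusal or low-effort answer does not mask an underlying capability. The `sample` and `score` callables are hypothetical stand-ins for a model API and a task grader; real elicitation pipelines also add scaffolding, tool use, and sometimes fine-tuning, which is what makes thorough elicitation resource-intensive.

```python
# Minimal sketch of capability elicitation via best-of-n sampling.
# `sample` and `score` are hypothetical stand-ins for a model API and a grader.
import random
from typing import Callable

def best_of_n(sample: Callable[[str], str],
              score: Callable[[str], float],
              prompt: str,
              n: int = 8) -> tuple[str, float]:
    """Return the highest-scoring of n independent attempts at one task."""
    attempts = [sample(prompt) for _ in range(n)]
    best = max(attempts, key=score)
    return best, score(best)

if __name__ == "__main__":
    random.seed(0)
    # Stand-ins: a "model" that only sometimes answers well, and a grader.
    sample = lambda prompt: random.choice(["weak answer", "strong answer"])
    score = lambda response: 1.0 if response == "strong answer" else 0.2
    print(best_of_n(sample, score, "<agentic task description>"))
```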
Technical Infrastructure
The development of standardized evaluation tools represents a crucial aspect of AISI work. The UK institute’s Inspect framework exemplifies this approach, providing a modular system that supports multiple model APIs, enables reproducible evaluation protocols, facilitates comparison across different models and time periods, and allows community contribution to evaluation development.
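The sketch below illustrates the general design pattern behind such tooling, a provider-agnostic model interface paired with a pinned, serializable run configuration so that results can be reproduced and compared across models and over time. It is not Inspect's actual API; the class and field names are invented for illustration.

```python
# Illustration of the design pattern behind modular evaluation frameworks:
# a provider-agnostic backend interface plus a serializable run configuration.
# Not Inspect's API; all names here are invented.
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, asdict

class ModelBackend(ABC):
    """Common interface so one evaluation can target different model APIs."""
    @abstractmethod
    def complete(self, prompt: str, temperature: float) -> str: ...

class EchoBackend(ModelBackend):
    # Trivial offline backend so the sketch runs; real backends would wrap
    # hosted APIs or local inference servers.
    def complete(self, prompt: str, temperature: float) -> str:
        return f"[echo] {prompt[:40]}"

@dataclass(frozen=True)
class EvalConfig:
    eval_name: str
    dataset_version: str
    model_id: str
    temperature: float
    seed: int

def run(config: EvalConfig, backend: ModelBackend, prompts: list[str]) -> dict:
    responses = [backend.complete(p, config.temperature) for p in prompts]
    # Persisting the config alongside outputs is what makes a run comparable
    # across models and across time.
    return {"config": asdict(config), "responses": responses}

if __name__ == "__main__":
    cfg = EvalConfig("cyber_eval_demo", "2024-11", "echo-model", 0.0, 1234)
    print(json.dumps(run(cfg, EchoBackend(), ["<evaluation prompt>"]), indent=2))
```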
These technical infrastructures must balance several competing requirements. They need sufficient sophistication to detect subtle but dangerous capabilities while remaining accessible to researchers without specialized infrastructure. They must provide consistent results across different computing environments while adapting to rapidly evolving model architectures and capabilities.
The open-source approach adopted by several institutes reflects a strategic decision that community development can advance evaluation capabilities faster than any single institution. However, this openness also means that AI labs can optimize their systems against known evaluation methodologies, potentially undermining the validity of assessments.
Access Negotiations
Securing meaningful access to frontier AI systems represents perhaps the most critical and challenging aspect of AISI operations. Labs are understandably reluctant to share proprietary information about their most advanced systems, both for competitive reasons and because such information could enable competitors or malicious actors to develop similar capabilities.
Successful access negotiations typically involve careful balance of several factors: providing labs with valuable feedback or evaluation services in exchange for access, establishing clear confidentiality protocols that protect proprietary information, demonstrating technical competence and responsible handling of sensitive information, and maintaining relationships that incentivize continued cooperation rather than treating labs as adversaries.
The voluntary nature of current access agreements represents both an opportunity and a fundamental limitation. Labs cooperate because they perceive value in independent evaluation or because they want to maintain positive relationships with government institutions. However, this voluntary approach means that access could be withdrawn if labs conclude that cooperation is no longer in their interest.
Critical Limitations and Challenges
The Independence Dilemma
AISIs face an inherent tension between the need for industry cooperation and the requirement for independent oversight. A 2025 analysis in AI & Society↗ warns that “the field of AI safety is extremely vulnerable to regulatory capture” and that “those who advocate for regulation as a response to AI risks may be inadvertently playing into the hands of the dominant firms in the industry.”
The TechPolicy.Press analysis↗ notes a major set of concerns “has to do with their relationship to industry, particularly around fears that close ties with companies might lead to ‘regulatory capture,’ undermining the impartiality and independence of these institutes.” This is particularly challenging because AISIs need good relationships with AI companies to access and evaluate models in the first place.
Industry influence can manifest through several channels:
| Capture Mechanism | How It Operates | Observed Examples |
|---|---|---|
| Hiring patterns | Staff recruited from labs bring industry perspectives | UK/US AISI leadership includes former lab employees |
| Access dependencies | Voluntary model access creates incentive to avoid critical findings | All major access agreements remain voluntary |
| Funding relationships | Resource-sharing arrangements create dependencies | UK AISI receives compute access from industry partners |
| Framing adoption | Institutes adopt industry definitions of “safety” | Focus on capability evaluation vs. broader harms |
| Revolving door | Staff may return to industry after government service | Career incentives favor positive industry relations |
The OECD analysis↗ recommends that the AISI Network “preserve its independent integrity by operating as a community of technical experts rather than regulators.” However, this advisory positioning may limit impact when enforcement is needed.
Authority and Enforcement Gaps
Most existing AISIs operate in advisory roles without direct enforcement authority. They can evaluate AI systems and publish findings, but they cannot compel labs to provide access, delay deployments pending evaluation, or enforce remediation of identified safety issues. This limitation fundamentally constrains their potential impact on AI development trajectories.
The advisory model has several advantages: it allows AISIs to build relationships and credibility before seeking expanded authority, it avoids regulatory capture concerns that might arise with enforcement powers, it enables international coordination without requiring harmonized legal frameworks, and it provides flexibility to adapt approaches as the technology and risk landscape evolves.
However, advisory authority may prove inadequate as AI capabilities advance. If AISIs identify serious safety concerns but cannot compel action, their evaluations become merely informational rather than protective. Labs facing competitive pressure may ignore advisory recommendations, particularly if compliance would significantly delay deployment or increase costs relative to competitors.
The path from advisory to regulatory authority faces significant challenges. Expanding AISI powers requires legislative action in most jurisdictions, which involves complex political processes and industry lobbying. Different countries may develop incompatible regulatory approaches, fragmenting the international coordination that makes AISIs potentially valuable. Most fundamentally, effective enforcement requires technical standards and evaluation methodologies that remain under development.
Scale and Resource Constraints
The resource mismatch between AISIs and the AI labs they oversee represents a fundamental challenge to effective evaluation. Leading AI labs employ thousands of researchers and engineers and spend billions of dollars annually on AI development. Even the largest planned AISIs will have hundreds of staff members and budgets measured in tens or hundreds of millions.
This scale disparity manifests in several ways that limit AISI effectiveness. AISIs cannot match lab investment in evaluation infrastructure, potentially missing sophisticated safety issues that require extensive computational resources to detect. They must rely on lab cooperation for access to training data, model architectures, and internal evaluations, rather than independently verifying such information. They lack the personnel to comprehensively evaluate the full range of capabilities that emerge from large-scale training, potentially missing important but rare abilities.
Perhaps most critically, AISIs may always be evaluating last generation’s technology while labs deploy current generation systems. If evaluation cycles take months while development cycles take weeks, AISI findings become historically interesting but strategically irrelevant. This timing mismatch could worsen as AI development accelerates and evaluation methodologies become more sophisticated and time-consuming.
Addressing scale limitations may require fundamental changes to the current model. Potential approaches include mandatory disclosure requirements that shift evaluation burden to labs, international cost-sharing that pools resources across multiple institutes, public-private partnerships that leverage industry evaluation infrastructure, or regulatory approaches that slow deployment timelines to match evaluation capabilities.
Methodological Uncertainties
AI evaluation faces profound technical challenges that limit the reliability and relevance of current methodologies. The problem of unknown capabilities—abilities that emerge unexpectedly from large-scale training—means that evaluations may miss the most important and dangerous capabilities. Current evaluation approaches focus on testing known capability categories, but transformative AI systems may develop qualitatively new abilities that existing frameworks cannot detect.
Evaluation validity represents another fundamental challenge. Laboratory testing may not predict real-world behavior, particularly for systems that adapt their responses based on context or user interactions. Safety properties demonstrated during evaluation may not persist across different deployment scenarios, user populations, or adversarial contexts.
The arms race dynamic between evaluation and optimization presents an ongoing challenge. As evaluation methodologies become public, AI developers can optimize their systems to perform well on known benchmarks while potentially retaining concerning capabilities that evaluations do not detect. This gaming dynamic may require continuous evolution of evaluation approaches, increasing the complexity and resource requirements for effective assessment.
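One partial mitigation sometimes discussed is to keep a private holdout split alongside the published benchmark and flag models whose public scores substantially outrun their holdout scores, since a large gap suggests optimization against the known evaluation rather than a genuine improvement. The sketch below is a minimal illustration of that check; the threshold is arbitrary and purely for the example.

```python
# Minimal sketch of a public-vs-holdout comparison used to flag possible
# benchmark gaming. The 0.10 threshold is arbitrary and illustrative only.
def gaming_signal(public_score: float, holdout_score: float,
                  threshold: float = 0.10) -> bool:
    """True if the public/holdout gap suggests overfitting to the published
    benchmark rather than a genuine capability or safety improvement."""
    return (public_score - holdout_score) > threshold

if __name__ == "__main__":
    print(gaming_signal(public_score=0.92, holdout_score=0.71))  # True  (gap 0.21)
    print(gaming_signal(public_score=0.88, holdout_score=0.85))  # False (gap 0.03)
```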
Temporal dynamics add another layer of complexity. AI systems may exhibit different behavior over time as they learn from deployment interactions, receive updates, or face novel situations not represented in evaluation datasets. Current evaluation methodologies primarily assess snapshot behavior rather than evolution over time, potentially missing important safety-relevant changes.
Trajectory and Future Evolution
Near-Term Development (2025-2026)
The next two years will likely see continued rapid expansion of existing AISIs and establishment of new institutes across additional countries. The UK and US institutes are expected to reach their target staffing levels and develop more sophisticated evaluation capabilities. International coordination mechanisms established at recent AI safety summits will mature into operational frameworks for information sharing and joint evaluation activities.
Several technical developments will shape AISI effectiveness during this period. Evaluation methodologies will become more standardized, enabling better comparison across different systems and time periods. Automated evaluation tools may reduce the time required for comprehensive assessment, potentially addressing some timing mismatch concerns. The development of better interpretability techniques could enhance evaluators’ ability to understand system behavior and identify concerning capabilities.
However, this period may also reveal fundamental limitations of the current AISI model. As AI capabilities advance more rapidly, the gap between evaluation timelines and deployment decisions may widen. Industry consolidation could reduce the number of actors requiring evaluation while potentially making access negotiations more challenging. Political changes in key countries could disrupt funding, leadership, or international coordination efforts.
The relationship between AISIs and other governance mechanisms will evolve during this period. Integration with broader regulatory frameworks may begin, potentially providing AISIs with expanded authority or enforcement mechanisms. Alternatively, regulatory development may bypass AISIs if they are perceived as ineffective or captured by industry interests.
Medium-Term Scenarios (2026-2029)
The medium-term trajectory for AISIs depends heavily on how several critical uncertainties resolve. In optimistic scenarios, AISIs successfully demonstrate value through high-quality evaluations that inform policy decisions, gain expanded authority through legislative changes that enable enforcement action, maintain independence despite industry relationships, and establish effective international coordination that provides global oversight capacity.
Such successful development could position AISIs as central institutions in AI governance, potentially serving as verification bodies for international AI safety agreements, regulatory agencies with authority to approve or delay AI deployments, coordinating centers for technical standards development, or incident response organizations that investigate AI system failures.
However, pessimistic scenarios are equally plausible. AISIs may prove unable to keep pace with advancing capabilities, making their evaluations strategically irrelevant. Industry capture could transform them into legitimacy-providing institutions that rubber-stamp lab decisions rather than providing independent oversight. International coordination could fragment due to geopolitical tensions or divergent national interests. Political changes could defund or reorganize institutes, disrupting institutional knowledge and relationships.
Hybrid scenarios seem most likely, where AISIs provide valuable but limited contributions to AI governance. They may successfully evaluate current generation systems while struggling with more advanced capabilities. They may maintain partial independence while facing increased industry influence. They may achieve regional coordination while failing to establish global frameworks.
Long-Term Possibilities
The long-term role of AISIs will depend fundamentally on the trajectory of AI capabilities and the broader governance response. If AI development slows or reaches temporary plateaus, AISIs may have time to develop evaluation capabilities that match the systems they oversee. If international cooperation on AI governance strengthens, AISIs could become verification bodies for binding international agreements.
Alternatively, if AI development accelerates toward artificial general intelligence or superintelligence, current AISI models may prove entirely inadequate. The evaluation of systems approaching or exceeding human-level capabilities across multiple domains may require fundamentally different approaches that current institutions cannot provide.
The most transformative possibility involves AISIs evolving beyond their current evaluation focus toward active participation in AI development. Rather than merely assessing systems developed by labs, future iterations might directly fund or conduct safety-focused AI research, potentially developing alternative development approaches that prioritize safety over capability advancement.
Key Uncertainties and Research Priorities
Fundamental Questions
Several critical uncertainties will determine whether AISIs can meaningfully contribute to AI safety. The independence question remains paramount: can government institutions maintain sufficient objectivity to provide effective oversight while maintaining the industry relationships necessary for access and cooperation? Historical precedents from other domains provide mixed guidance, with some regulatory agencies successfully maintaining independence while others became captured by the industries they oversee.
The authority question similarly remains unresolved. Will AISIs gain sufficient regulatory power to influence AI development decisions, or will they remain advisory institutions whose recommendations can be safely ignored? The path from advisory to regulatory authority requires political action that may not materialize, particularly if industry opposition is strong or if other governance mechanisms are perceived as more effective.
The scaling question presents perhaps the most fundamental challenge. Can evaluation capabilities advance fast enough to remain relevant as AI systems become more capable, or will the resource and timeline mismatches prove insurmountable? This question depends partly on technical developments in evaluation methodology and partly on whether regulatory approaches can alter the competitive dynamics driving rapid deployment.
Empirical Research Needs
Several areas require urgent empirical investigation to inform AISI development and evaluation. Studies of regulatory capture in analogous domains could provide insights into institutional design choices that preserve independence. Comparative analysis of different AISI organizational models could identify best practices for balancing cooperation and oversight requirements.
Technical research on evaluation methodology remains critical, particularly around automated evaluation systems that could reduce assessment timelines, interpretability techniques that enable better understanding of system behavior, and methods for detecting unknown capabilities in large-scale AI systems. The development of standardized evaluation frameworks requires careful empirical validation to ensure they actually predict deployment behavior.
International relations research could illuminate the prospects for sustained coordination among AISIs, particularly how geopolitical tensions and competitive dynamics might affect information sharing and joint evaluation efforts. Historical studies of international technical cooperation in other domains could provide relevant insights.
Decision-Relevant Considerations
For individuals considering careers in AISIs, several factors merit careful consideration. The impact potential depends heavily on whether institutes gain meaningful authority and maintain independence. The skill development opportunities include valuable experience in AI evaluation and policy interfaces, though bureaucratic constraints may limit research flexibility.
For policymakers considering AISI funding or expansion, key considerations include whether advisory institutions provide sufficient oversight given the stakes involved, how to design institutional structures that preserve independence while enabling industry cooperation, and whether resources might be more effectively deployed through other governance mechanisms.
For AI safety researchers more broadly, AISIs represent one approach among many potential governance interventions. Their effectiveness relative to technical alignment research, industry engagement, or international treaty development remains an open question that depends partly on one’s views about the tractability of technical versus governance approaches to AI safety.
The ultimate assessment of AISIs may depend less on their current capabilities than on their potential for evolution. If they can serve as a foundation for more sophisticated governance institutions, their current limitations may prove temporary. If they become entrenched but ineffective institutions that provide false reassurance about AI oversight, their net impact could be negative. The next several years will likely determine which trajectory proves accurate.
Sources
Official Institute Resources
- UK AI Security Institute↗ - Official website with research publications and Inspect framework
- US AI Safety Institute at NIST↗ - NIST’s AI Safety Institute homepage
- Introducing the AI Safety Institute↗ - UK Government overview
- Japan AI Safety Institute Launch↗ - Ministry of Economy, Trade and Industry announcement
Evaluations and Research
- Pre-Deployment Evaluation of Claude 3.5 Sonnet↗ - First joint UK-US evaluation (November 2024)
- Pre-Deployment Evaluation of OpenAI o1↗ - Second joint evaluation (December 2024)
- US AISI Signs Agreements with Anthropic and OpenAI↗ - August 2024 MOUs
- Inspect Evaluation Framework↗ - Open-source AI evaluation tool
- FLI AI Safety Index 2024↗ - Independent assessment of AI company safety practices
International Coordination
- AI Seoul Summit 2024↗ - UK Government summit page
- Seoul Statement of Intent↗ - International network founding document
- First Meeting of the International Network↗ - EU report on San Francisco meeting (November 2024)
- CSIS: AI Safety Institute Network Recommendations↗ - Policy analysis
Analysis and Commentary
- Elizabeth Kelly: TIME 100 Most Influential in AI↗ - Profile of US AISI director
- CSIS: US Vision for AI Safety↗ - Conversation with Elizabeth Kelly
- TechPolicy.Press: How AISIs Inform Governance↗ - Analysis of independence concerns
- OECD: AI Safety Institutes Challenge↗ - Assessment of institutional capacity
- AI & Society: AI Safety and Regulatory Capture↗ - Academic analysis of capture risks
2025 Developments
- UK Renames AI Safety Institute↗ - February 2025 rebrand to AI Security Institute
- Trump Administration Rebrands US AI Safety Institute↗ - June 2025 change to CAISI
- AI Safety Advocates Slam NIST Targeting↗ - Criticism of proposed staff cuts
- TRAINS Taskforce Established↗ - National security testing initiative
AI Transition Model Context
AI Safety Institutes improve the AI Transition Model through Civilizational Competence:
| Factor | Parameter | Impact |
|---|---|---|
| Civilizational Competence | Regulatory Capacity | Provide government with technical expertise to evaluate frontier AI |
| Civilizational Competence | Institutional Quality | Build dedicated infrastructure for model evaluation and safety testing |
| Misalignment Potential | Human Oversight Quality | Pre-deployment access enables detection of dangerous capabilities |
AISIs address critical information asymmetry but face severe resource constraints (100+ staff vs thousands at labs) and advisory-only authority.