Skip to content

Organizations

This section profiles organizations working on AI development, AI safety research, and AI governance. Understanding these organizations is crucial for comprehending the AI safety landscape, as they represent different approaches, philosophies, and theories of change.

Organizations developing the most advanced AI systems. These labs combine capability research with varying levels of safety work:

  • Anthropic - AI safety company developing Claude. Founded 2021 by former OpenAI researchers. Emphasizes Constitutional AI, interpretability, and Responsible Scaling Policy. Raised $7.3B+ from Amazon, Google. ~1,000 employees.

  • OpenAI - Developer of GPT and ChatGPT. Founded 2015 as non-profit, transitioned to capped-profit 2019. Pioneered RLHF alignment technique. Partnership with Microsoft ($14B+). Experienced governance crisis November 2023 and safety researcher departures 2024. ~3,000 employees.

  • Google DeepMind - Formed 2023 from merger of DeepMind and Google Brain. Known for AlphaGo, AlphaFold (2024 Nobel Prize), and Gemini. Combines world-class research with Google’s massive resources. Emphasizes scientific applications and reinforcement learning. ~3,000+ employees.

Organizations primarily focused on AI safety research, with minimal or no capability development:

  • MIRI - Machine Intelligence Research Institute. Founded 2000, oldest AI safety organization. Pioneered foundational concepts (instrumental convergence, orthogonality thesis). Agent foundations era (2014-2020), then pivoted to governance advocacy (2023+). Now recommends governance careers over technical alignment. ~20 employees.

  • ARC - Alignment Research Center. Founded 2021 by Paul Christiano. Two divisions: ARC Theory (Eliciting Latent Knowledge research) and ARC Evals (dangerous capability evaluations). Influential in establishing evaluation practices at frontier labs. Worst-case alignment focus. ~15-20 employees.

  • Redwood Research - Founded 2021. Known for mechanistic interpretability work (causal scrubbing) and pioneering “AI control” agenda. Assumes alignment may not be solved, focuses on safety even with misaligned AI. Empirical focus, willingness to pivot. ~20-25 employees.

Organizations focused on AI governance, policy, and coordination:

  • GovAI - Centre for the Governance of AI. Academic research on AI governance challenges, international coordination, and policy frameworks.

  • CAIS - Center for AI Safety. Combines research and advocacy on AI safety. Focus on making safety concerns legible to policymakers and public.

University-based AI safety research:

  • CHAI - Center for Human-Compatible AI at UC Berkeley. Founded by Stuart Russell. Researches value alignment, cooperative inverse reinforcement learning, and human-compatible AI systems.
TypePrimary FocusExamplesKey Characteristics
Frontier LabCapability + SafetyAnthropic, OpenAI, DeepMindMassive resources, commercial pressure, product development, frontier access
Safety ResearchTechnical SafetyMIRI, ARC, RedwoodGrant-funded, research focus, no deployment, varying levels of pessimism
GovernancePolicy & CoordinationGovAI, CAISAcademic or non-profit, policy engagement, international focus
AcademicResearch & TrainingCHAIUniversity-based, peer review, student training, slower timelines

Organizations differ significantly in their approaches and beliefs:

  • Short (<2030): MIRI, some Anthropic/OpenAI leadership
  • Medium (2030s-2040s): ARC, much of safety community
  • Longer (2040s+): Some academics
  • Extremely Hard: MIRI (pivoted away from technical work)
  • Very Hard: ARC, Redwood (worst-case focus)
  • Hard But Tractable: Anthropic, DeepMind (continued technical optimism)
  • Prosaic Approaches May Work: Some at OpenAI, optimists
  • Necessary: Anthropic, OpenAI, DeepMind (need frontier systems for safety research)
  • Dangerous: MIRI, some critics (accelerates capabilities)
  • Mixed: ARC Evals (evaluate but don’t build)
  • Empirical/Iterative: Anthropic (Constitutional AI), OpenAI (RLHF), DeepMind
  • Worst-Case/Adversarial: ARC, Redwood (AI Control)
  • Theoretical Foundations: MIRI (historically)
  • Evaluations: ARC Evals, increasing focus across labs
  • For-Profit Labs: OpenAI, Anthropic (need revenue)
  • Corporate Subsidiaries: DeepMind (Google’s resources and pressure)
  • Grant-Funded: MIRI, ARC, Redwood (independent but resource-constrained)
  • ChatGPT (Nov 2022) triggered industry-wide acceleration
  • Frontier labs competing on capabilities
  • Safety work must keep pace or falls behind
  • Question: Does competition prevent adequate safety margins?
  • MIRI → Anthropic: Evan Hubinger
  • OpenAI → Anthropic: Dario Amodei, Daniela Amodei, Chris Olah, Tom Brown, Jan Leike, John Schulman
  • OpenAI → ARC: Paul Christiano
  • Redwood → DeepMind: Neel Nanda
  • Pattern: Movement toward organizations with stronger safety focus or from frontier labs to research orgs
  • MIRI: Agent foundations → empirical work → governance advocacy
  • Redwood: Adversarial robustness → interpretability → AI control
  • Many orgs: Adjusting strategies as timelines shorten
  • ARC Evals pioneered systematic evaluations
  • Now adopted by OpenAI (Preparedness), Anthropic (RSP), DeepMind (Frontier Safety)
  • Government involvement: UK AISI, US executive order
  • Becoming key governance tool
  • OpenAI: Governance crisis (Nov 2023), safety departures (2024)
  • Anthropic: Large funding rounds, pressure to compete
  • DeepMind: Merger ended independence, Google integration
  • Tension: Financial sustainability vs safety prioritization
  • Can frontier labs maintain safety focus under commercial pressure?
  • Do evaluations provide adequate safety assurance?
  • Will technical alignment be solved, or do we need control/governance approaches?
  • Should organizations prioritize differential progress (safety over capabilities)?
  • How much should we update on MIRI’s pivot away from technical work?
  • Can independent research orgs compete with well-resourced labs?
  • Will regulatory pressure change organizational incentives?

Each profile includes:

  • InfoBox: Quick facts (founded, location, funding, headcount, website)
  • History: Formation, evolution, strategic shifts
  • Approach: Safety philosophy, research methods, key assumptions
  • Key People: Leadership, notable researchers, departures
  • Research/Products: Major contributions, publications, systems
  • Funding: Resources, partnerships, commercial dynamics
  • Criticisms: Controversies, debates, concerns
  • Comparisons: How org relates to others in ecosystem
  • Estimates: Timeline and risk assessments (where public)
  • Sources: Links to official materials and research
FRONTIER LABS
(Build Capabilities)
|
┌──────────────────┼──────────────────┐
| | |
OpenAI Anthropic DeepMind
(Products) (Safety-Focused) (Research + Google)
| | |
└──────────────────┴──────────────────┘
|
Evaluated by / Inform
|
SAFETY RESEARCH
(Understand Risks)
|
┌──────────────────┼──────────────────┐
| | |
ARC Redwood MIRI
(Theory+Evals) (Empirical) (Now Governance)
| | |
└──────────────────┴──────────────────┘
|
Inform / Advise
|
GOVERNANCE
(Policy & Coordination)
|
┌──────────┴──────────┐
| |
GovAI CAIS
(Research) (Advocacy)

This ecosystem involves:

  • Information flow: Research → Labs, Evals → Policy
  • Talent flow: Researchers moving between orgs
  • Resource flow: Funding, compute access, partnerships
  • Influence flow: Publications, advocacy, standard-setting

For comprehensive understanding, read organization profiles in conjunction with: