Connor Leahy
Background
Connor Leahy is the CEO and co-founder of Conjecture, an AI safety company focused on interpretability and “prosaic” approaches to AGI alignment. He represents a new generation of AI safety researchers who are building organizations specifically to tackle alignment.
Background:
- Largely self-taught in AI and machine learning
- Co-founder of EleutherAI (open-source AI research collective)
- Founded Conjecture in 2022
- Active public communicator on AI risk
Leahy’s journey from open-source AI contributor to safety company founder reflects growing concern about AI risks among those building the technology.
From EleutherAI to Conjecture
EleutherAI
Co-founded EleutherAI, which:
- Created GPT-Neo and GPT-J (open-source language models)
- Demonstrated capabilities research outside major labs
- Showed small teams could train large models
- Made AI research more accessible
The shift: Working on capabilities research convinced Leahy that AI risk was severe and urgent.
Why Conjecture?
Founded Conjecture because:
- Believed prosaic AGI was coming soon
- Thought existing safety work insufficient
- Wanted to work on alignment with urgency
- Needed independent organization focused solely on safety
Conjecture’s Approach
Mission
Conjecture aims to:
- Understand how AI systems work (interpretability)
- Build safely aligned AI systems
- Prevent catastrophic outcomes from AGI
- Work at frontier of capabilities to ensure safety relevance
Research Focus
Interpretability:
- Understanding neural networks mechanistically
- Automated interpretability methods
- Scaling understanding to large models
Alignment:
- Prosaic alignment techniques
- Testing alignment on current systems
- Building aligned systems from scratch
Capability evaluation:
- Understanding what models can really do
- Detecting dangerous capabilities early
- Red-teaming and adversarial testing
Views on AI Risk
Based on public statements and interviews
| Topic | Estimate | Date |
|---|---|---|
| AGI timeline | Could be 2-5 years | 2023 |
| P(doom) | High without major changes | 2023 |
| Urgency | Extreme | 2024 |
AGI timeline: Believes AGI could arrive very soon
P(doom): Very concerned about default outcomes
Urgency: Emphasizes need for immediate action
Core Beliefs
- AGI is very near: Could be 2-10 years, possibly sooner
- Default outcome is bad: Without major changes, things go poorly
- Prosaic alignment is crucial: Need to align systems similar to current ones
- Interpretability is essential: Can’t align what we don’t understand
- Need to move fast: Limited time before dangerous capabilities emerge
On Timelines
Leahy is notably more pessimistic about timelines than most:
- Believes AGI could be very close
- Points to rapid capability gains
- Sees fewer barriers than many assume
- Emphasizes uncertainty but leans short
Strategic Position
Different from slowdown advocates:
- Doesn’t think we’ll successfully slow down
- Believes we need solutions that work in fast-moving world
- Focuses on technical alignment over governance alone
Different from race-to-the-top:
- Very concerned about safety
- Skeptical of “building AGI to solve alignment”
- Wants fundamental understanding first
Public Communication
Vocal AI Safety Advocate
Leahy is very active in public discourse:
- Regular podcast appearances
- Social media presence (Twitter/X)
- Interviews and talks
- Blog posts and essays
Key Messages
On urgency:
- AGI could arrive much sooner than people think
- We’re not prepared
- Need to take this seriously now
On capabilities:
- Current systems are more capable than commonly believed
- Emergent capabilities make prediction hard
- Safety must account for rapid jumps
On solutions:
- Need mechanistic understanding
- Can’t rely on empirical tinkering alone
- Interpretability is make-or-break
Communication Style
Known for:
- Direct, sometimes blunt language
- Willingness to express unpopular views
- Engaging in debates
- Not mincing words about risks
Research Philosophy
Interpretability First
Believes:
- Can’t safely deploy what we don’t understand
- Black-box approaches fundamentally insufficient
- Need to open the black box before scaling further
- Interpretability isn’t optional
Prosaic Focus
Working on:
- Systems similar to current architectures
- Alignment techniques that work today
- Scaling understanding to larger models
- Not waiting for theoretical breakthroughs
Empirical Approach
Emphasizes:
- Testing ideas on real systems
- Learning from current models
- Rapid iteration
- Building working systems
Conjecture’s Work
Research Areas
Automated Interpretability (a minimal illustrative sketch follows this list):
- Using AI to help understand AI
- Scaling interpretability techniques
- Finding circuits and features automatically
Capability Evaluation:
- Understanding what models can do
- Red-teaming frontier systems
- Developing evaluation frameworks
Alignment Testing:
- Empirical evaluation of alignment techniques
- Stress-testing proposed solutions
- Finding failure modes
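To make the “using AI to help understand AI” item above concrete, here is a minimal, illustrative sketch of automated neuron labeling: collect the text snippets that most strongly activate a neuron, then ask a second (explainer) model what they have in common. This is not Conjecture’s actual pipeline; the ActivationRecord type, the build_explanation_prompt helper, and the stubbed explainer_model call are hypothetical stand-ins.

```python
# Illustrative sketch of automated neuron labeling ("AI explaining AI").
# Hypothetical names throughout; a real pipeline would call an LLM in explainer_model().
from dataclasses import dataclass


@dataclass
class ActivationRecord:
    text: str          # input snippet shown to the model being studied
    activation: float  # how strongly the target neuron fired on that snippet


def build_explanation_prompt(records: list[ActivationRecord], top_k: int = 5) -> str:
    """Pick the snippets that most excite the neuron and format a prompt asking
    an explainer model for a one-sentence description of the shared concept."""
    top = sorted(records, key=lambda r: r.activation, reverse=True)[:top_k]
    examples = "\n".join(f"- ({r.activation:.2f}) {r.text!r}" for r in top)
    return (
        "These text snippets strongly activate one neuron in a language model:\n"
        f"{examples}\n"
        "In one sentence, what concept does this neuron appear to detect?"
    )


def explainer_model(prompt: str) -> str:
    # Placeholder: a real system would send the prompt to an LLM API here.
    return "<explanation produced by an explainer LLM>"


if __name__ == "__main__":
    records = [
        ActivationRecord("The Eiffel Tower is in Paris.", 8.1),
        ActivationRecord("I had toast for breakfast.", 0.3),
        ActivationRecord("Berlin is the capital of Germany.", 7.4),
    ]
    print(explainer_model(build_explanation_prompt(records)))
```

Published automated-interpretability pipelines typically go further, scoring each candidate explanation by how well it predicts the neuron’s activations on held-out text.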
Public Output
Conjecture has:
- Published research on interpretability
- Released tools for safety research
- Engaged in public discourse
- Contributed to alignment community
Influence and Impact
Raising Urgency
Leahy’s advocacy has:
- Brought attention to short timelines
- Emphasized severity of risk
- Recruited people to safety work
- Influenced discourse on urgency
Building Alternative Model
Conjecture demonstrates:
- Can build safety-focused company
- Don’t need to be at frontier labs
- Independent safety research viable
- Multiple organizational models possible
Community Engagement
Active in:
- Alignment research community
- Public communication about AI risk
- Mentoring and advising
- Connecting researchers
Criticism and Debates
Critics argue:
- May be too pessimistic about timelines
- Some statements are inflammatory
- Conjecture’s approach might not scale
- Public communication sometimes counterproductive
Supporters argue:
- Better to be cautious about timelines
- Direct communication is valuable
- Conjecture doing important work
- Field needs diverse voices
Leahy’s position:
- Would rather be wrong about urgency than complacent
- Believes directness is necessary
- Open to criticism and debate
- Focused on solving the problem
Evolution of Views
EleutherAI era:
- Focused on democratizing AI
- Excited about capabilities
- Less concerned about risk
Transition:
- Growing concern from working with models
- Seeing rapid capability gains
- Understanding alignment difficulty
Current:
- Very concerned about risk
- Focused entirely on safety
- Urgent timeline beliefs
- Public advocacy
Current Priorities
At Conjecture:
- Interpretability research: Understanding how models work
- Capability evaluation: Knowing what’s possible
- Alignment testing: Validating proposed solutions
- Public communication: Raising awareness
- Team building: Growing safety research capacity
Key Insights
From Building Capabilities to Safety
Leahy’s experience building language models convinced him:
- Capabilities can surprise
- Scaling works better than expected
- Safety is harder than it looks
- Need fundamental understanding
On the Field
Observations about AI safety:
- Not enough urgency
- Too much theorizing, not enough empirical work
- Need more attempts at solutions
- Can’t wait for perfect understanding