Library

The AI Safety Map

A unified map of catalog episodes, Spotlight queue picks, TED talks, and in-site editorials (briefings and maps), with the same shelf and intent filters across all of them. Leader profiles and timelines live under Leaders Watch and are also included in site search.

882 library items · Perspective Map framework · Leaders Watch · 5 guided shelves · Sorted A–Z by source title
Filtered view active: I need technical alignment depth

What do you want to focus on?

Route by intent for a guided shortlist, or search directly by keyword and theme in one place.


Focus matched: I need technical alignment depth


Start Here

Question-led pathways with a tightly curated shortlist from across podcasts, documents, and talks.

Civilisational risk and strategy

AXRP

20 Jan 2025

Adria Garriga-Alonso on Detecting AI Scheming

This conversation examines core safety through Adria Garriga-Alonso on Detecting AI Scheming, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median 0 · average −2 · 25 segments

ai-safety · axrp · core-safety

AXRP

11 Apr 2024

AI Control with Buck Shlegeris and Ryan Greenblatt

This conversation examines technical alignment through AI Control with Buck Shlegeris and Ryan Greenblatt, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median −6 · average −9 · 174 segments

Signal Room · Featured · ai-safety · control · axrp

AXRP

1 Mar 2025

David Duvenaud on Sabotage Evaluations and the Post-AGI Future

This conversation examines technical alignment through David Duvenaud on Sabotage Evaluations and the Post-AGI Future, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median −9 · average −7 · 21 segments

ai-safety · evals · axrp

AXRP

1 Dec 2024

Evan Hubinger on Model Organisms of Misalignment

This conversation examines technical alignment through Evan Hubinger on Model Organisms of Misalignment, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median −6 · average −7 · 120 segments

Signal Room · Featured · ai-safety · alignment · axrp

AXRP

31 Mar 2022

First Principles of AGI Safety with Richard Ngo

Auto-discovered from AXRP. Editorial summary pending review.

Spectrum trail (transcript): median −6 · average −8 · 79 segments

ai-safety · axrp

Governance, institutions, and power

Technical alignment and control

80,000 Hours Podcast

2 Oct 2018

Paul Christiano on how OpenAI is developing real solutions to the AI alignment problem, and his vision of how humanity will progressively hand over decision-making to AI systems

This conversation examines technical alignment with Paul Christiano, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median 0 · average −4 · 283 segments

ai-safety · alignment · 80000-hours

AXRP

2 May 2023

Interpretability for Engineers with Stephen Casper

This conversation examines technical alignment through Interpretability for Engineers with Stephen Casper, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median 0 · average −4 · 108 segments

ai-safety · axrp · technical-alignment

AXRP

4 Feb 2023

Mechanistic Interpretability with Neel Nanda

This conversation examines technical alignment through Mechanistic Interpretability with Neel Nanda, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median 0 · average −1 · 182 segments

ai-safety · axrp · technical-alignment

AXRP

12 Apr 2023

Reform AI Alignment with Scott Aaronson

This conversation examines technical alignment through Reform AI Alignment with Scott Aaronson, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median 0 · average −5 · 120 segments

ai-safety · alignment · axrp

AXRP

27 Jul 2023

Superalignment with Jan Leike

This conversation examines technical alignment through Superalignment with Jan Leike, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median −10 · average −7 · 112 segments

ai-safety · alignment · axrp

Future of Life Institute Podcast

6 Feb 2026

Can AI Do Our Alignment Homework? (with Ryan Kidd)

This conversation examines technical alignment through Can AI Do Our Alignment Homework? (with Ryan Kidd), surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Spectrum trail (transcript): median −6 · average −6 · 121 segments

ai-safety · alignment · fli