#76 – Joe Carlsmith on Scheming AI
Mar 16 2024
Length: 1 hr and 52 mins
Podcast

Summary
Joe Carlsmith is a writer, researcher, and philosopher. He works as a senior research analyst at Open Philanthropy, where he focuses on existential risk from advanced artificial intelligence. He also writes independently about various topics in philosophy and futurism, and holds a doctorate in philosophy from the University of Oxford.

You can find links and a transcript at www.hearthisidea.com/episodes/carlsmith

In this episode we talked about a report Joe recently authored, titled ‘Scheming AIs: Will AIs fake alignment during training in order to get power?’. The report “examines whether advanced AIs that perform well in training will be doing so in order to gain power later”, a behaviour Carlsmith calls scheming.

We talk about:

Distinguishing ways AI systems can be deceptive and misaligned
Why powerful AI systems might acquire goals that go beyond what they’re trained to do, and how those goals could lead to scheming
Why scheming goals might perform better (or worse) in training than less worrying goals
The ‘counting argument’ for scheming AI
Why goals that lead to scheming might be simpler than the goals we intend
Things Joe is still confused about, and research project ideas

You can get in touch through our website or on Twitter. Consider leaving us an honest review wherever you're listening to this — it's the best free way to support the show. Thanks for listening!