• The 80000 Hours Podcast on Artificial Intelligence

  • By: 80k
  • Podcast

  • Summary

  • A compilation of ten key episodes on artificial intelligence and related topics from 80,000 Hours. Together they'll help you learn about how AI looks from a broadly longtermist, existential risk, or effective altruism flavoured point of view.
    Copyright held by 80k
Episodes
  • Zero: What to expect in this series
    Sep 2 2023

    A short introduction to what you'll get out of these episodes!

    2 minutes
  • One: Brian Christian on the alignment problem
    Sep 2 2023

    Originally released in March 2021.

    Brian Christian is a bestselling author with a particular knack for accurately communicating difficult or technical ideas from both mathematics and computer science.

    Listeners loved our episode about his book Algorithms to Live By — so when the team read his new book, The Alignment Problem, and found it to be an insightful and comprehensive review of the state of the research into making advanced AI useful and reliably safe, getting him back on the show was a no-brainer.

    Brian has so much of substance to say that this episode will likely be of interest to people who know a lot about AI as well as those who know a little, and to people who are nervous about where AI is going as well as those who aren't nervous at all.

    Links to learn more, summary and full transcript.

    Here’s a tease of 10 Hollywood-worthy stories from the episode:

    • The Riddle of Dopamine: The development of reinforcement learning solves a long-standing mystery of how humans are able to learn from their experience.
    • ALVINN: A student teaches a military vehicle to drive between Pittsburgh and Lake Erie, without intervention, in the early 1990s, using a computer with a tenth the processing capacity of an Apple Watch.
    • Couch Potato: An agent trained to be curious is stopped in its quest to navigate a maze by a paralysing TV screen.
    • Pitts & McCulloch: A homeless teenager and his foster father figure invent the idea of the neural net.
    • Tree Senility: Agents become so good at living in trees to escape predators that they forget how to leave, starve, and die.
    • The Danish Bicycle: A reinforcement learning agent figures out that it can better achieve its goal by riding in circles as quickly as possible than by reaching its purported destination.
    • Montezuma's Revenge: By 2015 a reinforcement learner can play 60 different Atari games — the majority impossibly well — but can’t score a single point on one game humans find tediously simple.
    • Curious Pong: Two novelty-seeking agents, forced to play Pong against one another, create increasingly extreme rallies.
    • AlphaGo Zero: A computer program becomes superhuman at Chess and Go in under a day by attempting to imitate itself.
    • Robot Gymnasts: Over the course of an hour, humans teach robots to do perfect backflips just by telling them which of two random actions looks more like a backflip.

    We also cover:

    • How reinforcement learning actually works, and some of its key achievements and failures
    • How a lack of curiosity can leave AIs unable to do basic things
    • The pitfalls of getting AI to imitate how we ourselves behave
    • The benefits of getting AI to infer what we must be trying to achieve
    • Why it’s good for agents to be uncertain about what they're doing
    • Why Brian isn’t that worried about explicit deception
    • The interviewees Brian most agrees with, and most disagrees with
    • Developments since Brian finished the manuscript
    • The effective altruism and AI safety communities
    • And much more

    Producer: Keiran Harris.
    Audio mastering: Ben Cordell.
    Transcriptions: Sofia Davis-Fogel.

    2 hours and 56 minutes
  • Two: Ajeya Cotra on accidentally teaching AI models to deceive us
    Sep 2 2023

    Originally released in May 2023.

    Imagine you are an orphaned eight-year-old whose parents left you a $1 trillion company, and no trusted adult to serve as your guide to the world. You have to hire a smart adult to run that company, guide your life the way that a parent would, and administer your vast wealth. You have to hire that adult based on a work trial or interview you come up with. You don't get to see any resumes or do reference checks. And because you're so rich, tonnes of people apply for the job — for all sorts of reasons.

    Today's guest Ajeya Cotra — senior research analyst at Open Philanthropy — argues that this peculiar setup resembles the situation humanity finds itself in when training very general and very capable AI models using current deep learning methods.

    Links to learn more, summary and full transcript.

    As she explains, such an eight-year-old faces a challenging problem. In the candidate pool there are likely some truly nice people, who sincerely want to help and make decisions that are in your interest. But there are probably other characters too — like people who will pretend to care about you while you're monitoring them, but intend to use the job to enrich themselves as soon as they think they can get away with it.

    Like a child trying to judge adults, at some point humans will be required to judge the trustworthiness and reliability of machine learning models that are as goal-oriented as people, and greatly outclass them in knowledge, experience, breadth, and speed. Tricky!

    Can't we rely on how well models have performed at tasks during training to guide us? Ajeya worries that it won't work. The trouble is that three different sorts of models will all produce the same output during training, but could behave very differently once deployed in a setting that allows their true colours to come through. She describes three such motivational archetypes:

    • Saints — models that care about doing what we really want
    • Sycophants — models that just want us to say they've done a good job, even if they get that praise by taking actions they know we wouldn't want them to
    • Schemers — models that don't care about us or our interests at all, who are just pleasing us so long as that serves their own agenda

    And according to Ajeya, there are also ways we could end up actively selecting for motivations that we don't want.

    In today's interview, Ajeya and Rob discuss the above, as well as:

    • How to predict the motivations a neural network will develop through training
    • Whether AIs being trained will functionally understand that they're AIs being trained, the same way we think we understand that we're humans living on planet Earth
    • Stories of AI misalignment that Ajeya doesn't buy into
    • Analogies for AI, from octopuses to aliens to can openers
    • Why it's smarter to have separate planning AIs and doing AIs
    • The benefits of only following through on AI-generated plans that make sense to human beings
    • What approaches for fixing alignment problems Ajeya is most excited about, and which she thinks are overrated
    • How one might demo actually scary AI failure mechanisms

    Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

    Producer: Keiran Harris
    Audio mastering: Ryan Kessler and Ben Cordell
    Transcriptions: Katy Moore

    2 hours and 50 minutes