diff --git "a/docs/karpathy-lex-pod/karpathy-pod.txt" "b/docs/karpathy-lex-pod/karpathy-pod.txt" new file mode 100644--- /dev/null +++ "b/docs/karpathy-lex-pod/karpathy-pod.txt" @@ -0,0 +1,2822 @@ +some kind of a crazy quantum mechanical system that somehow gives you buffer overflow, somehow +gives you a rounding error in the floating point. +Synthetic intelligences are kind of like the next stage of development. +And I don't know where it leads to. +Like at some point, I suspect the universe is some kind of a puzzle. +These synthetic AIs will uncover that puzzle and solve it. +The following is a conversation with Andrei Kapathe, previously the director of AI at +Tesla, and before that at OpenAI and Stanford. +He is one of the greatest scientists, engineers, and educators in the history of artificial +intelligence. +This is the Lex Friedman podcast. +To support it, please check out our sponsors. +And now, dear friends, here's Andrei Kapathe. +What is a neural network? +And why does it seem to do such a surprisingly good job of learning? +What is a neural network? +It's a mathematical abstraction of the brain. +I would say that's how it was originally developed. +At the end of the day, it's a mathematical expression. +It's a fairly simple mathematical expression when you get down to it. +It's basically a sequence of matrix multipliers, which are really dot products mathematically, +and some non-linearity is thrown in. +It's a very simple mathematical expression, and it's got knobs in it. +Many knobs. +Many knobs. +These knobs are loosely related to the synapses in your brain. +They're trainable, they're modifiable. +The idea is we need to find the setting of the knobs that makes the neural net do whatever +you want it to do, like classify images and so on. +There's not too much mystery, I would say, in it. +You might think that you don't want to endow it with too much meaning with respect to the +brain and how it works. +It's really just a complicated mathematical expression with knobs, and those knobs need +a proper setting for it to do something desirable. +Yeah, but poetry is just a collection of letters with spaces, but it can make us feel a certain +way. +In that same way, when you get a large number of knobs together, whether it's inside the +brain or inside a computer, they seem to surprise us with their power. +I think that's fair. +I'm underselling it by a lot because you definitely do get very surprising emergent behaviors +out of these neural nets when they're large enough and trained on complicated enough problems, +like say for example, the next word prediction in a massive data set from the internet. +These neural nets take on pretty surprising magical properties. +Yeah, I think it's interesting how much you can get out of even very simple mathematical +formalism. +When your brain right now is talking, is it doing next word prediction? +Or is it doing something more interesting? +Well, it's definitely some kind of a generative model that's a GPT-like and prompted by you. +So you're giving me a prompt and I'm kind of responding to it in a generative way. +And by yourself perhaps a little bit? +Are you adding extra prompts from your own memory inside your head? +It definitely feels like you're referencing some kind of a declarative structure of memory +and so on, and then you're putting that together with your prompt and giving away some answer. +How much of what you just said has been said by you before? +Nothing basically, right? 
+Yeah, but poetry is just a collection of letters with spaces, but it can make us feel a certain way.
+In that same way, when you get a large number of knobs together, whether it's inside the brain or inside a computer, they seem to surprise us with their power.
+I think that's fair.
+I'm underselling it by a lot, because you definitely do get very surprising emergent behaviors out of these neural nets when they're large enough and trained on complicated enough problems, like, say, for example, next word prediction on a massive dataset from the internet.
+These neural nets take on pretty surprising magical properties.
+Yeah, I think it's interesting how much you can get out of even a very simple mathematical formalism.
+When your brain right now is talking, is it doing next word prediction?
+Or is it doing something more interesting?
+Well, it's definitely some kind of a generative model that's GPT-like, and prompted by you.
+So you're giving me a prompt and I'm kind of responding to it in a generative way.
+And prompted by yourself, perhaps, a little bit?
+Are you adding extra prompts from your own memory inside your head?
+It definitely feels like you're referencing some kind of a declarative structure of memory and so on, and then you're putting that together with your prompt and giving away some answer.
+How much of what you just said has been said by you before?
+Nothing, basically, right?
+No, but if you actually look at all the words you've ever said in your life and you do a search, you'll probably have said a lot of the same words in the same order before.
+Yeah, could be.
+I mean, I'm using phrases that are common, et cetera, but I'm remixing it into a pretty unique sentence at the end of the day.
+But you're right, definitely there's a ton of remixing.
+It's like Magnus Carlsen saying, I'm rated 2,900, whatever, which is pretty decent.
+I think you're not giving enough credit to neural nets here.
+What's your best intuition about this emergent behavior?
+I mean, it's kind of interesting, because I'm simultaneously underselling them, but I also feel like there's an element to which I'm overselling them: it's actually kind of incredible that you can get so much emergent magical behavior out of them despite them being so simple mathematically.
+So I think those are two surprising statements that are kind of juxtaposed together.
+And I think basically what it is, is we are actually fairly good at optimizing these neural nets.
+And when you give them a hard enough problem, they are forced to learn very interesting solutions in the optimization.
+And those solutions basically have these emergent properties that are very interesting.
+There's wisdom and knowledge in the knobs.
+And so this representation that's in the knobs: does it make sense to you intuitively that a large number of knobs can hold a representation that captures some deep wisdom about the data it has looked at?
+It's a lot of knobs.
+It's a lot of knobs.
+And somehow, you know, so speaking concretely, one of the neural nets that people are very excited about right now is GPT, which is basically just a next word prediction network.
+So you consume a sequence of words from the internet and you try to predict the next word.
+And once you train these on a large enough dataset, you can basically prompt these neural nets in arbitrary ways and you can ask them to solve problems, and they will.
+So you can just tell them, you can make it look like you're trying to solve some kind of a mathematical problem, and they will continue what they think is the solution, based on what they've seen on the internet.
+And very often those solutions look remarkably consistent, potentially even correct.
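+To make the next word prediction setup concrete, here is a minimal sketch of how a stream of text becomes training examples (toy whitespace tokenization; real GPTs operate on learned sub-word chunks):
+
+    text = "the cat sat on the mat"
+    tokens = text.split()  # toy tokenizer; real models use sub-word chunks
+
+    # Every position becomes one training example:
+    # (all the words so far) -> (the word that comes next).
+    examples = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
+
+    for context, target in examples:
+        print(context, "->", target)
+    # A GPT is trained to put high probability on each target given its
+    # context, over an internet-sized corpus; prompting just hands the
+    # trained model a context and lets it keep predicting.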
+Do you still think about the brain side of it?
+So as neural nets are a mathematical abstraction of the brain, do you still draw wisdom from biological neural networks, or, even the bigger question: you're a big fan of biology and biological computation.
+What impressive thing is biology doing that computers are not yet?
+That gap?
+I would say I'm much more hesitant with the analogies to the brain than you would potentially see in the field.
+And I kind of feel like, certainly, the way neural networks started is that everything stemmed from inspiration by the brain.
+But at the end of the day, the artifacts that you get after training are arrived at by a very different optimization process than the optimization process that gave rise to the brain.
+And so I kind of think of it as a very complicated alien artifact.
+It's something different.
+The brain?
+I'm sorry, the neural nets that we're training.
+They are a complicated alien artifact.
+I do not make analogies to the brain because I think the optimization process that gave rise to it is very different from the brain.
+There was no multi-agent self-play kind of setup in evolution.
+It was an optimization that basically amounts to a compression objective on a massive amount of data.
+Okay, so artificial neural networks are doing compression, and biological neural networks are doing something else entirely.
+They're an agent in a multi-agent self-play system that's been running for a very, very long time.
+That said, evolution has found that it is very useful to predict, and to have a predictive model in the brain.
+And so I think our brain utilizes something that looks like that as a part of it, but it has a lot more gadgets and gizmos and value functions and ancient nuclei that are all trying to make you survive and reproduce and everything else.
+And the whole thing, through embryogenesis, is built from a single cell.
+The code is inside the DNA, and it just builds up the entire organism, with arms and the head and legs.
+And it does it pretty well.
+It should not be possible.
+So there's some learning going on.
+There's some kind of computation going on through that building process.
+If you were to look at the entirety of the history of life on Earth, what do you think is the most interesting invention?
+Is it the origin of life itself?
+Is it the jump to eukaryotes?
+Is it mammals?
+Is it humans themselves, homo sapiens?
+The origin of intelligence, or highly complex intelligence?
+Or is it all just a continuation of the same kind of process?
+Certainly, I would say it's an extremely remarkable story that I'm only briefly learning about recently.
+To tell it, you almost have to start at the formation of Earth and all of its conditions, and the entire solar system, and how everything is arranged, with Jupiter and the Moon and the habitable zone and everything.
+And then you have an active Earth that's turning over material.
+And then you get to abiogenesis and everything.
+So it's all a pretty remarkable story.
+I'm not sure that I can pick a single unique piece of it that I find most interesting.
+I guess for me, as an artificial intelligence researcher, it's probably the last piece.
+We have lots of animals that are not building a technological society, but we do.
+And it seems to have happened very quickly.
+It seems to have happened very recently.
+And something very interesting happened there that I don't fully understand.
+I almost understand everything else, I think, intuitively, but I don't understand exactly that part, and how quick it was.
+Both explanations would be interesting.
+One is that this is just a continuation of the same kind of process.
+There's nothing special about humans.
+That would be deeply humbling.
+That would be very interesting: that we think of ourselves as special, but it was obvious.
+It was already written in the code that you would have greater and greater intelligence emerging.
+And then the other explanation is that something truly special happened, something like a rare event, whether it's a crazy rare event like in 2001: A Space Odyssey.
+What would it be?
+See, if you say, like, the invention of fire, or, as Richard Wrangham says, the beta males deciding on a clever way to kill the alpha males by collaborating.
+So just optimizing the collaboration, the multi-agent aspect, and being really constrained on resources and trying to survive: the collaboration aspect is what created the complex intelligence.
+But it seems like it's a natural algorithm, the evolutionary process.
+What could possibly be a magical thing that happened, like a rare thing, that would say that human-level intelligence is actually a really rare thing in the universe?
+Yeah, I'm hesitant to say that it is rare, by the way, but it definitely seems like it's kind of like a punctuated equilibrium, where you have lots of exploration and then you have certain leaps, sparse leaps in between.
+So of course, the origin of life would be one: DNA, sex, eukaryotic life, the endosymbiosis event where an archaeon ate a bacterium, just the whole thing.
+And then, of course, the emergence of consciousness and so on.
+So it seems like there are definitely sparse events where a massive amount of progress was made, but yeah, it's kind of hard to pick one.
+So you don't think humans are unique?
+I've got to ask you: how many intelligent alien civilizations do you think are out there?
+And is their intelligence different or similar to ours?
+Yeah, I've been preoccupied with this question quite a bit recently, basically the Fermi paradox, and just thinking it through.
+The reason, actually, that I am very interested in the origin of life is that I'm fundamentally trying to understand how common it is that there are technological societies out there in space.
+And the more I study it, the more I think that there should be quite a lot.
+Why haven't we heard from them?
+Because I agree with you.
+It feels like I just don't see why what we did here on Earth is so difficult to do.
+Yeah, and especially when you get into the details of it.
+I used to think the origin of life was this magical rare event, but then you read books like, for example, Nick Lane's The Vital Question, Life Ascending, etc., and he really gets in there and he really makes you believe that this is not that rare.
+Basic chemistry.
+You have an active Earth and you have your alkaline vents, and you have lots of alkaline waters mixing with the ocean, and you have your proton gradients, and you have the little porous pockets of these alkaline vents that concentrate chemistry.
+And basically, as he steps through all of these little pieces, you start to understand that actually this is not that crazy.
+You could see this happening on other systems.
+And he really takes you from just geology to primitive life, and he makes it feel like it's actually pretty plausible.
+And also, the origin of life came actually fairly fast after the formation of Earth.
+If I remember correctly, just a few hundred million years or something like that after it was basically possible, life actually arose.
+So that makes me feel like that is not the constraint, that is not the limiting variable, and that life should actually be fairly common.
+And then where the drop-offs are is very interesting to think about.
+I currently think that there are no major drop-offs, basically, and so there should be quite a lot of life.
+And basically where that brings me, then, is that the only way to reconcile the fact that we haven't found anyone and so on is that we just can't see them.
+We can't observe them.
+Just a quick brief comment.
+Nick Lane, and a lot of biologists I've talked to, really seem to think that the jump from bacteria to more complex organisms is the hardest jump.
+Eukaryotic life, basically.
+Yeah, which I don't... I get it.
+They're much more knowledgeable than me about the intricacies of biology, but that seems crazy.
+How many single cell organisms are there?
+And how much time do you have?
+Surely it's not that difficult.
+And a billion years is not even that long of a time, really.
+Just all these bacteria under constrained resources, battling it out.
+I'm sure they can invent more complex organisms.
+I don't understand; it's like learning how to move from a hello world program to inventing a function or something like that.
+Yeah, so I'm with you.
+If the origin of life, that would be my intuition, that that's the hardest thing.
+But if that's not the hardest thing, because it happened so quickly, then it's got to be everywhere.
+And yeah, maybe we're just too dumb to see it.
+Well, it's just that we don't have really good mechanisms for seeing this life.
+So, I'm not an expert, just to preface this, but just from what I've thought about it...
+I want to meet an expert on alien intelligence and how to communicate.
+I'm very suspicious of our ability to find these intelligences out there, and of their ability to find Earth.
+Like, radio waves, for example, are terrible.
+Their power drops off as basically one over r squared.
+So I remember reading that our current radio waves, the ones that we are broadcasting, would not be measurable by our own devices today from, was it like one tenth of a light year away?
+Not even; basically a tiny distance, because you really need a targeted transmission of massive power directed somewhere for this to be picked up at long distances.
+And so I just think that our ability to measure is not amazing.
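+The inverse-square falloff mentioned above is easy to make concrete: an isotropic transmitter spreads its power over a sphere, so received flux drops as one over r squared (the one-megawatt broadcast here is an arbitrary example figure):
+
+    import math
+
+    def flux(power_watts, r_meters):
+        # Power spread over a sphere of area 4*pi*r^2.
+        return power_watts / (4 * math.pi * r_meters**2)
+
+    LY = 9.461e15  # meters in one light year
+    for r in (0.1 * LY, 1.0 * LY, 10.0 * LY):
+        print(f"{r / LY:5.1f} ly: {flux(1e6, r):.3e} W/m^2")
+    # Ten times the distance means one hundred times weaker: untargeted
+    # broadcasts fade into the noise floor almost immediately.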
+I think there are probably other civilizations out there.
+And then the big question is why they don't build von Neumann probes, and why they don't interstellar travel across the entire galaxy.
+And my current answer is that interstellar travel is probably just really hard.
+You have the interstellar medium.
+If you want to move at close to the speed of light, you're going to be encountering bullets along the way, because even tiny hydrogen atoms and little particles of dust basically have massive kinetic energy at those speeds.
+So you need some kind of shielding.
+And you have all the cosmic radiation.
+It's just brutal out there.
+It's really hard.
+And so my thinking is, maybe interstellar travel is just extremely hard, and you have to go very slow.
+Like, billions of years of travel?
+It feels like we're not a billion years away from doing that.
+It just might be that you have to go very slowly, potentially, as an example, through space.
+Right, as opposed to close to the speed of light.
+So I'm suspicious, basically, of our ability to measure life, and I'm suspicious of the ability to just permeate all of space in the galaxy or across galaxies.
+And that's the only way that I can currently see around it.
+It's kind of mind-blowing to think that there are trillions of intelligent alien civilizations out there, kind of slowly traveling through space to meet each other.
+And some of them meet, some of them go to war, some of them collaborate.
+Or they're all just independent.
+They're all just little pockets.
+Well, statistically, if there are trillions of them, surely some of the pockets happen to be close enough to see each other.
+And once you see something that is definitely complex life, if we see something, we're probably going to be intensely, aggressively motivated to figure out what the hell that is and try to meet them.
+But what would be your first instinct: to try, at a generational level, to meet them, or to defend against them?
+What would be your instinct as a president of the United States, and as a scientist?
+I don't know which hat you prefer for this question.
+Yeah, I think the question is really hard.
+I will say, for example, that for us, we have lots of primitive life forms on Earth next to us.
+We have all kinds of ants and everything else, and we share space with them.
+And we are hesitant to impact them; we're trying to protect them by default, because they are amazing, interesting dynamical systems that took a long time to evolve, and they are interesting and special.
+And I don't know that you want to destroy that by default.
+And so I like complex dynamical systems that took a lot of time to evolve.
+I'd like to preserve them if I can afford to.
+And I'd like to think that the same would be true about the galactic resources, and that they would think that we're a kind of incredible, interesting story that took a few billion years to unravel, and you don't want to just destroy it.
+I could see two aliens talking about Earth right now and saying, I'm a big fan of complex dynamical systems, so I think there's value in preserving these, and it basically becomes a video game they watch, or a TV show.
+Yeah, I think you would need a very good reason to destroy it.
+Like, why don't we destroy these ant farms and so on?
+Because we're not actually really in direct competition with them right now.
+We do it accidentally and so on, but there's plenty of resources.
+And so why would you destroy something that is so interesting and precious?
+Well, from a scientific perspective, you might probe it.
+You might interact with it lightly.
+You might want to learn something from it, right?
+So I wonder if there could be certain phenomena that we think are physical phenomena, but are actually them interacting with us, poking the finger and seeing what happens.
+I think it would be very interesting to scientists, other alien scientists, what happened here.
+And what we're seeing today is a snapshot.
+Basically, it's the result of a huge amount of computation over a billion years or something like that.
+So it could have been initiated by aliens.
+This could be a computer running a program.
+Okay, if you had the power to do this, for sure, at least I would: I would pick an Earth-like planet that has the conditions, based on my understanding of the chemistry prerequisites for life, and I would seed it with life and run it.
+Right?
+Like, wouldn't you 100% do that, and observe it, and then protect it?
+I mean, that's not just a hell of a good TV show; it's a good scientific experiment.
+And it is a physical simulation, right?
+Actually running evolution is the most efficient way to compute stuff, to understand what life looks like and what branches it can take.
+It does make me kind of feel weird that we're part of a science experiment, but maybe everything's a science experiment.
+So does that change anything for us, if it's a science experiment?
+I don't know.
+Two descendants of apes talking about being inside of a science experiment.
+I'm suspicious of this idea of a deliberate panspermia, as you described it, sort of.
+I don't see evidence of divine intervention in the historical record right now.
+I do feel like the story in these books, like Nick Lane's books and so on, sort of makes sense, and it makes sense how life arose on Earth uniquely.
+And yeah, I don't need to reach for more exotic explanations right now.
+Sure, but NPCs inside a video game don't observe any divine intervention either.
+We might all just be NPCs running a kind of code.
+Maybe eventually they will.
+Currently NPCs are really dumb, but once they're running GPTs, maybe they will be like, hey, this is really suspicious.
+What the hell?
+So you famously tweeted: it looks like if you bombard Earth with photons for a while, you can emit a Roadster.
+So if, like in The Hitchhiker's Guide to the Galaxy, we were to summarize the story of Earth; in that book, it's "mostly harmless."
+What do you think are all the possible stories, a paragraph long or a sentence long, that Earth could be summarized with once it's done with its computation?
+If Earth is a book, right, there probably has to be an ending.
+I mean, there's going to be an end to Earth, and it could end in all kinds of ways.
+It can end soon.
+It can end later.
+What do you think are the possible stories?
+Well, definitely there seems to be... it's pretty incredible that these self-replicating systems basically arise from the dynamics, and then they perpetuate themselves and become more complex, and eventually become conscious and build a society.
+And I kind of feel like, in some sense, it's kind of like a deterministic wave that just happens on any sufficiently well-arranged system like Earth.
+And so I kind of feel like there's a certain sense of inevitability in it.
+And it's really beautiful.
+And it ends somehow, right?
+So it's a chemically diverse environment where complex dynamical systems can evolve and become further and further complex.
+But then there's a certain... what is it?
+There are certain terminating conditions.
+Yeah, I don't know what the terminating conditions are, but definitely there's a trend line of something, and we're part of that story.
+And where does it go?
+So, you know, we're often famously described as a biological bootloader for AIs, and that's because humans... I mean, we're an incredible biological system, capable of love and so on.
+But we're extremely inefficient as well.
+Like, we're talking to each other through audio.
+It's just kind of embarrassing, honestly, that we're manipulating, like, seven symbols serially; we're using vocal cords; it's all happening over multiple seconds.
+It's just kind of embarrassing when you step down to the frequencies at which computers operate, or are able to communicate on.
+So basically, it does seem like synthetic intelligences are kind of like the next stage of development.
+And I don't know where it leads to.
+Like, at some point, I suspect the universe is some kind of a puzzle, and these synthetic AIs will uncover that puzzle and solve it.
+And then what happens after, right?
+Because if you just fast-forward Earth many billions of years, it's quiet, and then it's turmoil: you see city lights and stuff like that.
+And then what happens at the end?
+Is it like a poof and that's it, or is it a calming, or is it an explosion?
+Does Earth open up like a giant... because you said Roadsters, like, we'll start emitting a giant number of satellites?
+Yes, it's some kind of a crazy explosion, and we're living through it; we're stepping through an explosion, and we're living day to day, and it doesn't look like it.
+But I saw a very cool animation of Earth and life on Earth, and basically nothing happens for a long time, and then in the last two seconds, basically, cities and everything, and low Earth orbit just gets cluttered, and the whole thing happens in the last two seconds.
+And you're like, this thing is exploding.
+It's a state explosion.
+Yeah, if you play it at normal speed, it'll just look like an explosion.
+It's a firecracker.
+We're living in a firecracker.
+Where it's going to start emitting all kinds of interesting things.
+Yeah, so the explosion might actually look like a little explosion, with lights and fire and energy emitted, all that kind of stuff.
+But when you look inside the details of the explosion, there's actual complexity happening, where there's, yeah, human life, or some kind of life.
+We hope it's not a destructive firecracker.
+It's kind of like a constructive firecracker.
+All right, so given that... hilarious discussion... it is really interesting to think about what the puzzle of the universe is.
+Did the creator of the universe give us a message?
+Like, for example, in the book Contact by Carl Sagan, there's a message for humanity, for any civilization, in the digits of the expansion of pi in base 11, eventually.
+Which is kind of an interesting thought: maybe we're supposed to be giving a message to our creator.
+Maybe we're supposed to somehow create some kind of a quantum mechanical system that alerts them to our intelligent presence here.
+Because if you think about it from their perspective, it's just, say, quantum field theory, a massive cellular-automaton-like thing.
+And, like, how do you even notice that we exist?
+You might not even be able to pick us up in that simulation.
+And so how do you prove that you exist, that you're intelligent, and that you're part of the universe?
+So this is like a Turing test for intelligence, from Earth.
+Yeah, for the creator's attention.
+I mean, maybe this is like trying to complete the next word in a sentence.
+This is a complicated way of that; Earth is basically sending a message back.
+Yeah, the puzzle is basically alerting the creator that we exist.
+Or maybe the puzzle is just to break out of the system and, you know, stick it to the creator in some way.
+Basically, like, if you're playing a video game, you can somehow find an exploit, and find a way to execute arbitrary code on the host machine.
+For example, I believe someone got a game of Mario to play Pong just by exploiting it, basically writing code and being able to execute arbitrary code in the game.
+And so maybe that's the puzzle: that we should find a way to exploit it.
+So I think some of these synthetic AIs will eventually find the universe to be some kind of a puzzle and then solve it in some way.
+And that's kind of like the end game somehow.
+Do you often think about it as a simulation?
+As in, the universe being a kind of computation that might have bugs and exploits?
+Yes, I think so.
+As I said, I think it's possible that physics has exploits, and we should be trying to find them: arranging some kind of a crazy quantum mechanical system that somehow gives you a buffer overflow, somehow gives you a rounding error in the floating point.
+Yeah, that's right.
+And more and more sophisticated exploits.
+Those are jokes, but that could actually be very close.
+Yeah, we'll find some way to extract infinite energy.
+For example, when you train reinforcement learning agents in physical simulations, and you ask them to, say, run quickly on flat ground, they'll end up doing all kinds of weird things as part of that optimization, right?
+They'll get on their back leg and slide across the floor, and it's because the optimization, the reinforcement learning optimization on that agent, has figured out a way to extract infinite energy from the friction forces, from basically their poor implementation.
+They found a way to generate infinite energy and just slide across the surface, and it's not what you expected.
+It's sort of like a perverse solution.
+And so maybe we can find something like that.
+Maybe we can be that little dog in this physical simulation.
+That cracks or escapes the intended consequences of the physics that the universe came up with; we'll figure out some kind of shortcut to some weirdness.
+Yeah.
+And then, man... but see, the problem with that weirdness is, the first person to discover the weirdness, like sliding on the back legs, that's all we're going to do.
+Yeah, it'll very quickly become that everybody does that thing.
+So the paperclip maximizer is a ridiculous idea, but that very well could be it: we'll just all switch to that because it's so fun.
+Well, no person will discover it, I think.
+By the way, I think it's going to have to be some kind of a super-intelligent AGI of a third generation.
+Like, we're building the first-generation AGI, and then, you know, third generation.
+Yeah, so the bootloader for an AI; that AI will be a bootloader for another AI.
+Yeah.
+And then there's no way for us to introspect what that might even look like.
+I think it's very likely that these things... for example, say you have these AGIs; it's very likely that, for example, they will be completely inert.
+I like these kinds of sci-fi books, sometimes, where these things are just completely inert; they don't interact with anything.
+And I find that kind of beautiful, because they've probably figured out the meta-meta game of the universe in some way.
+Potentially they're doing something completely beyond our imagination, and they don't interact with simple chemical life forms.
+Like, why would you do that?
+So I find those kinds of ideas compelling.
+What's their source of fun?
+What are they doing?
+Is the source of pleasure solving the puzzle of the universe?
+So can you define what inert means?
+They escape the interaction?
+As in, they will behave in some very strange way to us, because they're beyond it; they're playing the meta game.
+And the meta game is probably, say, arranging quantum mechanical systems in some very weird ways to extract infinite energy, solving the digit expansion of pi to whatever amount; they will build their own little fusion reactors or something crazy.
+Like, they're doing something beyond comprehension, and not understandable to us, and actually brilliant under the hood.
+What if quantum mechanics itself is the system, and we're just thinking it's physics, but we're really parasites on... not parasites; we're not really hurting physics.
+We're just living on this organism, and we're trying to understand it, but really it is an organism, and with a deep, deep intelligence.
+Maybe physics itself is the organism that's doing the super interesting thing, and we're just one little thing sitting on top of it, trying to get energy from it.
+We're just kind of like these particles in a wave that, I feel, is mostly deterministic, and takes the universe from some kind of a big bang to some kind of a super-intelligent replicator, some kind of a stable point in the universe, given these laws of physics.
+You don't think, as Einstein said, God doesn't play dice... so you think it's mostly deterministic?
+There's no randomness in the thing?
+I think it's deterministic.
+Oh, there's tons of... well, I want to be careful with randomness.
+Pseudo-random?
+Yeah, I don't like random.
+I think maybe the laws of physics are deterministic.
+Yeah, I think they're deterministic.
+You just got really uncomfortable with this question.
+Do you have anxiety about whether the universe is random or not?
+There's no randomness.
+You said you like Good Will Hunting.
+It's not your fault, Andrej.
+It's not your fault, man.
+So you don't like randomness?
+Yeah, I think it's unsettling.
+I think it's a deterministic system.
+I think that things that look random, like, say, the collapse of the wave function, et cetera, are actually deterministic: just entanglement and so on, and some kind of a multiverse theory, something, something.
+Okay, so why does it feel like we have free will?
+Like, if I raise this hand, I chose to do this now.
+That doesn't feel like a deterministic thing.
+It feels like I'm making a choice.
+It feels like it.
+Okay, so it's all feelings.
+It's just feelings.
+Yeah.
+So when an RL agent is making a choice, it's not really making a choice?
+The choice was already there.
+Yeah, you're interpreting the choice, and you're creating a narrative for having made it.
+Yeah.
+And now we're talking about the narrative.
+It's very meta.
+Looking back, what is the most beautiful or surprising idea in deep learning or AI in general that you've come across?
+You've seen this field explode and grow in interesting ways.
+What cool ideas made you sit back and go, hmm... big or small?
+Well, the one that I've been thinking about recently the most, probably, is the transformer architecture.
+So basically, neural networks have seen a lot of architectures that were trendy come and go for different sensory modalities.
+Like, for vision, audio, text, you would process them with different-looking neural nets.
+And recently we've seen this convergence towards one architecture, the transformer.
+And you can feed it video, or you can feed it images, or speech, or text, and it just gobbles it up.
+And it's kind of like a bit of a general-purpose computer that is also trainable and very efficient to run on our hardware.
+And so this paper came out in 2016, I want to say: Attention Is All You Need.
+Attention Is All You Need.
+You've criticized the paper title in retrospect: that it didn't foresee the bigness of the impact that it was going to have.
+Yeah, I'm not sure if the authors were aware of the impact that that paper would go on to have.
+Probably they weren't.
+But I think they were aware of some of the motivations and design decisions behind the transformer, and they chose not to, I think, expand on them in that way in the paper.
+And so I think they had an idea that there was more than just the surface of, oh, we're just doing translation and here's a better architecture.
+You're not just doing translation.
+This is like a really cool, differentiable, optimizable, efficient computer that you've proposed.
+And maybe they didn't have all of that foresight, but I think it's really interesting.
+Isn't it funny, sorry to interrupt, that that title is memeable?
+That they went, for such a profound idea... I don't think anyone used that kind of title before, right?
+Attention Is All You Need.
+Yeah, it's like a meme or something.
+Isn't it funny that, like, maybe if it was a more serious title, it wouldn't have had the impact?
+Honestly, there is an element of me that agrees with you and prefers it this way.
+Yes.
+If it was too grand, it would overpromise and then underdeliver, potentially.
+So you want to just meme your way to greatness.
+That should be a t-shirt.
+So you tweeted: the transformer is a magnificent neural network architecture because it is a general-purpose differentiable computer; it is simultaneously expressive in the forward pass, optimizable via backpropagation and gradient descent, and an efficient, high-parallelism compute graph.
+Can you discuss some of those details: expressive, optimizable, efficient, from memory, or in general, whatever comes to your heart?
+You want to have a general-purpose computer that you can train on arbitrary problems, like, say, the task of next word prediction, or detecting if there's a cat in an image or something like that.
+And you want to train this computer, so you want to set its weights.
+And I think there's a number of design criteria that sort of overlap in the transformer simultaneously that made it very successful.
+And I think the authors were deliberately trying to make this really powerful architecture.
+So basically, it's very powerful in the forward pass because it's able to express very general computation as something that looks like message passing.
+You have nodes, and they all store vectors, and these nodes get to basically look at each other's vectors, and they get to communicate.
+And basically, nodes get to broadcast: hey, I'm looking for certain things.
+And then other nodes get to broadcast: hey, these are the things I have.
+Those are the keys and the values.
+So it's not just the attention.
+Yeah, exactly.
+The transformer is much more than just the attention component.
+There are many architectural pieces that went into it: the residual connections, the way the weights are arranged, there's a multi-layer perceptron, the way the layers are stacked, and so on.
+But basically, there's a message-passing scheme where nodes get to look at each other, decide what's interesting, and then update each other.
+And so when you get to the details of it, I think it's a very expressive function, so it can express lots of different types of algorithms in the forward pass.
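+A minimal sketch of that broadcast-and-blend scheme, i.e. scaled dot-product self-attention over a set of node vectors (NumPy, one head, causal masking; all sizes here are arbitrary placeholders):
+
+    import numpy as np
+
+    rng = np.random.default_rng(0)
+    T, C = 8, 16                # 8 nodes (tokens), each storing a 16-dim vector
+    x = rng.normal(size=(T, C))
+
+    Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))
+    q = x @ Wq                  # "Hey, I'm looking for certain things"
+    k = x @ Wk                  # "Hey, these are the things I have"
+    v = x @ Wv                  # what a node hands over once matched
+
+    scores = q @ k.T / np.sqrt(C)                   # how interesting is j to i
+    mask = np.tril(np.ones((T, T)))                 # nodes only look backwards
+    scores = np.where(mask, scores, -np.inf)
+    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
+    weights /= weights.sum(axis=-1, keepdims=True)  # softmax attention
+    out = weights @ v           # each node updates itself with a weighted
+                                # blend of what the interesting nodes have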
+Not only that, but the way it's designed, with the residual connections, the layer normalizations, the softmax attention and everything, it's also optimizable.
+This is a really big deal, because there are lots of computers that are powerful but that you can't optimize, or that are not easy to optimize using the techniques that we have, which are backpropagation and gradient descent.
+These are first-order methods, very simple optimizers, really.
+And so you also need it to be optimizable.
+And then lastly, you want it to run efficiently on our hardware.
+Our hardware is a massive-throughput machine, like GPUs; they prefer lots of parallelism.
+So you don't want to do lots of sequential operations; you want to do a lot of operations in parallel, and the transformer is designed with that in mind as well.
+And so it's designed for our hardware, and it's designed to be both very expressive in the forward pass and also very optimizable in the backward pass.
+And you said that the residual connections support a kind of ability to learn short algorithms fast and first, and then gradually extend them longer during training.
+What's the idea of learning short algorithms?
+Right.
+Think of it as... so basically, a transformer is a series of blocks, right?
+And these blocks have attention and a little multi-layer perceptron.
+And so you go off into a block, and you come back to this residual pathway, and then you go off, and you come back, and then you have a number of layers arranged sequentially.
+And so the way to look at it, I think, is that because of the residual pathway, in the backward pass the gradients sort of flow along it uninterrupted, because addition distributes the gradient equally to all of its branches.
+So the gradient from the supervision at the top just flows directly to the first layer.
+And all the residual connections are arranged so that in the beginning, during initialization, they contribute nothing to the residual pathway.
+So what it kind of looks like is: imagine the transformer is kind of like a Python function, like a def, and you get to write various lines of code.
+Say you have a hundred-layer-deep transformer; typically they would be much shorter, say 20.
+So with 20 lines of code, you can do something in each of them.
+And so think of it as, during the optimization, basically what it looks like is, first you optimize the first line of code, and then the second line of code can kick in, and the third line of code can kick in.
+And I kind of feel like, because of the residual pathway and the dynamics of the optimization, you can sort of learn a very short algorithm that gets the approximate answer, and then the other layers can kick in and start to create a contribution.
+And at the end of it, you're optimizing over an algorithm that is 20 lines of code, except these lines of code are very complex, because each is an entire block of a transformer.
+You can do a lot in there.
+What's really interesting is that this transformer architecture has actually been remarkably resilient.
+Basically, the transformer that came out in 2016 is the transformer you would use today, except you reshuffle some of the layer norms; the layer normalizations have been reshuffled to a pre-norm formulation.
+And so it's been remarkably stable, but there are a lot of bells and whistles that people have attached to it to try to improve it.
+I do think that basically it's a big step in simultaneously optimizing for lots of properties of a desirable neural network architecture.
+And I think people have been trying to change it, but it's proven remarkably resilient.
+But I do think that there should be even better architectures, potentially.
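+In code, the residual pathway and the pre-norm formulation just described look roughly like this (a structural sketch, not any particular implementation; the zeroed-out sub-layers stand in for freshly initialized ones that contribute nothing yet):
+
+    import numpy as np
+
+    def layer_norm(x):
+        return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + 1e-5)
+
+    def block(x, attention, mlp):
+        # Pre-norm: normalize on the way *into* each sub-layer and add the
+        # result back onto the residual pathway, so gradients flow through
+        # the additions straight from the loss down to the first layer.
+        x = x + attention(layer_norm(x))  # go off into attention, come back
+        x = x + mlp(layer_norm(x))        # go off into the MLP, come back
+        return x
+
+    def transformer(x, blocks):
+        for attention, mlp in blocks:     # ~20 such "lines of code", stacked
+            x = block(x, attention, mlp)
+        return x
+
+    # Sub-layers that contribute nothing leave the residual pathway intact:
+    toy = [(lambda h: 0 * h, lambda h: 0 * h)] * 20
+    y = transformer(np.ones((4, 8)), toy)  # y equals x: identity at "init"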
+But you admire the resilience here?
+Yeah.
+There's something profound about this architecture that's so resilient.
+So maybe everything can be turned into a problem that transformers can solve.
+Currently it definitely looks like the transformer is taking over AI, and you can feed basically arbitrary problems into it.
+And it's a general differentiable computer, and it's extremely powerful.
+And this convergence in AI has been really interesting to watch, for me personally.
+What else do you think could be discovered here about transformers?
+Is there a surprising thing, or is it a stable place?
+Is there something interesting we might discover about transformers, like aha moments, maybe having to do with memory, maybe knowledge representation, that kind of stuff?
+Definitely, the zeitgeist today is just pushing; basically, right now the zeitgeist is: do not touch the transformer, touch everything else.
+Yes.
+So people are scaling up the datasets, making them much, much bigger.
+They're working on the evaluation, making the evaluation much, much bigger.
+And they're basically keeping the architecture unchanged.
+And that's kind of been the last five years of progress in AI.
+What do you think about one flavor of it, which is language models?
+Have you been surprised, has your sort of imagination been captivated by, you mentioned GPT, all the bigger and bigger and bigger language models?
+And what are the limits of those models, do you think, just for the task of natural language?
+Basically, the way GPT is trained, right, is you just download a massive amount of text data from the internet and you try to predict the next word in a sequence, roughly speaking.
+You're really predicting little word chunks, but roughly speaking, that's it.
+And what's been really interesting to watch is... basically, it's a language model, and language models have actually existed for a very long time.
+There are papers on language modeling from 2003, even earlier.
+Can you explain, in that case, what a language model is?
+Yeah.
+So, a language model: basically, the rough idea is just predicting the next word in a sequence, roughly speaking.
+There's a paper from, for example, Bengio and his team from 2003, where for the first time they were using a neural network to take, say, three or five words and predict the next word.
+And they're doing this on much smaller datasets, and the neural net is not a transformer, it's a multi-layer perceptron, but it's the first time that a neural network was applied in that setting.
+But even before neural networks, there were language models, except they were using n-gram models.
+N-gram models are just count-based models.
+So if you take two words and try to predict the third one, you just count up how many times you've seen each two-word combination and what came next, and what you predict as coming next is just whatever you've seen the most of in the training set.
+And so language modeling has been around for a long time.
+Neural networks have done language modeling for a long time.
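+The count-based n-gram model just described fits in a few lines; a toy bigram version (the corpus here is a placeholder):
+
+    from collections import Counter, defaultdict
+
+    corpus = "the cat sat on the mat the cat ate".split()
+
+    # Count, for every word, what came next in the training set.
+    counts = defaultdict(Counter)
+    for prev, nxt in zip(corpus, corpus[1:]):
+        counts[prev][nxt] += 1
+
+    def predict(prev):
+        # Predict whatever followed `prev` most often during training.
+        return counts[prev].most_common(1)[0][0]
+
+    print(predict("the"))  # -> "cat" (seen twice, vs. "mat" once)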
+So really, what's new or interesting or exciting is realizing that when you scale it up with a powerful enough neural net, a transformer, you get all these emergent properties.
+Basically, what happens is that if you have a large enough dataset of text, then in the task of predicting the next word, you are multitasking a huge amount of different kinds of problems.
+You are multitasking understanding of, you know, chemistry, physics, human nature.
+Lots of things are sort of clustered in that objective.
+It's a very simple objective, but you actually have to understand a lot about the world to make that prediction.
+You just said the U word, understanding.
+In terms of chemistry and physics and so on, what do you feel like it's doing?
+Is it searching for the right context?
+What is the actual process happening here?
+Yeah, so basically it gets a thousand words, and it's trying to predict the thousand-and-first.
+And in order to do that very, very well over the entire dataset available on the internet, you actually have to basically kind of understand the context of what's going on in there.
+And it's a sufficiently hard problem that, if you have a powerful enough computer, like a transformer, you end up with interesting solutions.
+And you can ask it to do all kinds of things, and it shows a lot of emergent properties, like in-context learning.
+That was the big deal with GPT and the original paper when they published it: that you can just sort of prompt it in various ways and ask it to do various things, and it will just kind of complete the sentence.
+But in the process of just completing the sentence, it's actually solving all kinds of really interesting problems that we care about.
+Do you think it's doing something like understanding?
+Like, when we use the word understanding for us humans?
+I think it's doing some understanding.
+In its weights, it understands, I think, a lot about the world, and it has to, in order to predict the next word in the sequence.
+So it's trained on data from the internet.
+What do you think about this approach, in terms of datasets, of using data from the internet?
+Do you think the internet has enough structured data to teach AI about human civilization?
+Yes, so I think the internet has a huge amount of data.
+I'm not sure if it's a complete enough set.
+I don't know that text is enough for having a sufficiently powerful AGI as an outcome.
+Of course, there is audio and video and images and all that kind of stuff.
+Yeah.
+So text by itself, I'm a little bit suspicious about.
+There's a ton of things we don't put in text, in writing, just because they're obvious to us about how the world works, and the physics of it, and that things fall.
+We don't put that stuff in text, because why would you?
+We share that understanding.
+And so text is a communication medium between humans, and it's not an all-encompassing medium of knowledge about the world.
+But as you pointed out, we do have video and we have images and we have audio.
+And so I think that definitely helps a lot, but we haven't trained models sufficiently across all of those modalities yet.
+So I think that's what a lot of people are interested in.
+But I wonder whether that shared understanding of what we might call common sense has to be learned, inferred, in order to complete the sentence correctly.
+So maybe, because it's implied on the internet, the model is going to have to learn that, not by reading about it, but by inferring it in the representation.
+So, like, common sense: I don't think we learn common sense.
+Nobody tells us explicitly.
+We just figure it all out by interacting with the world.
+And so here's a model reading about the way people interact with the world.
+It might have to infer that, I wonder.
+You briefly worked on a project called World of Bits, training an RL system to take actions on the internet, versus just consuming the internet, like we talked about.
+Do you think there's a future for that kind of system, interacting with the internet to help the learning?
+Yes, I think that's probably the final frontier for a lot of these models.
+Because, as you mentioned, when I was at OpenAI, I was working on this project for a little bit.
+And basically, it was the idea of giving neural networks access to a keyboard and a mouse.
+And the idea is, like, what could possibly go wrong?
+So basically, you perceive the input of the screen pixels; basically, the state of the computer is sort of visualized for human consumption in images of the web browser and stuff like that.
+And then you give the neural network the ability to press keys and use the mouse, and we were trying to get it to, for example, complete bookings and, you know, interact with user interfaces.
+What did you learn from that experience?
+Like, what was some fun stuff?
+This is a super cool idea.
+Yeah, I mean, the step from observer to actor is a super fascinating step.
+Well, it's the universal interface in the digital realm, I would say.
+And there's a universal interface in the physical realm, which in my mind is a humanoid form factor kind of thing.
+We can later talk about Optimus and so on, but I feel like there's kind of a similar philosophy in some way, where the physical world is designed for the human form, and the digital world is designed for the human form of seeing the screen and using keyboard and mouse.
+And so it's the universal interface that can basically command the digital infrastructure we've built up for ourselves.
+And so it feels like a very powerful interface to command and to build on top of.
+Now, to your question as to what I learned from that: it's interesting, because World of Bits was basically too early, I think, at OpenAI at the time.
+This was around 2015 or so, and the zeitgeist at that time in AI was very different from the zeitgeist today.
+At the time, everyone was super excited about reinforcement learning from scratch.
+This is the time of the Atari paper, where neural networks were playing Atari games and beating humans in some cases, AlphaGo and so on.
+So everyone was very excited about training neural networks from scratch using reinforcement learning directly.
+It turns out that reinforcement learning is an extremely inefficient way of training neural networks, because you're taking all these actions and all these observations, and you get some sparse rewards once in a while.
+So you do all this stuff based on all these inputs, and once in a while you're told you did a good thing, you did a bad thing.
+And it's just an extremely hard problem.
+You can't learn from that.
+Well, you can burn a forest and sort of brute-force through it, and we saw that, I think, with Go and Dota and so on, and it does work.
+But it's extremely inefficient, and not how you want to approach problems, practically speaking.
+And so that's the approach that we also took at the time to World of Bits.
+We would have an agent initialized randomly, so it would keyboard-mash and mouse-mash, and try to make a booking.
+And it just revealed the insanity of that approach very quickly: you have to stumble onto the correct booking in order to get the reward of "you did it correctly," and you're never going to stumble onto it by chance, at random.
+So even with a simple web interface, there's too many options.
+There's just too many options, and it's too sparse of a reward signal, and you're starting from scratch at the time.
+And so you don't know how to read, you don't understand pictures, images, buttons, you don't understand what it means to make a booking.
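+A toy illustration of why that from-scratch setup is so hopeless: a made-up "booking" task (the action space and target sequence are invented for illustration) where reward arrives only if a randomly initialized agent happens to emit exactly the right clicks:
+
+    import random
+
+    ACTIONS = range(50)            # pretend: 50 possible clicks/keystrokes
+    TARGET = [17, 3, 42, 8, 25]    # the one sequence that completes a booking
+
+    def episode():
+        # A from-scratch agent keyboard-mashes at random...
+        attempt = [random.choice(ACTIONS) for _ in TARGET]
+        # ...and the reward signal is all-or-nothing and sparse.
+        return 1.0 if attempt == TARGET else 0.0
+
+    hits = sum(episode() for _ in range(100_000))
+    print(hits)  # almost surely 0: the odds are (1/50)**5, ~1 in 312 million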
+But now what's happened is that it is time to revisit that, and OpenAI is interested in this, companies like Adept are interested in this, and so on.
+And the idea is coming back, because the interface is very powerful, but now you're not training an agent from scratch.
+You are taking GPT as an initialization.
+So GPT is pre-trained on all of text, and it understands what's a booking, it understands what's a submit, it understands quite a bit more.
+And so it already has those representations.
+They are very powerful, and that makes all the training significantly more efficient, and makes the problem tractable.
+Should the interaction be the way humans see it, with the buttons and the language, or should it be with the HTML, JavaScript, and CSS?
+Which do you think is better?
+So today, all of this interaction is mostly on the level of HTML, CSS, and so on.
+That's done because of computational constraints.
+But I think ultimately, everything is designed for human visual consumption.
+And so at the end of the day, all the additional information is in the layout of the webpage: what's next to what, what's a red background, all this kind of stuff, and what it looks like visually.
+So I think that's the final frontier: we are taking in pixels, and we're giving out keyboard and mouse commands.
+But I think it's impractical, still, today.
+Do you worry about bots on the internet, given these ideas, given how exciting they are?
+Do you worry about bots on Twitter: not the stupid bots that we see now, the crypto bots, but bots that might actually be out there that we don't see, that are interacting in interesting ways?
+This kind of system feels like it should be able to pass the "I'm not a robot" click button, whatever.
+By the way, do you even understand how that test works?
+I don't quite; there's a checkbox or whatever that you click, and it's presumably tracking mouse movement and timing and so on.
+So exactly this kind of system we're talking about should be able to pass that.
+So yeah, what do you feel about bots that are language models, plus some ability to interact, and are able to tweet and reply and so on?
+Do you worry about that world?
+Yeah, I think it's always been a bit of an arms race between the attack and the defense.
+So the attack will get stronger, but the defense will get stronger as well: our ability to detect them.
+How do you defend?
+How do you detect?
+How do you know that your Karpathy account on Twitter is human?
+How would you approach that?
+Like, if people claimed that it wasn't, how would you defend yourself in a court of law: "I'm a human; this account is human"?
+Yeah, at some point, I think society will evolve a little bit.
+Like, we might start digitally signing some of our correspondence, or things that we create.
+Right now it's not necessary, but maybe in the future it might be.
+I do think that we are going towards a world where we share the digital space with AIs.
+Synthetic beings.
+Yeah, and they will get much better, and they will share our digital realm, and they'll eventually share our physical realm as well.
+That's much harder.
+But that's kind of like the world we're going towards, and most of them will be benign and helpful, and some of them will be malicious, and it's going to be an arms race trying to detect them.
+So, I mean, the worst isn't the AIs; the worst is the AIs pretending to be human.
+And I don't know if it's always malicious.
+There are obviously a lot of malicious applications, but it could also be, you know... if I were an AI, I would try very hard to pretend to be human, because we're in a human world.
+I wouldn't get any respect as an AI.
+I want to get some love and respect.
+I don't think the problem is intractable.
+People are thinking about proof of personhood, and we might start digitally signing our stuff, and we might all end up having, yeah, basically some solution for proof of personhood.
+It doesn't seem intractable to me.
+It's just something that we haven't had to do until now.
+But I think once the need really starts to emerge, which is soon, people will think about it much more.
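+A minimal sketch of what digitally signed correspondence could look like, here using the Ed25519 primitives from the third-party cryptography package (the message is a placeholder; a real proof-of-personhood scheme would also need some way to bind the public key to a person):
+
+    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
+    from cryptography.exceptions import InvalidSignature
+
+    private_key = Ed25519PrivateKey.generate()  # held by the human author
+    public_key = private_key.public_key()       # published alongside the account
+
+    post = b"I wrote this post myself."
+    signature = private_key.sign(post)          # attached to the post
+
+    try:
+        public_key.verify(signature, post)      # anyone can check authorship
+        print("signature valid: plausibly the account holder")
+    except InvalidSignature:
+        print("forged or tampered")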
In the digital space, it just feels like it's going to be very tricky, very
tricky to outpace, because it seems to be pretty low-cost to fake stuff.
What are you going to do, put an AI in jail for trying to use a fake
personhood proof? I mean, okay, fine, you'll put a lot of AIs in jail, but
there will be exponentially more. The cost of creating a bot is very low.
Unless there's some kind of way to track it accurately, like you're not
allowed to create any program without tying yourself to that program. Any
program that runs on the internet, you'd be able to trace every single human
involved with that program.
Yeah, maybe you have to start declaring when, you know, we have to start
drawing those boundaries and keeping track of: okay, what are digital
entities versus human entities, and what is the ownership of human entities
and digital entities, something like that. I don't know, but I think I'm
optimistic that this is possible. In some sense, we're currently in the
worst time of it, because all these bots have suddenly become very capable,
but we don't have the fences yet built up as a society. But that doesn't
seem to me intractable. It's just something that we have to deal with.
It seems weird that the Twitter bots, like the really crappy Twitter bots,
are so numerous. I presume the engineers at Twitter are very good. So what I
would infer from that is that it seems like a hard problem. They're probably
catching them, right? If I were to steelman the case: it's a hard problem,
and there's a huge cost to a false positive, to removing a post by somebody
who's not a bot. That creates a very bad user experience, so they're very
cautious about removing things. And maybe the bots are really good at
learning what gets removed and not, such that they can stay ahead of the
removal process.
My impression of it, honestly, is that there's a lot of low-hanging fruit.
It's not subtle.
It's not subtle. That's my impression as well, but maybe you're seeing just
the tip of the iceberg. Maybe the number of bots is in the trillions and
it's a constant assault of bots. I don't know, you have to steelman the
case, because the bots I'm seeing are pretty obvious. I could write a few
lines of code that catch these bots.
I mean, definitely there's a lot of low-hanging fruit, but I will say, I
agree that if you are a sophisticated actor, you could probably create a
pretty good bot right now, using tools like GPT, because it's a language
model. And you can generate faces that look quite good now, and you can do
this at scale. So I think, yeah, it's quite plausible, and it's going to be
hard to defend against.
There was a Google engineer who claimed that LaMDA was sentient. Do you
think there's any inkling of truth to what he felt? And, more importantly to
me at least, do you think language models will achieve sentience, or the
illusion of sentience, soonish?
Yeah, to me it's a little bit of a canary-in-a-coal-mine moment, honestly,
because this engineer spoke to a chatbot at Google and became convinced that
the bot was sentient. He asked it some existential, philosophical questions,
and it gave reasonable answers and looked real and so on. To me, he wasn't
sufficiently trying to stress the system, I think, and exposing the truth of
it as it is today. But I think this will be increasingly harder over time.
So yeah, I think there will be more and more people like that over time, as
this gets better.
Like, forming an emotional connection to an AI.
Plausible, in my mind. I think these AIs are actually quite good at human
connection, human emotion. A ton of text on the internet is about humans and
connection and love and so on. So I think they have a very good
understanding, in some sense, of how people speak to each other about this,
and they're very capable of creating a lot of that kind of text.
There's a lot of sci-fi from the fifties and sixties that imagined AIs in a
very different way: calculating, cold, Vulcan-like machines. That's not what
we're getting today. We're getting pretty emotional AIs that are actually
very competent and capable of generating plausible-sounding text with
respect to all of these topics.
See, I'm really hopeful about AI systems that are like companions, that help
you grow and develop as a human being, that help you maximize long-term
happiness. But I'm also very worried about AI systems that figure out from
the internet that humans get attracted to drama. And so these would just be
shit-talking AIs. They'll do gossip, they'll try to plant seeds of suspicion
about other humans that you love and trust, and just kind of mess with
people, because that's going to get a lot of attention. Maximize drama on
the path to maximizing engagement, and us humans will feed into that
machine, and it'll be a giant drama shitstorm. So I'm worried about that.
So it's the objective function that really defines the way that human
civilization progresses with AIs in it.
I think right now, at least today, it's not correct to really think of them
as goal-seeking agents that want to do something. They have no long-term
memory or anything. A good approximation of it is: you get a thousand words
and you're trying to predict the thousand-and-first, and then you continue
feeding it in, and you are free to prompt it in whatever way you want.
So in text you say: okay, you are a psychologist, and you are very good, and
you love humans, and here's a conversation between you and another human.
"Human," colon, something; "You," colon, something. And then it just
continues the pattern, and suddenly you're having a conversation with a fake
psychologist who's trying to help you.
So it's still kind of in the realm of a tool. People can prompt it in
arbitrary ways, and it can create really incredible text, but it doesn't
have long-term goals over long periods of time. It doesn't look that way
right now.
Yeah, but you can have short-term goals that have long-term effects.
So if my prompt's short-term goal is to get Andrej Karpathy to respond to me
on Twitter, the AI might figure out that talking shit to you would be the
best way to do that, in a highly sophisticated, interesting way. And then it
builds up a relationship, and over time it doesn't even need to stay
sophisticated, it can just talk shit. And okay, maybe it won't get to
Andrej, but it might get to another celebrity, it might get to other big
accounts. So with just that simple goal: get them to respond, maximize the
probability of an actual response.
Yeah. I mean, you could prompt a powerful model like this for its opinion
about how to do any possible thing you're interested in. They're kind of on
track to become these oracles. I could sort of think of it that way. They
are oracles. Currently it's just text, but they will have calculators, they
will have access to Google search, they will have all kinds of gadgets and
gizmos. They will be able to operate the internet and find different
information. And in some sense, that's kind of what the development
currently looks like.
Do you think it'll eventually be an improvement over what Google is for
access to human knowledge? Like a more effective search engine to access
human knowledge?
I think there's definite scope for building a better search engine today.
And I think Google has all the tools, all the people, everything they need.
They have all the puzzle pieces: they have people training transformers at
scale, they have all the data. It's just not obvious if they are capable as
an organization of innovating on their search engine right now. And if they
don't, someone else will. There's absolute scope for building a
significantly better search engine built on these tools.
It's so interesting. A large company where the search infrastructure already
exists, it works, it brings in a lot of money. So where, structurally,
inside a company is the motivation to pivot? To say, we're going to build a
new search engine?
Yeah, that's hard.
So it's usually going to come from a startup, right?
That would be, yeah. Or some other more competent organization. So I don't
know. Currently, for example, maybe Bing has another shot at it, as an
example.
Microsoft Edge, as we were talking about offline.
I mean, it definitely is really interesting, because search engines used to
be about: okay, here's some query, here are web pages that look like the
stuff that you have. But you could just directly go to the answer, and then
have supporting evidence. And these models, basically, they've read all the
text and they've read all the web pages. So sometimes when you see yourself
going over the search results and sort of getting a sense of the average
answer to whatever you're interested in, that just directly comes out. You
don't have to do that work. So they have a way of distilling all that
knowledge into some level of insight, basically.
Do you think of prompting as a kind of teaching and learning, like this
whole process, like another layer? Because maybe that's what humans are: we
already have that background model, and the world is prompting you.
Yeah, exactly.
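Here is a minimal sketch of the "fake psychologist" prompting pattern
described above: the conversation is just text that the model keeps
continuing. The complete() function is a canned stand-in for a real large
language model call.

    def complete(prompt: str) -> str:
        # Stand-in for a real LLM call; a real one would return the model's
        # continuation of the prompt. Canned here so the sketch runs.
        return "It sounds like a lot has been weighing on you. Tell me more."

    # Program the model purely in text: cast it as a psychologist and let
    # next-word prediction continue the pattern.
    history = (
        "You are a psychologist. You are very good and you love humans. "
        "Here is a conversation between you and another human.\n"
    )

    def chat_turn(history: str, user_message: str) -> str:
        history += f"Human: {user_message}\nYou:"
        reply = complete(history)            # the model continues the pattern
        return history + " " + reply + "\n"  # feed the reply back into context

    history = chat_turn(history, "I've been feeling anxious lately.")
    print(history)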
I think the way we are programming these models, like GPTs, is converging to
how you program humans. I mean, how do I program humans? Via prompts. I go
to people and I prompt them to do things, I prompt them for information. And
so natural language prompts are how we program humans, and we're starting to
program computers directly in that interface. It's pretty remarkable,
honestly.
So you've spoken a lot about the idea of software 2.0. All good ideas become
cliches so quickly. It's kind of hilarious. It's like, I think Eminem once
said that if he gets annoyed by a song he's written very quickly, that means
it's going to be a big hit, because it's too catchy. But can you describe
this idea, and how your thinking about it has evolved over the months and
years since you coined it?
Yeah, I had a blog post on software 2.0, I think several years ago now. And
the reason I wrote that post is because I kept seeing something remarkable
happening in software development: a lot of code was being transitioned to
be written not in C++ and so on, but in the weights of a neural net.
Basically, neural nets were taking over the realm of software, taking on
more and more tasks. And at the time, I think not many people understood
deeply enough that this is a big deal, a big transition. Neural networks
were seen as one of multiple classification algorithms you might use for
your data set problem on Kaggle. But this is a change in how we program
computers. And I saw neural nets as: this is going to take over. The way we
program computers is going to change. It's not going to be people writing
software in C++ or something like that and directly programming the
software. It's going to be accumulating training sets and data sets and
crafting the objectives by which you train these neural nets. And at some
point, there's going to be a compilation process from the data sets and the
objective and the architecture specification into the binary, which is
really just the neural net weights and the forward pass of the neural net.
And then you can deploy that binary. So I was talking about that transition,
and that's what the post is about.
And I saw this play out in a lot of fields, autopilot being one of them, but
also just simple image classification. People thought originally, you know,
in the eighties and so on, that they would write the algorithm for detecting
a dog in an image. And they had all these ideas about how the brain does it:
first we detect corners, and then we detect lines, and then we stitch them
up. They were really going at it, thinking about how they were going to
write the algorithm. And that's not the way you build it. There was a smooth
transition where, okay, first we thought we were going to build everything.
Then we were building the features, like HOG features and things like that,
that detect these little statistical patterns from image patches. And then
there was a little bit of learning on top of it, like a support vector
machine, a binary classifier for cat versus dog, on top of the features. So
we wrote the features, but we trained the last layer, the classifier. And
then people said: actually, let's not even design the features, because we
can't. Honestly, we're not very good at it.
So let's also learn the features. And then you end up with basically a
convolutional neural net, where you're learning most of it. You're just
specifying the architecture, and the architecture has tons of
fill-in-the-blanks, which are all the knobs, and you let the optimization
write most of it. And so this transition is happening across the industry
everywhere, and suddenly we end up with a ton of code that is written in
neural net weights. I was just pointing out that the analogy is actually
pretty strong: we have a lot of developer environments for software 1.0,
like IDEs, how you work with code, how you debug code, how you run code, how
you maintain code. We have GitHub. So I was trying to make those analogies
in the new realm. Like, what is the GitHub of software 2.0? It turns out
it's something that looks like Hugging Face right now. And so I think some
people took it seriously and built cool companies. And many people
originally attacked the post. It actually was not well received when I wrote
it; I think maybe it has something to do with the title. But more people
have been coming around to it over time.
Yeah. So you were the director of AI at Tesla, where I think this idea was
really implemented at scale, which is how you have engineering teams doing
software 2.0. So can you linger on that idea? I think we're in the really
early stages of everything you just said, the GitHub, the IDEs. How do we
build engineering teams that work in software 2.0 systems, and the data
collection and the data annotation, which is all part of that software 2.0?
What do you think is the task of programming in software 2.0? Is it
debugging in the space of hyperparameters, or is it also debugging in the
space of data?
Yeah. The way you program the computer and influence its algorithm is not by
writing the commands yourself. You're changing mostly the data set. You're
changing the loss functions, what the neural net is trying to do, how it's
trying to predict things. But basically, the data sets and the architecture
of the neural net. And so in the case of the autopilot, a lot of the data
sets have to do with, for example, detection of objects and lane line
markings and traffic lights and so on. So you accumulate massive data sets
of: here's an example, here's the desired label. And then here's roughly
what the algorithm should look like, and that's a convolutional neural net.
So the specification of the architecture is like a hint as to what the
algorithm should roughly look like, and then the fill-in-the-blanks process
of optimization is the training process. And then you take your neural net
that was trained, it gives all the right answers on your data set, and you
deploy it.
So in that case, and perhaps in all machine learning cases, there are a lot
of tasks. Is formulating a task, like for a multi-headed neural network,
part of the programming?
Yeah, very much so. How you break down a problem into a set of tasks.
At a high level, I would say, if you look at the software running in the
autopilot, and I gave a number of talks on this topic, originally a lot of
it was written in software 1.0. Imagine lots of C++, right?
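To ground the feature-engineering transition he described a moment ago, here
is a toy sketch of both eras: HOG features with a trained classifier on top,
versus a small convnet where optimization fills in all the blanks. The data,
shapes, and training details are made up for illustration and are not any
production stack.

    import numpy as np
    from skimage.feature import hog       # era 1: hand-designed features
    from sklearn.svm import LinearSVC
    import torch
    import torch.nn as nn

    images = np.random.rand(100, 64, 64)   # stand-ins for cat/dog photos
    labels = np.random.randint(0, 2, 100)

    # Era 1: humans write the features; learning happens only in the last layer.
    feats = np.stack([hog(im, pixels_per_cell=(8, 8)) for im in images])
    clf = LinearSVC().fit(feats, labels)

    # Era 2: specify an architecture full of blanks (the knobs) and let the
    # optimization write the features and the classifier together.
    net = nn.Sequential(
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(16, 2),
    )
    x = torch.tensor(images, dtype=torch.float32).unsqueeze(1)
    y = torch.tensor(labels, dtype=torch.long)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(10):                    # the "compilation" step: training
        opt.zero_grad()
        loss = nn.functional.cross_entropy(net(x), y)
        loss.backward()
        opt.step()

In the first era, only clf is learned; in the second, every knob in net is.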
And then gradually there was a tiny neural net that was, for example,
predicting, given a single image, is there a traffic light or not? Or is
there a lane line marking or not? And this neural net didn't have too much
to do in the scope of the software: it was making tiny predictions on an
individual little image, and then the rest of the system stitched them up.
So, okay, we don't actually have just a single camera, we have eight
cameras, and we actually have eight cameras over time. So what do you do
with these predictions? How do you put them together? How do you do the
fusion of all that information, and how do you act on it? All of that was
written by humans in C++. And then we decided, okay, we don't actually want
to do all of that fusion in C++ code, because we're actually not good enough
to write that algorithm. We want the neural nets to write the algorithm, and
we want to port all of that software into the 2.0 stack. And so then we
actually had neural nets that take all eight camera images simultaneously
and make predictions for all of that. And they don't make predictions in the
space of images anymore; they make predictions directly in 3D, in the three
dimensions around the car. And now we don't manually fuse those 3D
predictions over time; we don't trust ourselves to write that tracker. So we
give the neural net the information over time: it takes these videos now and
makes those predictions. And so you're just putting more and more power into
the neural net, more processing. And at the end of it, the eventual goal is
to have most of the software potentially be in the 2.0 land, because it
works significantly better. Humans are just not very good at writing
software, basically.
So the prediction is happening in this 4D land, the three-dimensional world
over time. How do you do annotation in that world? Data annotation, whether
it's self-supervised or manual by humans, is a big part of the software 2.0
world.
Right. I would say, by far, if you're talking about the industry and the
technology we have available, everything is supervised learning. So you need
data sets of input and desired output, and you need lots of them. And there
are three properties that you need: you need the data set to be very large,
you need it to be accurate, no mistakes, and you need it to be diverse. You
don't want to just have a lot of correct examples of one thing. You need to
really cover the space of possibility as much as you can, and the more you
can cover the space of possible inputs, the better the algorithm will work
at the end. Now, once you have really good data sets that you're collecting,
curating, and cleaning, you can train your neural net on top of that. So a
lot of the work goes into cleaning those data sets. Now, as you pointed out,
the question is: if you want to predict in 3D, you need data in 3D to back
that up. So in this video, we have eight videos coming from all the cameras
of the system, and this is what they saw. And this is the truth of what
actually was around: there was this car, this car, this car, these are the
lane line markings, this is the geometry of the road, there was a traffic
light in this three-dimensional position. You need the ground truth. And so
the big question the team was solving, of course, is how do you arrive at
that ground truth?
Because once you have a million of them, and they're large, clean, and
diverse, then training a neural net on them works extremely well, and you
can ship that into the car. And so there are many mechanisms by which we
collected that training data. You can always go for human annotation. You
can go for simulation as a source of ground truth. You can also go for what
we call the offline tracker, which we've spoken about at AI Day and so on,
which is basically an automatic reconstruction process for taking those
videos and recovering the three-dimensional reality of what was around the
car. So basically, think of doing a three-dimensional reconstruction as an
offline thing, and then understanding that, okay, there are 10 seconds of
video, this is what we saw, and therefore here are all the lane lines, cars,
and so on. And then once you have that annotation, you can train your neural
net to imitate it.
And how difficult is the 3D reconstruction?
It's difficult, but it can be done.
So there's overlap between the cameras and you do the reconstruction. And if
there's any inaccuracy, that's caught in the annotation step?
Yes. The nice thing about the annotation is that it is fully offline. You
have infinite time. You have a chunk of one minute, and you're trying, just
offline, on a supercomputer somewhere, to figure out where the positions of
all the cars and all the people were, and you have your full one minute of
video from all the angles. And you can run all the neural nets you want, and
they can be massive neural nets. There can be neural nets that can't even
run in the car later at test time, so they can be even more powerful neural
nets than what you can eventually deploy. So you can do anything you want,
three-dimensional reconstruction, neural nets, anything, just to recover
that truth, and then you supervise on that truth.
What have you learned, you said "no mistakes," about humans doing
annotation? Because I assume humans have a range of things they're good at,
in terms of clicking stuff on a screen. How interesting a problem is it to
you to design an annotation process where humans are accurate, enjoy it, and
are efficient or productive, all that kind of stuff? What are even the
metrics?
Yeah. So I grew the annotation team at Tesla from basically zero to a
thousand while I was there. That was really interesting. My background is as
a PhD student and researcher, so growing that kind of an organization was
pretty crazy. But yeah, I think it's extremely interesting, and very much
part of the design process behind the autopilot, as to where you use humans.
Humans are very good at certain kinds of annotations. They're very good, for
example, at two-dimensional annotations of images. They're not good at
annotating cars over time in three-dimensional space; that's very, very
hard. And so that's why we were very careful to design the tasks that are
easy for humans to do, versus the things that should be left to the offline
tracker. Maybe the computer will do all the triangulation and 3D
reconstruction, but the human will say: exactly these pixels of the image
are a car, exactly these pixels are a human. And so co-designing the data
annotation pipeline was very much the bread and butter of what I was doing
daily.
Do you think there are still a lot of open problems in that space?
Just in general, annotation, where the machines do the stuff the machines
are good at, the humans do what they're good at, and maybe there's some
iterative process.
Right. I think to a very large extent, we went through a number of
iterations and we learned a ton about how to create these data sets. I'm not
seeing big open problems. Originally, when I joined, I was really not sure
how this would turn out. But by the time I left, I felt much more secure and
understood the philosophy of how to create these data sets, and I was pretty
comfortable with where that was at the time.
So what are the strengths and limitations of cameras for the driving task,
in your understanding, when you formulate the driving task as a vision task
with eight cameras? You've seen most of the history of the computer vision
field as it has to do with neural networks. If you step back, what are the
strengths and limitations of pixels, of using pixels to drive?
Yeah. Pixels, I think, are a beautiful sensor, I would say. The thing is,
cameras are very, very cheap and they provide a ton of information, a ton of
bits. It's an extremely cheap sensor for a ton of bits, and each one of
these bits is a constraint on the state of the world. And so you get lots of
megapixel images, very cheap, and they just give you all these constraints
for understanding what's actually out there in the world. So vision is
probably the highest bandwidth sensor. It's a very high bandwidth sensor.
I love that: pixels as a constraint on the world. It's this highly complex,
high bandwidth constraint on the state of the world.
And it's not just that. Again, there's the real importance of it being the
sensor that humans use, and therefore everything is designed for that
sensor: the text, the writing, the flashing signs, everything is designed
for vision. And so you just find it everywhere. That's why this is the
interface you want to be in, talking again about these universal interfaces.
That's where we actually want to measure the world as well, and then develop
software for that sensor.
But there are other constraints on the state of the world that humans use to
understand the world. I mean, vision ultimately is the main one, but we're
referencing our understanding of human behavior and some common sense
physics that could be inferred from vision, from a perception perspective.
It feels like we're using some kind of reasoning to predict the world, not
just the pixels.
I mean, you have powerful priors for how the world evolves over time, et
cetera. So it's not just about the likelihood term coming up from the data
itself, telling you about what you are observing, but also the prior term of
what things you are likely to see and how they are likely to move, and so
on.
And the question is how complex the range of possibilities that might happen
in the driving task is. Is that still an open problem to you, how difficult
driving is, philosophically speaking? In all the time you worked on driving,
did you come to understand how hard driving is?
Yeah, driving is really hard, because it has to do with the predictions of
all these other agents, and the theory of mind, and what they're going to
do. Are they looking at you? Where are they looking? What are they thinking?
There's a lot that goes on there, in the full tail of the expansion of the
nines that we have to be comfortable with eventually. The final problems are
of that form.
I don't think those problems are very common. I think eventually they're
important, but it's really in the tail end.
In the tail end, the rare edge cases. From the vision perspective, what are
the toughest parts of the vision problem of driving?
Well, basically, the sensor is extremely powerful, but you still need to
process that information. And so going from the brightnesses of the pixel
values to, hey, here's the three-dimensional world, is extremely hard. And
that's what the neural networks are fundamentally doing. So the difficulty
really is in just doing an extremely good job of engineering the entire
pipeline, the entire data engine: having the capacity to train these neural
nets, having the ability to evaluate the system and iterate on it. I would
say just doing this in production at scale is the hard part. It's an
execution problem.
So the data engine, but also the deployment of the system, such that it has
low-latency performance. It has to do all these steps.
Yeah. For the neural net specifically, just making sure everything fits into
the chip on the car. You have a finite budget of flops that you can perform,
and memory bandwidth, and other constraints, and you have to make sure it
flies. You squeeze as much compute as you can into the tiny chip.
What have you learned from that process? Because maybe that's one of the
bigger new things, coming from a research background, where there's a system
that has to run under heavily constrained resources, that has to run really
fast. What kind of insights have you learned from that?
Yeah, I'm not sure there are too many insights. You're trying to create a
neural net that will fit in what you have available, and you're always
trying to optimize it. We talked a lot about it at AI Day, the triple
backflips that the team is doing to make sure it all fits and utilizes the
engine. So I think it's extremely good engineering, and then there are all
kinds of little insights peppered in on how to do it properly.
Let's actually zoom out, because we haven't talked about the data engine,
the entirety of the layout of this idea that I think is just beautiful, with
humans in the loop. Can you describe the data engine?
Yeah. The data engine is what I call the almost biological-feeling process
by which you perfect the training sets for these neural networks, because
most of the programming now is at the level of these data sets, making sure
they're large, diverse, and clean. Basically, you have a data set that you
think is good, you train your neural net, you deploy it, and then you
observe how well it's performing. And you're trying to always increase the
quality of your data set. So you're trying to catch scenarios that are rare.
It is in these scenarios that the neural nets will typically struggle,
because they weren't told what to do in those rare cases in the data set.
But now you can close the loop: if you can collect all of those at scale,
you can feed them back into the reconstruction process I described,
reconstruct the truth in those cases, and add it to the data set. And so the
whole thing ends up being like a staircase of improvement, of perfecting
your training set. And you have to go through deployments so that you can
mine the parts that are not yet represented well in the data set. So your
data set is basically imperfect; it needs to be diverse.
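The loop he is describing, as a minimal runnable sketch; every function body
here is a trivial stand-in for a large real-world subsystem:

    def train(model, dataset):
        return model                          # stand-in: fit on the current set

    def deploy_and_observe(model):
        return ["fleet telemetry"]            # stand-in: scenarios logged in the field

    def mine_failures(telemetry):
        return ["rare case the net struggled on"]  # stand-in: trigger logic

    def offline_reconstruction(cases):
        return [(c, "recovered ground truth") for c in cases]  # offline tracker

    dataset, model = [], object()
    for _ in range(3):                        # the staircase of improvement
        model = train(model, dataset)
        telemetry = deploy_and_observe(model)
        rare = mine_failures(telemetry)
        dataset += offline_reconstruction(rare)   # pad out the missing pockets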
It has pockets that are missing, and you need to pad out the pockets. You
can sort of think of it that way in the data.
What role do humans play in this? So what's this biological system, like a
human body made up of cells? How do you optimize the human system? The
multiple engineers collaborating, figuring out what to focus on, what to
contribute, which tasks to optimize in this neural network. Who is in charge
of figuring out which task needs more data? Can you speak to the
hyperparameters of the human system?
It really just comes down to extremely good execution from an engineering
team that knows what it's doing. They understand intuitively the
philosophical insights underlying the data engine and the process by which
the system improves, and how to delegate the strategy of the data collection
and how that works, and then just making sure it's all extremely well
executed. And that's where most of the work is: not the philosophizing or
the research or the ideas of it, but extremely good execution. It's so hard
when you're dealing with data at that scale.
So your role in the data engine, executing well on it, is difficult and
extremely important. Is there a priority list, like a vision board, saying,
we really need to get better at stoplights? The prioritization of tasks, is
that essentially it, and does that come from the data?
That comes, to a very large extent, from what we are trying to achieve in
the product roadmap, the release we're trying to get out, and the feedback
from the QA team, where the system is struggling or not, the things we're
trying to improve.
And the QA team gives some signal, some information, in aggregate, about the
performance of the system in various conditions.
That's right. And then, of course, all of us drive it, and we can also see
it. It's really nice to work with a system that you can also experience
yourself, and it drives you home.
Is there some insight you can draw from your individual experience that you
just can't quite get from an aggregate statistical analysis of data? It's so
weird, right?
Yes.
It's not scientific, in a sense, because you're just one anecdotal sample.
Yeah, I think it's a source of truth. It's your interaction with the system.
You can see it, you can play with it, you can perturb it, you can get a
sense of it, you have an intuition for it. Numbers and plots and graphs are
much harder. They hide a lot.
It's like if you train a language model: a really powerful way to understand
it is by interacting with it.
Yeah, a hundred percent. You build up an intuition. And I think Elon also,
he always wanted to drive the system himself. He drives a lot, I want to say
almost daily. So he also sees this as a source of truth: you driving the
system and seeing it perform.
So, tough questions here. Tesla last year removed radar from the sensor
suite, and now it has just announced that it's going to remove ultrasonic
sensors, relying solely on vision, camera only. Does that make the
perception problem harder or easier?
I would almost reframe the question in some way. So the thing is basically,
you would think that additional sensors...
By the way, can I just interrupt?
Go ahead.
I wonder if a language model will ever do that, if you prompt it: "Let me
reframe your question." That would be epic.
That's the wrong prompt. Sorry.
It's a little bit of the wrong question, because basically you would think
that these sensors are an asset to you. But if you fully consider the entire
product in its entirety, these sensors are actually potentially a liability,
because these sensors aren't free. They don't just appear on your car.
Suddenly you need to have an entire supply chain. You have people procuring
them. There can be problems with them. They may need replacement. They are
part of the manufacturing process. They can hold back the line in
production. You need to source them, you need to maintain them, you have to
have teams that write the firmware, all of it. And then you also have to
incorporate them, fuse them into the system in some way. And so it actually
bloats a lot of it. And I think Elon is really good at simplify, simplify.
Best part is no part. And he always tries to throw away things that are not
essential, because he understands the entropy in organizations and in the
approach. And I think in this case, the cost is high, and you're potentially
not seeing it if you're just a computer vision engineer trying to improve
your network: is the sensor more useful or less useful? How useful is it?
The thing is, once you consider the full cost of a sensor, it actually is
potentially a liability, and you need to be really sure that it's giving you
extremely useful information. In this case, we looked at using it or not
using it, and the delta was not massive, and so it's not useful.
Is it also bloat in the data engine, having more sensors? Is it a
distraction?
And these sensors, you know, they can change over time. For example, you can
have one type of radar, you can have another type of radar. They change over
time, and now you suddenly need to worry about it. Now suddenly you have a
column in your SQLite database telling you, oh, what sensor type was it? And
they all have different distributions, and they contribute noise and entropy
into everything, and they bloat stuff. And also, organizationally, it has
been really fascinating to me how distracting it can be. If all you want to
get to work is vision, all the resources are on it, and you're building out
a data engine, and you're actually making forward progress, because that is
the sensor with the most bandwidth, the most constraints on the world, and
you're investing fully into that, and you can make that extremely good. You
only have a finite amount of spend, of focus, across the different facets of
the system.
And this kind of reminds me of Rich Sutton's "The Bitter Lesson." It just
seems like simplifying the system, in the long run, and of course you don't
know what the long run is, seems to be always the right solution.
Yes. In that case, it was for RL, but it seems to apply generally across all
systems that do computation.
So what do you think about the lidar-as-a-crutch debate? The battle between
point clouds and pixels.
Yeah, I think this debate is always slightly confusing to me, because it
seems like the actual debate should be about: do you have the fleet or not?
That's the really important thing for whether you can achieve a really good,
functioning AI system at this scale.
So data collection systems.
Yeah. Whether you have a fleet or not is significantly more important than
whether you have lidar or not. It's just another sensor.
And yeah, I think similar to the radar discussion, I basically don't think
it offers extra information. It's extremely costly. It has all kinds of
problems. You have to worry about it, you have to calibrate it, et cetera.
It creates bloat and entropy. You have to be really sure that you need this
sensor. In this case, I basically don't think you need it. And honestly, I
will make a stronger statement: I think some of the other companies that are
using it are probably going to drop it.
Yeah. So you have to consider the sensor in the full picture: can you build
a big fleet that collects a lot of data, and can you integrate that sensor
and that data into a data engine that's able to quickly find the different
parts of the data that then continuously improve whatever model you're
using?
Yeah. Another way to look at it is: vision is necessary, in the sense that
the world is designed for human visual consumption, so you need vision. It's
necessary. And then it is also sufficient, because it has all the
information that you need for driving, and humans obviously use vision to
drive. So it's both necessary and sufficient. So you want to focus your
resources, and you have to be really sure if you're going to bring in other
sensors. You could add sensors to infinity; at some point you need to draw
the line. And I think in this case, you have to really consider the full
cost of any one sensor that you're adopting, and whether you really need it.
And I think the answer in this case is no.
So what do you think about the approach where the other companies are
forming high-resolution maps and heavily constraining the geographic regions
in which they operate? Is that approach, in your view, not going to scale
over time to the entirety of the United States?
I think, as you mentioned, they pre-map all the environments, and they need
to refresh the map, and they have a perfect, centimeter-level-accuracy map
of everywhere they're going to drive. It's crazy. We've been talking about
autonomy actually changing the world, the deployment on a global scale of
autonomous systems for transportation. And if you need to maintain a
centimeter-accurate map for Earth, or for many cities, and keep it updated,
that's a huge dependency that you're taking on. A massive, massive
dependency. And now you need to ask yourself: do you really need it? Humans
don't need it. It's very useful to have a low-level map of, okay, the
connectivity of your roads, that there's a fork coming up. When you drive an
environment, you have that high-level understanding. It's like a small
Google map, and Tesla uses Google-map-like, similar-resolution information
in the system. But it will not pre-map environments to centimeter-level
accuracy. It's a crutch. It's a distraction. It costs entropy, and it
diffuses the team, it dilutes the team, and you're not focusing on what's
actually necessary, which is the computer vision problem.
What did you learn about machine learning, about engineering, about life,
about yourself, as one human being, from working with Elon Musk?
I think the most I've learned is about how to run organizations efficiently,
how to create efficient organizations, and how to fight entropy in an
organization.
So the human engineering in the fight against entropy?
Yeah. I think Elon is a very efficient warrior in the fight against entropy
in organizations.
What does entropy in an organization look like?
It's process. It's process and inefficiencies in the form of meetings and
that kind of stuff.
Yeah, meetings.
He hates meetings. He keeps telling people to skip meetings if they're not
useful. He basically runs the world's biggest startups, I would say. Tesla
and SpaceX are the world's biggest startups. Tesla is actually multiple
startups; I think it's better to look at it that way. And so I think he's
extremely good at that. He has a very good intuition for streamlining
processes, making everything efficient. Best part is no part. Simplifying,
focusing, and just removing barriers, moving very quickly, making big moves.
All of this is very startupy-seeming stuff, but at scale.
So a strong drive to simplify. From your perspective, I mean, that also
probably applies to just designing systems, in machine learning and
otherwise. Simplify, simplify.
Yes.
What do you think is the secret to maintaining the startup culture in a
company that grows? Can you introspect on that?
I do think you need someone in a powerful position with a big hammer, like
Elon, who is the cheerleader for that idea and ruthlessly pursues it. If no
one has a big enough hammer, everything turns into committees, democracy
within the company, process, talking to stakeholders, decision-making, and
everything just crumbles. If you have a big person who's also really smart
and has a big hammer, things move quickly.
So you said your favorite scene in Interstellar is the intense docking scene
with the AI and Cooper talking: "Cooper, what are you doing?" "Docking."
"It's not possible." "No, it's necessary." Such a good line. By the way, so
many questions there. Why, in that scene, is the AI, which presumably can
compute a lot more than the human, the one saying it's not possible? I mean,
it's a movie, but shouldn't the AI know much better than the human? Anyway,
what do you think is the value of setting seemingly impossible goals,
against our initial intuition? It seems like something that you have taken
on, that Elon espouses: the initial intuition of the community might say
this is very difficult, and then you take it on anyway, with a crazy
deadline. Just from a human engineering perspective, have you seen the value
of that?
I wouldn't say that setting impossible goals, exactly, is a good idea, but I
think setting very ambitious goals is a good idea. There's what I call
sub-linear scaling of difficulty, which means that 10x problems are not 10x
hard. Usually a 10x-harder problem is like 2 or 3x harder to execute on. If
you want to improve a system by 10 percent, it costs some amount of work; if
you want to 10x-improve the system, it doesn't cost 100x the amount of work.
That's because you fundamentally change the approach. If you start with that
constraint, then some approaches are obviously dumb and not going to work,
and it forces you to reevaluate. I think it's a very interesting way of
approaching problem-solving.
But it requires a weird kind of thinking. Going back to your PhD days, how
do you think about which ideas in the machine learning community are
solvable?
Yes.
What is that? There's the cliche of first-principles thinking, but it
requires you to basically ignore what the community is saying, because
doesn't a community in science usually draw the lines of what is and isn't
possible? And it's very hard to break out of that without going crazy.
I think a good example here is the deep learning revolution, in some sense,
because you could be in computer vision at that time, during the deep
learning revolution of 2012 and so on, and you could be improving a computer
vision stack by 10 percent, or you could just be saying: actually, all of
this is useless. How do I get 10x better computer vision? Well, it's
probably not by tuning a HOG feature detector. I need a different approach.
I need something that is scalable. Going back to Richard Sutton and
understanding the philosophy of the bitter lesson, and then saying:
actually, I need a much more scalable system, like a neural network, that in
principle works. And then having some deep believers who can actually
execute on that mission and make it work. That's the 10x solution.
What do you think is the timeline to solve the problem of autonomous
driving? That's still, in part, an open question.
Yeah. I think the tough thing with timelines for self-driving, obviously, is
that no one has created self-driving. It's not like: what do you think is
the timeline to build this bridge? Well, we've built a million bridges
before; here's how long that takes. No one has built autonomy. It's not
obvious. Some parts turn out to be much easier than others, so it's really
hard to forecast. You do your best based on trend lines and so on, and based
on intuition, but that's why, fundamentally, it's just really hard to
forecast.
Even being inside of it, it's hard to do.
Yes. Some things turn out to be much harder, and some things turn out to be
much easier.
Do you try to avoid making forecasts? Because Elon doesn't avoid them,
right? Heads of car companies in the past haven't avoided it either. Ford
and other places have made predictions that we're going to solve level-four
driving by 2020, 2021, whatever, and they're all kind of backtracking on
those predictions. As an AI person, do you, for yourself, privately make
predictions, or do they get in the way of your actual ability to think about
the thing?
Yeah, I would say what's easy to say is that this problem is tractable, and
that's an easy prediction to make. It's tractable. It's going to work. It's
just really hard. Some things turn out to be harder, and some things turn
out to be easier. It definitely feels tractable, and it feels like at least
the team at Tesla, which is what I saw internally, is definitely on track to
that.
How do you form a strong representation that allows you to make a prediction
about tractability? You're the leader of a lot of humans; you have to say,
this is actually possible. How do you build up that intuition? It doesn't
have to be driving; it could be other tasks. What difficult tasks did you
work on in your life? Classification, achieving a certain superhuman level
of performance on ImageNet?
Yeah, expert intuition. It's just intuition. It's belief.
So just thinking about it long enough, studying, looking at sample data.
Like you said, driving: my intuition is really flawed on this. I don't have
a good intuition about tractability. It could be anything. It could be
solvable. The driving task could be simplified into something quite trivial;
the solution to the problem could be quite trivial. At scale, more and more
cars driving perfectly might make the problem much easier. The more cars you
have driving, the more people learn how to drive, not correctly, but in a
way that's more optimal for a heterogeneous system of autonomous,
semi-autonomous, and manually driven cars.
That could change stuff. Then again, I've also spent a ridiculous number of
hours just staring at pedestrians crossing streets, thinking about humans.
It feels like the way we use eye contact sends really strong signals, and
there are certain quirks and edge cases of behavior. And of course, a lot of
the fatalities that happen have to do with drunk driving, on both the
pedestrian side and the driver side, and there's the problem of driving at
night, and all that kind of stuff. The space of possible solutions to
autonomous driving includes so many human-factor issues that it's almost
impossible to predict. There could be super clean, nice solutions.
I would say definitely, to use a game analogy, there's some fog of war, but
you definitely also see the frontier of improvement, and you can measure
historically how much progress you've made. I think, for example, in the
roughly five years I was at Tesla: when I joined, it barely kept lane on the
highway. I think going up from Palo Alto to SF took like three or four
interventions. Any time the road would do anything geometrically, or turn
too much, it would just not work. Going from that to a pretty competent
system in five years, and seeing what happens under the hood, the scale at
which the team is operating now with respect to data and compute and
everything else: it's just massive progress.
So you're climbing a mountain, and it's foggy, but you're making a lot of
progress.
It's foggy, you're making progress, and you see what the next directions
are. You're looking at some of the remaining challenges, and they're not
perturbing you, they're not changing your philosophy, and you're not
contorting yourself. You're like: actually, these are the things that we
still need to do.
Yeah, the fundamental components of solving the problem seem to be there,
from the data engine, to the compute on the car, to the compute for the
training, all that kind of stuff.
So over the years you've been at Tesla, you've done a lot of amazing
breakthrough ideas and engineering, all of it, from the data engine to the
human side. Can you speak to why you chose to leave Tesla?
Basically, as I described, over time during those five years I got myself
into a bit of a managerial position. Most of my days were meetings and
growing the organization and making high-level strategic decisions about the
team and what it should be working on, and so on. It's like a corporate
executive role, and I can do it, I think I'm okay at it, but it's not
fundamentally what I enjoy. When I joined, there was no computer vision
team, because Tesla was just going through the transition from using
Mobileye, a third-party vendor, for all of its computer vision, to building
its own computer vision system. So when I showed up, there were two people
training deep neural networks on a computer at their desk, doing some kind
of basic classification task. And I grew that into what I think is a fairly
respectable deep learning team, a massive compute cluster, and a very good
data annotation organization. I was very happy with where that was. It
became quite autonomous, and so I stepped away. I'm very excited to do much
more technical things again, and to kind of refocus on AGI.
No, I'm just kidding. I mean, what was going through your mind? The human
lifetime is finite.
Yeah.
You did a few incredible things there. You're one of the best teachers of AI
in the world. You're one of the best, and I mean that in the best possible
way, one of the best tinkerers in the AI world. Meaning, understanding the
fundamentals of how something works by building it from scratch and playing
with the basic intuitions. It's like Einstein and Feynman were really good
at this kind of stuff: a small example of a thing to play with, to try to
understand it. And obviously now with Tesla, you helped build a team of
machine learning engineers and a system that actually accomplishes something
in the real world. So given all that, what was the soul-searching like?
Well, it was hard, because obviously I love the company a lot, and I love
Elon, I love Tesla. It was hard to leave. I love the team, basically. But
yeah, I think I would actually potentially be interested in revisiting it,
maybe coming back at some point, working on Optimus, working on AGI at
Tesla. I think Tesla is going to do incredible things. It's basically a
massive, large-scale robotics company with a ton of in-house talent for
doing really incredible things. And I think humanoid robots are going to be
amazing, I think autonomous transportation is going to be amazing, and all
of this is happening at Tesla. So I think it's just a really amazing
organization, and being part of it and helping it along, I enjoyed that a
lot. It was basically difficult for those reasons, because I love the
company. But I'm happy to potentially, at some point, come back for Act 2. I
just felt like at this stage, I had built the team, it felt autonomous, I
had become a manager, and I wanted to do a lot more technical stuff. I
wanted to learn stuff, I wanted to teach stuff, and I felt like it was a
good time for a change of pace a little bit.
What do you think is the best movie sequel of all time, speaking of part
two? Because most of them suck.
Movie sequels?
Movie sequels, yeah. And you tweet about movies. So just as a tiny tangent,
what's a favorite movie sequel? Godfather Part II? Are you a fan of The
Godfather? Because you didn't even tweet about or mention The Godfather.
Yeah, I don't love that movie. I know it has a huge following.
We're going to edit that out. We're going to edit out the hate towards The
Godfather. How dare you disrespect...
I think I will make a strong statement. I don't know why, but I basically
don't like any movie before 1995, something like that.
Didn't you mention Terminator 2?
Okay, okay. Terminator 2 was a little bit later, 1990.
No, I think Terminator 2 was in the 80s.
And I like Terminator 1 as well. So, okay, a few exceptions, but by and
large, for some reason, I don't like movies before 1995 or something. They
feel very slow. The camera is zoomed out. It's boring. It's kind of naive,
it's kind of weird.
And also, Terminator was very much ahead of its time.
Yes. And The Godfather, there's no AGI.
I mean, but Good Will Hunting was one of the movies you mentioned, and that
doesn't have any AGI either. I guess it has mathematics.
Yeah, I guess occasionally I do enjoy movies that don't feature...
Or like Anchorman. That's...
Anchorman is so good.
I don't understand...
Speaking of AGI, I don't understand why Will Ferrell is so funny. It doesn't
make sense. It doesn't compute. There's just something about him. And he's a
singular human, because you don't get that many comedies these days, and I
wonder if it has to do with the culture or the machine of Hollywood, or
whether we just got lucky with certain people in comedy. It came together
because he is a singular human.
Yeah, I like his movies.
That was a ridiculous tangent. I apologize. But you mentioned humanoid
robots, so what do you think about Optimus, about Tesla Bot? Do you think
we'll have robots in the factory and in the home in 10, 20, 30, 40, 50
years?
Yeah, I think it's a very hard project. I think it's going to take a while.
But who else is going to build humanoid robots at scale? And I think it's a
very good form factor to go after, because, like I mentioned, the world is
designed for the humanoid form factor. These things would be able to operate
our machines, they would be able to sit down in chairs, potentially even
drive cars. Basically, the world is designed for humans: that's the form
factor you want to invest into and make work over time. I think there's
another school of thought, which is: okay, pick a problem and design a robot
for it. But actually designing a robot, and getting a whole data engine and
everything behind it to work, is an incredibly hard problem. So it makes
sense to go after general interfaces that, okay, are not perfect for any one
given task, but actually have generality: just with an English prompt, they
are able to do something across tasks. So I think it makes a lot of sense to
go after a general interface in the physical world. And I think it's a very
difficult project, it's going to take time, but I see no other company that
can execute on that vision. I think it's going to be amazing. Basically,
physical labor: if you think transportation is a large market, try physical
labor. It's insane.
But it's not just physical labor. To me, the thing that's also exciting is
social robotics, the relationships we'll have, on different levels, with
those robots. That's why I was really excited to see Optimus. People have
criticized me for that excitement, but I've worked with a lot of research
labs that do humanoid, legged robots: Boston Dynamics, Unitree, there are a
lot of companies that do legged robots. But the elegance of the movement is
a tiny, tiny part of the big picture. There are two big exciting things to
me about Tesla doing humanoid, or any legged, robots. The first is
integrating into the data engine: the actual intelligence for the perception
and the control and the planning and all that kind of stuff, and integrating
into the fleet that you mentioned. And then, speaking of fleets, the second
thing is the mass manufacturing: just knowing, culturally, how to drive
towards a simple robot that's cheap to produce at scale, and doing that
well, having the experience to do that well. That changes everything. That's
a very different culture and style than Boston Dynamics, whose robots, by
the way, are remarkable just in the way they move. It'll be a very long time
before Tesla can achieve that smoothness of movement, but that's not what
it's about. It's about the entirety of the system, like we talked about: the
data engine and the fleet. That's super exciting, even with the initial
models. But that, too, was really surprising, that in a few months you can
get a prototype.
The reason that happened very quickly is, as you alluded to, there's a ton of copy-paste from what's happening on Autopilot. A lot. The amount of expertise that came out of the woodwork at Tesla for building the humanoid robot was incredible to see. Basically, Elon said at one point, we're doing this. And then the next day, basically, all these CAD models started to appear. People talk about the supply chain and manufacturing. People showed up with screwdrivers and everything the next day and started to put together the body. I was like, whoa. All these people exist at Tesla. Fundamentally, building a car is actually not that different from building a robot. That is true not just for the hardware pieces. And let's not forget hardware, not just for a demo, but manufacturing of that hardware at scale. That's a whole different thing. But it's true for software as well. Basically, this robot currently thinks it's a car. It's going to have a midlife crisis at some point. It thinks it's a car. Some of the earlier demos, actually, we were talking about potentially doing them outside in the parking lot, because that's where all of the computer vision was working out of the box, instead of inside. All the operating system, everything just copy-pastes. Computer vision mostly copy-pastes. You have to retrain the neural nets, but the approach and everything, the data engine and offline trackers and the way we go about the occupancy tracker and so on, everything copy-pastes. You just need to retrain the neural nets. Then the planning control, of course, has to change quite a bit. But there's a ton of copy-paste from what's happening at Tesla. If you were to go with the goal of, okay, let's build a million humanoid robots, and you're not Tesla, that's a lot to ask. If you're Tesla, it's actually not that crazy. Yes. Then the follow-up question is, just like with driving, how difficult is the manipulation task, such that it can have an impact at scale? I think, depending on the context, the really nice thing about robotics is that, unless you're doing manufacturing and that kind of stuff, there's more room for error. Driving is so safety critical, and also time critical. A robot is allowed to move slower, which is nice. Yes. I think it's going to take a long time, but the way you want to structure the development is, you need to say, okay, it's going to take a long time. How can I set up the product development roadmap so that I'm making revenue along the way? I'm not setting myself up for a zero-one loss function where it doesn't work until it works. You don't want to be in that position. You want to make it useful almost immediately, and then you want to slowly deploy it at scale. And you want to set up your data engine, your improvement loops, the telemetry, the evaluation, the harness, and everything. And you want to improve the product over time incrementally while you're making revenue along the way. That's extremely important, because otherwise these large undertakings just don't make sense economically. And also, from the point of view of the team working on it, they need the dopamine along the way. You can't just make a promise about this being useful, that this is going to change the world in 10 years when it works. This is not where you want to be. You want to be in a place like I think Autopilot is today, where it's offering increased safety and convenience of driving today. People pay for it. People like it.
People will purchase it. And then you also have the greater mission that you're working towards. And you see that. So the dopamine for the team, that was a source of happiness and satisfaction. Yes, 100%. You're deploying this. People like it. People drive it. People pay for it. They care about it. There's all these YouTube videos. Your grandma drives it. She gives you feedback. People like it. People engage with it. You engage with it. Huge. Do people that drive Teslas recognize you and give you love? Like, hey, thanks for this nice feature that it's doing. Yeah, I think the tricky thing is, some people really love you. Some people, unfortunately, you're working on something that you think is extremely valuable, useful, et cetera, and some people do hate you. There's a lot of people who hate me and the team and the whole project, and I think in many cases they're not actually Tesla drivers. Yeah, that actually makes me sad about humans, or the current ways that humans interact. I think that's actually fixable. I think humans want to be good to each other. I think Twitter and social media is part of the mechanism that somehow makes the negativity more viral, that disproportionately adds a viral boost to negativity it doesn't deserve. But I wish people would suppress some of the jealousy, some of the ego, and just get excited for others. And then there's a karma aspect to that. You get excited for others, they'll get excited for you. Same thing in academia. If you're not careful, there is a dynamical system there. If you think in silos and get jealous of somebody else being successful, that actually, perhaps counterintuitively, leads to less productivity for you as a community and for you individually. I feel like if you keep celebrating others, that actually makes you more successful. Yeah. I think people, depending on the industry, haven't quite learned that yet. Some people are also very negative and very vocal, so they're very prominently featured, but actually there's a ton of people who are cheerleaders, but they're silent cheerleaders. And when you talk to people just in the world, they will tell you, it's amazing, it's great. Especially people who understand how difficult it is to get this stuff working. People who have built products, makers, entrepreneurs. Making this work and changing something is incredibly hard. Those people are more likely to cheerlead you. Well, one of the things that makes me sad is some folks in the robotics community don't do the cheerleading, and they should, because they know how difficult it is. Well, they actually sometimes don't know how difficult it is to create a product at scale and actually deploy it in the real world. A lot of the development of robots and AI systems is done on very specific small benchmarks, as opposed to real-world conditions. Yes. Yeah, I think it's really hard to work on robotics in an academic setting. Or AI systems that apply in the real world. You flourished with and loved, for a time, the famed ImageNet data set, and you've recently had some words of criticism that the academic research ML community gives a little too much love still to ImageNet or those kinds of benchmarks. Can you speak to the strengths and weaknesses of data sets used in machine learning research? Actually, I don't know that I recall a specific instance where I was unhappy or criticizing ImageNet.
I think ImageNet has been extremely valuable. It was basically a benchmark that allowed the deep learning community to demonstrate that deep neural networks actually work. There's a massive value in that. I think ImageNet was useful, but basically it's become a bit of an MNIST at this point. MNIST is like little 28 by 28 grayscale digits. It's a joke data set that everyone just crushes. There's still papers written on MNIST, though, right? Maybe they shouldn't. Strong papers. Like papers that focus on how do we learn with a small amount of data, that kind of stuff. Yeah, I could see that being helpful, but not in mainline computer vision research anymore, of course. I think I've heard you say somewhere, maybe I'm just imagining things, but I think you said ImageNet was a huge contribution to the community for a long time, and now it's time to move past those kinds of... Well, ImageNet has been crushed. The error rates are... Yeah, we're getting like 90% accuracy in 1,000-way classification prediction, and I've seen those images, and that's really high. That's really good. If I remember correctly, the top-5 error rate is now like 1% or something. Given your experience with a gigantic real-world data set, would you like to see benchmarks move in certain directions that the research community uses? Unfortunately, I don't think academics currently have the next ImageNet. We've crushed MNIST. We've basically crushed ImageNet, and there's no next big benchmark that the entire community rallies behind and uses for further development of these networks. Yeah. What does it take for a data set to captivate the imagination of everybody, where they all get behind it? It might also need a leader, right? Yeah. Somebody with popularity. Yeah. Why did ImageNet take off? Or is it just the accident of history? It was the right amount of difficult. It was the right amount of difficult and simple and interesting enough. It was just the right time for that kind of a data set. Question from Reddit. What are your thoughts on the role that synthetic data and game engines will play in the future of neural net model development? I think as neural nets converge to humans, the value of simulation to neural nets will be similar to the value of simulation to humans. So people use simulation because they can learn something in that kind of a system without having to actually experience it. But are you referring to the simulation we do in our head? No, sorry, simulation. I mean like video games or other forms of simulation for various professionals. So let me push back on that, because maybe there's simulation that we do in our heads. Like, simulate, if I do this, what do I think will happen? Okay, that's like internal simulation. Yeah, internal. Isn't that what we're doing, as we simulate before we act? Oh yeah, but that's independent from the use of simulation in the sense of computer games, or using simulation for training set creation, or- Is it independent, or is it just loosely correlated? Because isn't it useful to do counterfactual or edge-case simulation? Like, you know, what happens if there's a nuclear war? What happens if there's, you know, those kinds of things? Yeah, that's a different simulation from Unreal Engine. That's how I interpreted the question. Ah, so like simulation of the average case. What's Unreal Engine? What do you mean by Unreal Engine?
So simulating a world, the physics of that world, why is that different? Because you can also add behavior to that world, and you could try all kinds of stuff, right? You could throw all kinds of weird things into it. Unreal Engine is not just about simulating... I mean, I guess it is about simulating the physics of the world. It's also doing something with that. Yeah. The graphics, the physics, and the agents that you put into the environment and stuff like that. Yeah. See, I feel like you said that it's not that important, I guess, for the future of AI development. Is that correct to interpret it that way? I think humans use simulators and they find them useful, and so computers will use simulators and find them useful. Okay. So you're saying it's not that... I don't use simulators very often. I play a video game every once in a while, but I don't think I derive any wisdom about my own existence from those video games. It's a momentary escape from reality, versus a source of wisdom about reality. So I think that's a very polite way of saying simulation is not that useful. Yeah, maybe not. I don't see it as a fundamental, really important part of training neural nets currently. But I think as neural nets become more and more powerful, you will need fewer examples to train additional behaviors. And with simulation, of course, there's a domain gap: a simulation is not the real world, it's slightly something different. But with a powerful enough neural net, the domain gap can be bigger, I think, because the neural net will sort of understand that even though it's not the real world, it still has all this high-level structure that it's supposed to be learning from. So the neural net will actually, yeah, it will be able to leverage the synthetic data better by closing the gap, by understanding in which ways this is not real data. Exactly. Right, I'll do better questions next time. That was a question, but I'm just kidding. All right. So is it possible, do you think, speaking of MNIST, to construct neural nets and training processes that require very little data? So we've been talking about huge data sets like the internet for training. I mean, one way to say that is, like you said, the querying itself is another level of training, I guess, and that requires little data. But do you see any value in doing research, in going down the direction of, can we use very little data to train, to construct a knowledge base? 100%. I just think at some point you need a massive data set. And then when you pre-train your massive neural net and get something that is like a GPT or something, then you're able to be very efficient at training any arbitrary new task. So with a lot of these GPTs, you can do tasks like sentiment analysis or translation or so on just by being prompted with very few examples. Here's the kind of thing I want you to do. Here's an input sentence, here's the translation into German. Input sentence, translation to German. Input sentence, blank, and the neural net will complete the translation to German just by looking at the examples you've provided. And so that's an example of few-shot learning in the activations of the neural net, instead of the weights of the neural net. And so I think basically, just like humans, neural nets will become very data efficient at learning any other new task. But at some point you need a massive data set to pre-train your network.
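To make the few-shot prompting pattern just described concrete, here is a minimal Python sketch. The prompt layout mirrors the translation example from the conversation; the commented-out `complete` call is a hypothetical stand-in for whatever text-completion model you have access to, not a real API.

```python
# Minimal sketch of few-shot prompting for translation, per the discussion above.
# `complete` is a hypothetical placeholder for a text-completion model call.

def build_few_shot_prompt(examples, query):
    # Each (input, output) pair becomes one demonstration; the final
    # input is left blank for the model to fill in-context.
    lines = ["Translate English to German."]
    for en, de in examples:
        lines.append(f"English: {en}\nGerman: {de}")
    lines.append(f"English: {query}\nGerman:")
    return "\n\n".join(lines)

examples = [
    ("The weather is nice today.", "Das Wetter ist heute schön."),
    ("Where is the train station?", "Wo ist der Bahnhof?"),
]
prompt = build_few_shot_prompt(examples, "I would like a coffee.")
print(prompt)
# print(complete(prompt))  # the model continues the pattern, learning in
#                          # activations rather than in weights
```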
To get that, and probably we humans have something like that. Do we have something like that? Do we have a passive, in-the-background, model-constructing thing that just runs all the time in a self-supervised way, that we're not conscious of? I think humans definitely, I mean, obviously we learn a lot during our lifespan, but also we have a ton of hardware that helps us at initialization, coming from evolution. And so I think that's also a really big component. A lot of people in the field, I think, just talk about the amount of seconds that a person has lived, pretending that this is a tabula rasa, sort of like a zero initialization of a neural net. And it's not. You can look at a lot of animals, like for example zebras. Zebras get born, and they see, and they can run. There's zero training data in their lifespan. They can just do that. So somehow, I have no idea how, evolution has found a way to encode these algorithms and these neural net initializations that are extremely good into ATCGs. And I have no idea how this works, but apparently it's possible, because here's a proof by existence. There's something magical about going from a single cell to an organism that is born, to the first few years of life. I kind of like the idea that the reason we don't remember anything about the first few years of our life is that it's a really painful process. Like it's a very difficult, challenging training process, intellectually. And maybe, yeah, I mean, why don't we remember any of that? There might be some crazy training going on, and maybe that's the background model training that is very painful. And so it's best for the system, once it's trained, not to remember how it's constructed. I think it's just that the hardware for long-term memory is not fully developed. I kind of feel like the first few years of infants is not actually learning, it's the brain maturing. We're born premature. There's a theory along those lines, because of the birth canal and the size of the brain. And so we're born premature, and then in the first few years the brain is maturing, and then there's some learning eventually. That's my current view on it. What do you think, do you think neural nets can have long-term memory in a way that approaches something like humans'? Does there need to be another meta-architecture on top of it, to add something like a knowledge base that learns facts about the world and all that kind of stuff? Yes, but I don't know to what extent it will be explicitly constructed. It might take unintuitive forms where you are telling the GPT, hey, you have a declarative memory bank to which you can store and retrieve data from. And whenever you encounter some information that you find useful, just save it to your memory bank. And here's an example of something you have retrieved, and how you save it, and here's how you load from it. You just say load, whatever. You teach it in text, in English, and then it might learn to use a memory bank from that. Oh, so the neural net is the architecture for the background model, the base thing, and then everything else is just on top of it. That's pretty easy to do. It's not just text, right? You're giving it gadgets and gizmos. So you're teaching it some kind of a special language by which it can save arbitrary information and retrieve it at a later time. And you're telling it about these special tokens and how to arrange them to use these interfaces.
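As a toy illustration of the memory-bank idea just described (and of the calculator example that follows), here is a minimal Python sketch. A wrapper intercepts special commands the model has been taught, in plain English, to emit; the `SAVE:`/`LOAD:` command names are invented for illustration and are not any real system's interface.

```python
# Toy sketch of a declarative memory bank driven by special text commands.
# The command vocabulary here is invented purely for illustration.

memory_bank = {}

def handle_model_output(text):
    # Intercept the special commands; pass ordinary generations through.
    if text.startswith("SAVE:"):
        key, _, value = text[len("SAVE:"):].partition("=")
        memory_bank[key.strip()] = value.strip()
        return "OK, saved."
    if text.startswith("LOAD:"):
        return memory_bank.get(text[len("LOAD:"):].strip(), "(nothing stored)")
    return text

# The surrounding loop would feed these results back into the prompt,
# so the model can use what it retrieved on its next step.
print(handle_model_output("SAVE: favorite_editor = VS Code"))
print(handle_model_output("LOAD: favorite_editor"))  # -> VS Code
```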
It's like, hey, you can use a calculator. Here's how you use it. Just do 53 plus 41 equals. And when equals is there, a calculator will actually read out the answer, and you don't have to calculate it yourself. And you just tell it in English. This might actually work. Do you think, in that sense, Gato is interesting, the DeepMind system? That it's not just language, but it actually throws it all into the same pile: images, actions, all that kind of stuff. That's basically what we're moving towards. Yeah, I think so. So Gato is very much a kitchen-sink approach to reinforcement learning in lots of different environments, with a single fixed transformer model, right? I think it's a very early result in that realm, but I think, yeah, it's along the lines of what I think things will eventually look like. So this is the early days of a system that eventually will look like this, from a Rich Sutton perspective. Yeah. I'm not a super huge fan of, I think, all these interfaces that look very different. I would want everything to be normalized into the same API. So for example, screen pixels, the very same API. Instead of having different world environments that have very different physics and joint configurations and appearances and whatever, and having some kind of special tokens for different games that you can plug in, I'd rather just normalize everything to a single interface, so it looks the same to the neural net, if that makes sense. So it's all going to be pixel-based Pong in the end. I think so. Okay. Let me ask you about your own personal life. A lot of people want to know. You're one of the most productive and brilliant people in the history of AI. What does a productive day in the life of Andrej Karpathy look like? What time do you wake up? Because I imagine some kind of dance between the average productive day and a perfect productive day. The perfect productive day is the thing we strive towards, and the average is what it converges to, given all the mistakes and human eventualities and so on. So what time do you wake up? Are you a morning person? I'm not a morning person. I'm a night owl for sure. Is it stable or not? It's semi-stable, like eight or nine or something like that. During my PhD it was even later. I used to go to sleep usually at 3 a.m. I think the a.m. hours are precious and a very interesting time to work, because everyone is asleep. At 8 a.m. or 7 a.m., the East Coast is awake, so there's already activity, there's already some text messages, whatever, there's stuff happening. You can go on some news website and there's stuff happening. It's distracting. At 3 a.m., everything is totally quiet, and so you're not going to be bothered, and you have solid chunks of time to do work. So I like those periods. Night owl by default. And then for productive time, basically, what I like to do is, you need to build some momentum on the problem without too much distraction. And you need to load your RAM, your working memory, with that problem. And then you need to be obsessed with it when you're taking a shower, when you're falling asleep. You need to be obsessed with the problem, so it's fully in your memory, and you're ready to wake up and work on it right there. So is this on a temporal scale of a single day, or a couple of days, a week, a month? So I can't talk about one day, basically, in isolation, because it's a whole process.
When I want to get productive on a problem, I feel like I need a span of a few days where I can really get into that problem, and I don't want to be interrupted, and I'm going to just be completely obsessed with that problem. And that's where I do most of my good work. You've done a bunch of cool little projects in a very short amount of time, very quickly. So that requires you just focusing on it. Yeah, basically, I need to load my working memory with the problem, and I need to be productive, because there's always a huge fixed cost to approaching any problem. I was struggling with this, for example, at Tesla, because I wanted to work on a small side project. But okay, you first need to figure out, okay, I need to SSH into my cluster. I need to bring up a VS Code editor so I can work on this. I run into some stupid error for some reason. You're not at a point where you can just be productive right away. You are facing barriers. And so it's about really removing all of those barriers, so you're able to go into the problem and you have the full problem loaded in your memory. And somehow avoiding distractions of all different forms, like news stories, emails, but also distractions from other interesting projects that you previously worked on or are currently working on, and so on. You just want to really focus your mind. And I mean, I can take some time off for distractions in between, but I think it can't be too much. Most of your day is sort of spent on that problem. And then I drink coffee, I have my morning routine, I look at some news, Twitter, Hacker News, Wall Street Journal, et cetera. It's great. So basically, you wake up, you have some coffee. Are you trying to get to work as quickly as possible? Or are you taking in this diet of what the hell is happening in the world first? I do find it interesting to know about the world. I don't know that it's useful or good, but it is part of my routine right now. So I do read through a bunch of news articles, and I want to be informed. And I'm suspicious of it. I'm suspicious of the practice, but currently that's where I am. Oh, you mean suspicious about the positive effect of that practice on your productivity and your wellbeing? My wellbeing, psychologically, yeah. And also on your ability to deeply understand the world, because there's a bunch of sources of information you're not really focused on deeply integrating. Yeah, it's a little distracting. In terms of a perfectly productive day, for how long a stretch of time in one session do you try to work and focus on a thing? A couple of hours? Is it one hour, is it 30 minutes, is it 10 minutes? I can probably go a small few hours, and then I need some breaks in between, for food and stuff. Yeah, but I think it's still really hard to accumulate hours. I was using a tracker that told me exactly how much time I spent coding on any one day. And even on a very productive day, I still spent only like six or eight hours. And it's just because there's so much padding: commute, talking to people, food, et cetera. There's the cost of life. Just living and sustaining and homeostasis, just maintaining yourself as a human, is very high. And there seems to be a desire within the human mind to participate in society that creates that padding. Because the most productive days I've ever had were, just completely from start to finish, tuning out everything and just sitting there. And then you can do more than six and eight hours.
Is there some wisdom about what gives you strength to do tough days of long focus? Yeah, just whenever I get obsessed about a problem, something just needs to work, something just needs to exist. It needs to exist. So you're able to deal with bugs and programming issues and technical issues and design decisions that turn out to be the wrong ones. You're able to think through all of that, given that you want the thing to exist. Yeah, it needs to exist. And then I think to me also a big factor is, are other humans going to appreciate it? Are they going to like it? That's a big part of my motivation. If I'm helping humans and they seem happy, they say nice things, they tweet about it or whatever, that gives me pleasure, because I'm doing something useful. So you do see yourself sharing it with the world, whether it's on GitHub or through a blog post or through videos. Yeah, I was thinking about it. Suppose I did all these things but did not share them. I don't think I would have the same amount of motivation that I can build up. You enjoy the feeling of other people gaining value and happiness from the stuff you've created. Yeah. What about diet? I saw you played with intermittent fasting. Do you fast? Does that help? I played with everything. Of the things you've played with, what's been most beneficial to your ability to mentally focus on a thing, and just mental productivity and happiness? You still fast? Yeah, I still fast, but I do intermittent fasting. But really what it means at the end of the day is I skip breakfast. So I do 18:6 roughly by default when I'm in my steady state. If I'm traveling or doing something else, I will break the rules. But in my steady state, I do 18:6. So I eat only from 12 to 6. Not a hard rule, and I break it often, but that's my default. And then, yeah, I've done a bunch of random experiments. For the most part right now, where I've been for the last year and a half, I want to say, is I'm plant-based or plant-forward. I heard plant-forward. It sounds better. What does that mean, exactly? I don't actually know what the difference is, but it sounds better in my mind. But it just means I prefer plant-based food. Raw or cooked? I prefer cooked and plant-based. So plant-based, forgive me, I don't actually know how wide the category of plants is. Well, plant-based just means that you're not militant about it, and you can flex. You just prefer to eat plants, and you're not trying to influence other people. And if you come to someone's house party and they serve you a steak that they're really proud of, you will eat it. That's beautiful. I'm on the flip side of that, but I'm very sort of flexible. Have you tried doing one meal a day? I have accidentally, not consistently, but I've accidentally had that. I don't like it. I think it makes me feel not good. It's too much of a hit. Yeah. And so currently I have about two meals a day, 12 and 6. I do that nonstop. I'm doing it now. I do one meal a day. It's interesting. It's an interesting feeling. Have you ever fasted longer than a day? Yeah, I've done a bunch of water fasts, because I was curious what happens. Anything interesting? Yeah, I would say so. I mean, what's interesting is that you're hungry for two days, and then starting day three or so, you're not hungry. It's such a weird feeling, because you haven't eaten in a few days and you're not hungry. Isn't that weird? It's really weird. One of the many weird things about human biology: it figures something out.
It finds another source of energy or something like that, or relaxes the system. I don't know how that works. The body is like, you're hungry, you're hungry. And then it just gives up. It's like, okay, I guess we're fasting now. There's nothing. And then it just focuses on trying to make you not hungry, and not feel the damage of that, and trying to give you some space to figure out the food situation. Are you still, to this day, most productive at night? I would say I am, but it is really hard to maintain my PhD schedule, especially when I was working at Tesla and so on. It's a non-starter. But even now, people want to meet for various events. Society lives in a certain period of time, and you sort of have to work with that. It's hard to do a social thing and then, after that, return and do work. Yeah, it's just really hard. That's why, when I do social things, I try not to do too much drinking, so I can return and continue doing work. But at Tesla, or any company, is there a convergence towards a schedule? Or is there more... Is that how humans behave when they collaborate? I need to learn about this. Do they try to keep a consistent schedule where you're all awake at the same time? I do try to create a routine, and I try to create a steady state in which I'm comfortable. I have a morning routine, I have a day routine, I try to keep things to a steady state, and things are predictable, and then your body just sticks to that. And if you try to stress that a little too much, like when you're traveling and you're dealing with jet lag, you're not able to really ascend to where you need to go. Yeah. That's what you're doing with humans, with the habits and stuff. What are your thoughts on work-life balance throughout a human lifetime? So Tesla in part was known for pushing people to their limits, in terms of what they're able to do, in terms of what they're trying to do, in terms of how much they work, all that kind of stuff. Yeah, I will say Tesla gets a little too much of a bad rap for this, because what's happening is, Tesla is a bursty environment. So I would say the baseline, my only point of reference, is Google, where I've interned three times, and I saw what it's like inside Google and DeepMind. I would say the baseline is higher than that, but then there's a punctuated equilibrium, where once in a while there's a fire and people work really hard. And so it's spiky and bursty, and then all the stories get collected about the bursts, and then it gives the appearance of total insanity. But actually it's just a bit more intense environment, and there are fires and sprints. And so I think definitely, though, I would say it's a more intense environment than something you would get. But in your personal life, forget all of that, just in your own personal life, what do you think about the happiness of a human being? A brilliant person like yourself. About finding a balance between work and life. Or is that not a good thought experiment? Yeah, I think balance is good, but I also love to have sprints that are out of distribution. And that's when, I think, I've been pretty creative as well. Sprints out of distribution means that most of the time you have a, quote unquote, balance. I have balance most of the time. I like being obsessed with something once in a while. Once in a while is what? Once a week, once a month, once a year? Yeah, probably like once a month or something.
And that's when we get a new GitHub repo from Andrej. Yeah, that's when you really care about a problem. It must exist. This will be awesome. You're obsessed with it. But you can't just do it on that day. You need to pay the fixed cost of getting into the groove, and then you need to stay there for a while, and then society will come and they will try to mess with you and they will try to distract you. Yeah. The worst thing is a person who's like, I just need five minutes of your time. Yeah. The cost of that is not five minutes, and society needs to change how it thinks about it. Just five minutes of your time. Right. It's never just one minute. Just 30 seconds. Just a quick thing. What's the big deal? Why are you being so... Yeah, no. What's your computer setup? What's like the perfect... Are you somebody that's flexible no matter what? Laptop, four screens? Or do you prefer a certain setup that you're most productive in? I guess the one that I'm familiar with is one large screen, 27 inch, and my laptop on the side. What operating system? I do Macs. That's my primary. For all tasks? I would say OS X, but when you're working on deep learning, everything is Linux. You're SSH'd into a cluster and you're working remotely. But what about the actual development? Like, are you using an IDE? I think a good way is, you just run VS Code, my favorite editor right now, on your Mac, but you have a remote folder through SSH. The actual files that you're manipulating are on the cluster somewhere else. What's the best IDE? VS Code. What else do people... I use Emacs still. That's cool. It may be cool. I don't know if it's maximum productivity. What do you recommend in terms of editors? You've worked with a lot of software engineers. Editors for Python, C++, machine learning applications. I think the current answer is VS Code. Currently, I believe that's the best IDE. It's got a huge amount of extensions. It has GitHub Copilot integration, which I think is very valuable. What do you think about the Copilot integration? I got to talk a bunch with Guido van Rossum, who's the creator of Python, and he loves Copilot. He programs a lot with it. Do you? Yeah, I use Copilot. I love it. It's free for me, but I would pay for it. Yeah, I think it's very good. The utility that I found with it was, I would say there's a learning curve, and you need to figure out when it's helpful and when to pay attention to its outputs, and when it's not going to be helpful and you should not pay attention to it. Because if you're just reading its suggestions all the time, it's not a good way of interacting with it. But I think I was able to mold myself to it. I find it very helpful, number one, for copy-paste-and-replace kinds of edits. When the pattern is clear, it's really good at completing the pattern. And number two, sometimes it suggests APIs that I'm not aware of. It tells you about something that you didn't know, and that's an opportunity to discover it and use it again. I would never take Copilot code as given. I almost always copy-paste it into a Google search, and you see what this function is doing, and then you're like, oh, it's actually exactly what I need. Thank you, Copilot. So you learn something. It's in part a search engine, part maybe getting the exact syntax correct. It's that NP-hard thing: once you see it, you know it's correct, but you yourself struggle.
You can verify efficiently, but you can't generate efficiently. And Copilot really, I mean, it's Autopilot for programming, right? And currently it's doing the lane following, which is like the simple copy-paste and sometimes suggest. But over time, it's going to become more and more autonomous. And so the same thing will play out, not just in coding, but actually across many, many different things, probably. Coding is an important one, right? Like writing programs. How do you see the future of that developing? The program synthesis, being able to write programs that are more and more complicated. Because right now it's human supervised in interesting ways. It feels like the transition will be very painful. My mental model for it is that the same thing will happen as with Autopilot. So currently it's doing lane following, it's doing some simple stuff, and eventually it'll be doing autonomy, and people will have to intervene less and less. And there could be testing mechanisms. Like, if it writes a function and that function looks pretty damn correct, but how do you know it's correct? Because you're getting lazier and lazier as a programmer. Like your ability to catch little bugs... But I guess it won't make little mistakes. No, it will. Copilot will make subtle off-by-one bugs. It has done that to me. But do you think future systems will? Or is the off-by-one actually a fundamental challenge of programming? In that case, it wasn't fundamental, and I think things can improve. But yeah, I think humans have to supervise. I am nervous about people not supervising what comes out, and what happens to, for example, the proliferation of bugs in all of our systems. I'm nervous about that, but I think there will probably be some other copilots for bug finding and stuff like that at some point. Because there'll be a lot more automation for... It's like a copilot that generates a compiler, one that does a linter, one that does a type checker. It's a committee of GPTs, sort of. And then there'll be a manager for the committee. And then there'll be somebody that says, a new version of this is needed, we need to regenerate it. Yeah. There were 10 GPTs. They did forward passes and gave 50 suggestions. Another one looked at them and picked a few that it liked. A bug one looked at them and was like, these are probably bugs. They got re-ranked by some other thing. And then a final ensemble GPT comes in and is like, okay, given everything you guys have told me, this is probably the next token. The feeling is, the number of programmers in the world has been growing and growing very quickly. Do you think it's possible that it'll actually level out and drop to a very low number with this kind of world? Because then you'll be doing software 2.0 programming, and you'll be doing this kind of generation-with-copilot-type-systems programming, but you won't be doing the old school software 1.0 programming. I don't currently think that they're just going to replace human programmers. I'm so hesitant saying stuff like this, right? Like, this is going to be replaced in five years. I don't know. It's going to show that this is what we thought. Because I agree with you, but I think we might be very surprised. What's your sense of where we stand with language models? Does it feel like the beginning or the middle or the end? The beginning, a hundred percent.
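Here is a toy Python sketch of the "committee of GPTs" idea floated above: generators propose candidates, a bug-finder filters, a ranker orders, and a final pick plays the ensemble. Every model call is stubbed with a trivial placeholder; none of this is a real API, just the shape of the pipeline.

```python
import random

def propose(prompt, n=5):
    # Stand-in for one generator model's n sampled completions.
    return [f"{prompt} candidate {random.randrange(1000)}" for _ in range(n)]

def looks_buggy(candidate):
    # Stand-in for a bug-finding model; here, a biased coin flip.
    return random.random() < 0.3

def rank(candidates):
    # Stand-in for a ranking model; here, plain sorted order.
    return sorted(candidates)

def committee_complete(prompt, num_generators=10):
    candidates = [c for _ in range(num_generators) for c in propose(prompt)]
    survivors = [c for c in candidates if not looks_buggy(c)]
    ranked = rank(survivors)
    return ranked[0] if ranked else None  # the final "ensemble" pick

print(committee_complete("def add(a, b):"))
```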
I think the big question in my mind is, for sure GPT will be able to program quite well, competently and so on. How do you steer the system? You still have to provide some guidance to what you actually are looking for. And so how do you steer it? How do you talk to it? How do you audit it, and verify that what is done is correct? And how do you work with this? It's not just an AI problem, but as much a UI/UX problem. So beautiful fertile ground for so much interesting work for VS Code++, where it's not just human programming anymore. It's amazing. Yeah. So you're interacting with the system. So not just one prompt, but iterative prompting. You're trying to figure it out, having a conversation with the system. Yeah. That actually, I mean, to me, that's super exciting, to have a conversation with the program I'm writing. Yeah. Maybe at some point you're just conversing with it. It's like, okay, here's what I want to do. Actually, this variable... maybe it's not even as low level as a variable, but... You can also imagine, like, can you translate this to C++ and back to Python? Yeah, that already kind of exists in some form. No, but just doing it as part of the programming experience. Like, I think I'd like to write this function in C++. Or you just keep changing between different languages because of the different syntax. Maybe I want to convert this into a functional language. And so you get to become multilingual as a programmer and dance back and forth efficiently. Yeah. I mean, I think the UI/UX of it, though, is still very hard to think through, because it's not just about writing code on a page. You have an entire developer environment. You have a bunch of hardware on it. You have some environment variables. You have some scripts that are running in a cron job. There's a lot going on to working with computers, and how these systems set up environment flags and work across multiple machines and set up screen sessions and automate different processes, how all that works and is auditable by humans and so on, is a massive question. You've built arXiv Sanity. What is arXiv, and what is the future of academic research publishing that you would like to see? So arXiv is this preprint server. If you have a paper, you can submit it for publication to journals or conferences and then wait six months and then maybe get a decision, pass or fail. Or you can just upload it to arXiv, and then people can tweet about it three minutes later, and then everyone sees it, everyone reads it, and everyone can profit from it in their own way. So you can cite it, and it has an official look to it. It feels like a publication process. It feels different than if you just put it in a blog post. Oh yeah. Yeah, I mean, it's a paper, and usually the bar is higher for something that you would expect on arXiv, as opposed to something you would see in a blog post. Well, the culture created the bar, because you could probably post a pretty crappy paper on arXiv. Yes. So what does that make you feel about peer review? So rigorous peer review by two or three experts, versus the peer review of the community, right as it's written. Yeah. Basically, I think the community is very well able to peer review things very quickly on Twitter.
And I think maybe it just has something to do with the AI machine learning field specifically, though. I feel like things are more easily auditable, and the verification is easier, potentially, than the verification somewhere else. So it's kind of like, you can think of these scientific publications as little blockchains, where everyone's building on each other's work and citing each other. And you sort of have AI, which is kind of like this much faster and looser blockchain, where any one individual entry is very cheap to make. And then you have other fields where maybe that model doesn't make as much sense. And so I think in AI, at least, things are pretty easily verifiable. And so that's why, when people upload papers that have a really good idea and so on, people can try them out the next day, and they can be the final arbiter of whether it works or not on their problem. And the whole thing just moves significantly faster. So I kind of feel like academia still has a place. Sorry, this conference and journal process still has a place, but it sort of lags behind, I think. And it's maybe a bit higher quality process, but it's not the place where you will discover cutting-edge work anymore. Yeah. It used to be the case, when I was starting my PhD, that you go to conferences and journals and you discuss all the latest research. Now, when you go to a conference or journal, no one discusses anything that's there, because it's already three generations old and irrelevant. Yeah. Which makes me sad about, like, DeepMind, for example, where they still publish in Nature and these big prestigious venues. I mean, there's still value, I suppose, to the prestige that comes with these big venues. But the result is that they'll announce some breakthrough performance, and it will take like a year to actually publish the details. And those details, if they were published immediately, would inspire the community to move in certain directions. Yeah, it would speed up the rest of the community, but I don't know to what extent that's part of their objective function also. That's true. So it's not just the prestige; a little bit of the delay is part of it. Yeah, certainly. DeepMind specifically has been working in the regime of having a slightly higher quality process, and higher latency, and publishing those papers that way. Another question from Reddit. Do you, or have you, suffered from imposter syndrome? Being the director of AI at Tesla, being this person, when you're at Stanford, where the world looks at you as the expert in AI to teach the world about machine learning. When I was leaving Tesla after five years, I spent a ton of time in meeting rooms. You know, in the beginning, when I joined Tesla, I would read papers and I was writing code, and then I was writing less and less code and I was reading code, and then I was reading less and less code. And so this is just a natural progression that happens, I think. And definitely, I would say, near the tail end, that's when it starts to hit you a bit more: you're supposed to be an expert, but actually the source of truth is the code that people are writing, the GitHub and the actual code itself, and you're not as familiar with that as you used to be. And so I would say maybe there's some insecurity there.
Yeah, that's actually pretty profound, that a lot of the insecurity has to do with not writing the code. In the computer science space, that is the truth right there: the code is the source of truth. The papers and everything else are a high-level summary. Yeah, just a high-level summary. At the end of the day, you have to read code. It's impossible to translate all that code into paper form. So when things come out, especially when they have source code available, that's my favorite place to go. So like I said, you're one of the greatest teachers of machine learning and AI ever, from CS231n to today. What advice would you give to beginners interested in getting into machine learning? Beginners are often focused on what to do, and I think the focus should be more on how much you do. So I am kind of a believer, on a high level, in this 10,000 hours kind of concept, where you just have to pick the things where you can spend time, that you care about, that you're interested in. You literally have to put in 10,000 hours of work. It doesn't even matter as much where you put it. You'll iterate and you'll improve and you'll waste some time. I don't know if there's a better way. You need to put in 10,000 hours. But I think it's actually really nice, because I feel like there's some sense of determinism about being an expert at a thing if you spend 10,000 hours. You can literally pick an arbitrary thing, and I think if you spend 10,000 hours of deliberate effort and work, you actually will become an expert at it. And so I think it's kind of a nice thought. So basically, I would focus more on, are you spending 10,000 hours? That's what I'd focus on. And then thinking about what kind of mechanisms maximize your likelihood of getting to 10,000 hours. Which, for us silly humans, means probably forming a daily habit of, every single day, actually doing the thing. Whatever helps you. So I do think, to a large extent, it's a psychological problem for yourself. One other thing that I think is helpful for the psychology of it: many times people compare themselves to others in the area. I think this is very harmful. Only compare yourself to you from some time ago, say a year ago. Are you better than you were a year ago? This is the only way to think, and then you can see your progress, and it's very motivating. That's so interesting, that focus on the quantity of hours. Because I think a lot of people in the beginner stage, but actually throughout, get paralyzed by the choice. Like, which path do I pick, this one or that one? They'll literally get paralyzed by which IDE to use. Well, they're worried. Yeah, they worry about all these things. But the thing is, you will waste time doing something wrong. You will eventually figure out it's not right. You will accumulate scar tissue, and next time you'll grow stronger, because next time you'll have that scar tissue and you'll learn from it. And the next time you come to a similar situation, you'll be like, oh, I messed up. I've spent a lot of time working on things that never materialized into anything, and I have all that scar tissue, and I have some intuitions about what was useful, what wasn't useful, how things turned out. So all those mistakes were not dead work, you know?
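As a back-of-the-envelope on the 10,000-hour figure above, a few lines of Python make the daily-habit arithmetic concrete (the hours-per-day values are illustrative assumptions, not anything prescribed in the conversation):

```python
# Years needed to reach 10,000 hours at various daily commitments.
for hours_per_day in (1, 2, 4):
    years = 10_000 / (hours_per_day * 365)
    print(f"{hours_per_day} h/day -> about {years:.1f} years")
# 1 h/day -> about 27.4 years; 2 h/day -> about 13.7 years; 4 h/day -> about 6.8 years
```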
So I just think you should just focus on working. What have you done? What have you done last week? That's a good question, actually, to ask for a lot of things, not just machine learning. It's a good way to cut the, I forget what term we used, but the fluff, the blubber, whatever, the inefficiencies in life. What do you love about teaching? You seem to find yourself often drawn to teaching. You're very good at it, but you're also drawn to it. I mean, I don't think I love teaching. I love happy humans, and happy humans like when I teach. I wouldn't say I hate teaching. I tolerate teaching, but it's not the act of teaching that I like. It's that, you know, I have something I'm actually okay at. I'm okay at teaching, and people appreciate it a lot. And so I'm just happy to try to be helpful. And teaching itself is not like the most... I mean, it can be really annoying, frustrating. I was working on a bunch of lectures just now. I was reminded back to my days of CS231n, and just how much work it is to create some of these materials and make them good. The amount of iteration and thought, and you go down blind alleys, and just how much you change it. So creating something good in terms of educational value is really hard, and it's not fun. It's difficult. So people should definitely go watch the new stuff you put out. There are lectures where you're actually building the thing. Like you said, the code is truth. So discussing backpropagation by building it, by looking through it, and just the whole thing. So how difficult is that to prepare for? I think that's a really powerful way to teach. Did you have to prepare for that, or are you just live thinking through it? I will typically do, say, three takes, and then I take the better take. So I do multiple takes, and I take some of the better takes, and then I just build out a lecture that way. Sometimes I have to delete 30 minutes of content because it just went down an alley that I didn't like too much. There's a bunch of iteration, and it probably takes me somewhere around 10 hours to create one hour of content. To get one hour. It's interesting. I mean, is it difficult to go back to the basics? Do you draw a lot of wisdom from going back to the basics? Yeah, going back to backpropagation, loss functions, where they come from. And one thing I like about teaching a lot, honestly, is it definitely strengthens your understanding. So it's not a purely altruistic activity. It's a way to learn. If you have to explain something to someone, you realize you have gaps in knowledge. And so I even surprised myself in those lectures. Like, oh, the result will obviously look like this, and then the result doesn't look like it. And I'm like, okay, I thought I understood this. Yeah, but that's why it's really cool. It's literally code. You run it in the notebook and it gives you a result, and you're like, oh, wow. Yes. And like actual numbers, actual inputs, actual code. Yeah. It's not mathematical symbols, et cetera. The source of truth is the code. It's not slides. It's just like, let's build it. It's beautiful. You're a rare human in that sense. What advice would you give to researchers trying to develop and publish ideas that have a big impact in the world of AI? So maybe undergrads, maybe early graduate students. Yep.
I mean, I would say they definitely have to be a little bit more strategic than I had to be as a PhD student, because of the way AI is evolving. It's going the way of physics, where, you know, in physics you used to be able to do experiments on your benchtop and everything was great and you could make progress. And now you have to work at something like the LHC, at CERN. And so AI is going in that direction as well. So there are certain kinds of things that are just not possible to do on the benchtop anymore. And I think that didn't used to be the case at the time. Do you still think that there are GAN-type papers to be written, where a very simple idea, requiring just one computer, can illustrate a simple example? I mean, one example that's been very influential recently is diffusion models. Diffusion models are amazing. Diffusion models are six years old. For the longest time, people were kind of ignoring them, as far as I can tell. And they're an amazing generative model, especially in images. And so Stable Diffusion and so on, it's all diffusion based. Diffusion is new. It was not there before, and it came from, well, it came from Google, but a researcher could have come up with it. In fact, some of the first papers, actually, no, those came from Google as well. But a researcher could come up with that in an academic institution. Yeah. What do you find most fascinating about diffusion models? So from the societal impact to the technical architecture. What I like about diffusion is it works so well. Was that surprising to you? The amount of the variety, almost the novelty, of the synthetic data it's generating. Yeah. So the Stable Diffusion images are incredible. The speed of improvement in generating images has been insane. We went very quickly from generating tiny digits to tiny faces, and it all looked messed up, and now we have Stable Diffusion, and that happened very quickly. There's a lot that academia can still contribute. You know, for example, flash attention is a very efficient kernel for running the attention operation inside the transformer, and that came from an academic environment. It's a very clever way to structure the kernel so that it doesn't materialize the attention matrix. And so there's, I think, still lots of things to contribute, but you have to be just more strategic. Do you think neural networks can be made to reason? Yes. Do you think they already reason? Yes. What's your definition of reasoning? Information processing. So in the way that humans think through a problem and come up with novel ideas, it feels like reasoning. Yeah. So the novelty, I don't want to say, but out-of-distribution ideas, you think it's possible? Yes. And I think we're seeing that already in the current neural nets. You're able to remix the training set information into true generalization, in some sense, something that doesn't appear in a fundamental way in the training set. You're doing something interesting algorithmically. You're manipulating some symbols, and you're coming up with some correct, unique answer in a new setting. What would illustrate to you, holy shit, this thing is definitely thinking? To me, thinking or reasoning is just information processing and generalization, and I think the neural nets already do that today.
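To give a flavor of the flash-attention point above, here is a minimal numpy sketch of the online-softmax trick that lets attention be computed block by block, for a single query, without ever materializing the full vector of attention weights. This illustrates the idea only; the real thing is a fused GPU kernel, and the function here is written just for this example.

```python
import numpy as np

def online_softmax_attention(q, K, V, block=64):
    # Stream over K/V blocks, keeping a running max m, normalizer l, and
    # unnormalized output acc, so softmax(K q / sqrt(d)) @ V is computed
    # without storing all the attention weights at once.
    d = q.shape[-1]
    m, l = -np.inf, 0.0
    acc = np.zeros(V.shape[-1])
    for i in range(0, K.shape[0], block):
        s = K[i:i + block] @ q / np.sqrt(d)  # scores for this block
        m_new = max(m, s.max())
        scale = np.exp(m - m_new)            # rescale old accumulators
        p = np.exp(s - m_new)
        l = l * scale + p.sum()
        acc = acc * scale + p @ V[i:i + block]
        m = m_new
    return acc / l

# Sanity check against the naive computation.
rng = np.random.default_rng(0)
K, V, q = rng.normal(size=(256, 16)), rng.normal(size=(256, 16)), rng.normal(size=16)
s = K @ q / np.sqrt(16)
w = np.exp(s - s.max()); w /= w.sum()
assert np.allclose(online_softmax_attention(q, K, V), w @ V)
```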
So being able to perceive the world, or perceive whatever the inputs are, and to make predictions based on that, or actions based on that, that's reasoning. Yeah. You're giving correct answers in novel settings by manipulating information. You've learned the correct algorithm. You're not doing just some kind of a lookup table or nearest-neighbor search, something like that. Let me ask you about AGI. What are some moonshot ideas you think might make significant progress towards AGI? Or maybe another way to ask it is, what are the big blockers that we're missing now? So basically, I am fairly bullish on our ability to build AGIs, basically automated systems that we can interact with, that are very human-like, and we can interact with them in the digital realm or physical realm. Currently, it seems most of the models that do these magical tasks are in the text realm. I think, as I mentioned, I'm suspicious that the text realm is not enough to actually build full understanding of the world. I do actually think you need to go into pixels and understand the physical world and how it works. So I do think that we need to extend these models to consume images and videos, and train on a lot more data that is multimodal in that way. Do you think you need to touch the world to understand it also? Well, that's the big open question, I would say, in my mind. If you also require the embodiment and the ability to interact with the world, run experiments, and have data of that form, then you need to go to Optimus or something like that. And so I would say Optimus, in some way, is like a hedge on AGI, because it seems to me that it's possible that just having data from the internet is not enough. If that is the case, then Optimus may lead to AGI, because to me there's nothing beyond Optimus. You have this humanoid form factor that can actually do stuff in the world. You can have millions of them interacting with humans and so on. And if that doesn't give rise to AGI at some point, I'm not sure what will. So from a completeness perspective, I think that's a really good platform, but it's a much harder platform, because you are dealing with atoms, and you need to actually build these things and integrate them into society. So I think that path takes longer, but it's much more certain. And then there's the path of the internet, and just training these compression models effectively, trying to compress all of the internet. And that might also give rise to these agents as well. Compress the internet, but also interact with the internet. So it's not obvious to me. In fact, I suspect you can reach AGI without ever entering the physical world. Which is a little bit more concerning, because that might result in it happening faster. So it just feels like we're in boiling water. We won't know as it's happening. I would like to... I'm not afraid of AGI. I'm excited about it. There's always concerns, but I would like to know when it happens. Yeah. Or to have hints about when it happens, like, a year from now it will happen, that kind of thing. I just feel like in the digital realm, it just might happen. Yeah. I think all we have available to us, because no one has built AGI, again, is: is there enough fertile ground on the periphery? I would say yes.
And we have the progress so far, which has been very rapid, and there are next steps that are available. And so I would say, yeah, it's quite likely that we'll be interacting with digital entities.
How will you know that somebody has built AGI?
I think it's going to be a slow, incremental transition. It's going to be product-based and focused. It's going to be GitHub Copilot getting better, and then GPTs helping you write, and then these oracles that you can go to with mathematical problems. I think we're on the verge of being able to ask very complex questions in chemistry, physics, math of these oracles, and have them complete solutions.
So AGI, to you, is primarily focused on intelligence. So consciousness doesn't enter into it.
So in my mind, consciousness is not a special thing you will figure out and bolt on. I think it's an emergent phenomenon of a large enough and complex enough generative model, sort of. So if you have a complex enough world model that understands the world, then it also understands its predicament in the world as being a language model, which to me is a form of consciousness or self-awareness. And so in order to understand the world deeply, you probably have to integrate yourself into the world. And in order to interact with humans and other living beings, consciousness is a very useful tool.
I think consciousness is like a modeling insight.
A modeling insight?
Yeah. You have a powerful enough model of understanding the world that you actually understand that you are an entity in it.
Yeah. But there's also, perhaps, just the narrative we tell ourselves. It feels like something to experience the world, the hard problem of consciousness. But that could be just a narrative that we tell ourselves.
Yeah, I think it will emerge. I think it's going to be something very boring. Like, we'll be talking to these digital AIs, they will claim they're conscious, they will appear conscious, they will do all the things that you would expect of other humans, and it's going to just be a stalemate.
I think there will be a lot of actually fascinating ethical questions, like Supreme Court-level questions, of whether you're allowed to turn off a conscious AI, whether you're allowed to build a conscious AI. Maybe there would have to be the same kind of debate that you have around, sorry to bring up a political topic, but, you know, abortion. The deeper question with abortion is: what is life? And the deep question with AI is also: what is life, and what is conscious? And I think that will be very fascinating to bring up. It might become illegal to build systems that are capable of such a level of intelligence that consciousness would emerge, and therefore the capacity to suffer would emerge, and a system that says: no, please don't kill me.
Well, that's what the LaMDA chatbot already told this Google engineer, right? It was talking about not wanting to die, and so on.
So that might become illegal to do. Right, because otherwise you might have a lot of creatures that don't want to die, and you can just spawn an infinity of them on a cluster. And then that might lead to horrible consequences, because then there might be a lot of people that secretly love murder, and then we'll start practicing murder on those systems.
I mean, to me, all of this stuff just brings a beautiful mirror to the human condition and human nature, and we'll get to explore it. And that's, like, the best of the Supreme Court: all the different debates we have about ideas of what it means to be human. We get to ask those deep questions that we've been asking throughout human history. There's always been the other in human history: we're the good guys and they're the bad guys, and throughout human history, let's murder the bad guys. And the same will probably happen with robots. It will be the other at first, and then we'll get to ask questions of: what does it mean to be alive? What does it mean to be conscious?
Yep. And I think there's some canary in the coal mine, even with what we have today. For example, there are these waifus that you can work with, and this company is going to shut down, but this person really loves their waifu and is trying to port it somewhere else, and it's not possible. I think definitely people will have feelings towards these systems, because in some sense they are a mirror of humanity: they are sort of a big average of humanity, in the way they're trained.
But that average we can actually watch. It's nice to be able to interact with the big average of humanity and do a search query on it.
Yeah. Yeah, it's very fascinating. And we can, of course, also shape it. It's not just a pure average. We can mess with the training data, we can mess with the objective, we can fine-tune them in various ways, so we have some impact on what those systems look like.
If you were to achieve AGI, and you could have a conversation with her, and ask her, talk about anything, maybe ask her a question: what kind of stuff would you ask?
I would have some practical questions in my mind, like: do I or my loved ones really have to die? What can we do about that?
Do you think it would answer clearly, or would it answer poetically?
I would expect it to give solutions. I would expect it to be like: well, I've read all of these textbooks, and I know all these things that you've produced, and it seems to me like here are the experiments that I think would be useful to run next, and here are some gene therapies that I think would be helpful, and here are the kinds of experiments that you should run.
Okay, let's go with this thought experiment. Imagine that mortality is actually a prerequisite for happiness, so if we become immortal, we'll actually become deeply unhappy, and the model is able to know that. So what is it supposed to tell you, stupid human? Yes, you can become immortal, but you will become deeply unhappy. If the AGI system is trying to empathize with you, human, what is it supposed to tell you? That yes, you don't have to die, but you're really not going to like it? Is it going to be deeply honest? Like that moment in Interstellar, what is it, the AI says humans want only 90% honesty. So you have to pick: how honestly do I want to answer these practical questions?
Yeah. I love the AI in Interstellar, by the way. It's such a sidekick to the entire story, but at the same time it's really interesting.
It's kind of limited in certain ways, right?
Yeah, it's limited, and I think that's totally fine, by the way. I think it's fine and plausible to have a limited and imperfect AGI.
Is that almost a feature? As an example, it has a fixed amount of compute on its physical body, and it might just be that, even though you can have a super amazing mega-brain, super-intelligent AI, you can also have less intelligent AIs that you can deploy in a power-efficient way, and then they're not perfect, they might make mistakes.
No, I meant more like: say you had infinite compute, and it's still good to make mistakes sometimes, to integrate yourself. Like, what is it, going back to Good Will Hunting: Robin Williams' character says the human imperfections, that's the good stuff, right? Isn't that... like, we don't want perfect. We want flaws, in part to form connections with each other, because it feels like something you can attach your feelings to, the flaws. In that same way, you want an AI that's flawed. I don't know. I feel like perfection...
But then you're saying, okay, yeah, but that's not AGI. But see, AGI would need to be intelligent enough to give answers to humans that humans don't understand, and I think perfect isn't something humans can understand, because even science doesn't give perfect answers. There's always gaps and mysteries, and I don't know. I don't know if humans want perfect.
Yeah. I could imagine just having a conversation with this kind of oracle entity, as you'd imagine them, and, yeah, maybe it can tell you: based on my analysis of the human condition, you might not want this, and here are some of the things that might...
But every dumb human will say: yeah, yeah, yeah, trust me, give me the truth, I can handle it.
But that's the beauty: people can choose.
But then it's the old marshmallow test with the kids and so on. I feel like too many people can't handle the truth, probably including myself. Like, the deep truth of the human condition, I don't know if I can handle it. What if there's some dark stuff? What if we are an alien science experiment, and it realizes that? I mean, this is The Matrix all over again. I don't know what I would talk about. Probably I would go with the safer scientific questions at first, that have nothing to do with my own personal life and mortality, just about physics and so on, to build up. Like, let's see where it's at. Or maybe see if it has a sense of humor. That's another question. Presumably, if it understands humans deeply, it would be able to generate humor?
Yeah. I think that's actually a wonderful benchmark, almost. I think that's a really good point: basically, is it able to make you laugh?
Yeah. If it's able to be a very effective standup comedian, that is doing something very interesting computationally. I think being funny is extremely hard.
Yeah, because it's hard in the way the Turing test, the original intent of the Turing test, is hard: you have to convince humans. That's why comedians talk about this. It's deeply honest, because if people can't help but laugh... and if they don't laugh, that means you're not funny. If they laugh, it's funny.
And you're showing you need a lot of knowledge to create humor about the human condition and so on, and then you need to be clever with it.
You mentioned a few movies. You tweeted: movies I've seen five-plus times, but I'm ready and willing to keep watching: Interstellar, Gladiator, Contact, Good Will Hunting, The Matrix, Lord of the Rings (all three), Avatar, The Fifth Element, and so on. It goes on: Terminator 2, Mean Girls. I'm not going to ask about that. I think Mean Girls is great. What are some that jump out to your memory that you love, and why? You mentioned The Matrix. As a computer person, why do you love The Matrix?
There are so many properties that make it beautiful and interesting. There are all these philosophical questions, but then there are also AGIs, and there's a simulation, and it's cool, and there's, you know, the look of it, the feel of it, the action, the bullet time. It was just innovating in so many ways.
And then Good Will Hunting. Why do you like that one?
Yeah, I just really like this tortured-genius sort of character who's grappling with whether or not he has any responsibility, or what to do with this gift that he was given, or how to think about the whole thing. And there's also a dance between the genius and the personal: what it means to love another human being. There are a lot of things there. It's just a beautiful movie. And then the fatherly figure, the mentor, in the psychiatrist. It really messes with you. You know, there are some movies that just really mess with you on a deep level.
Do you relate to that movie at all?
No. It's not your fault.
As I said, Lord of the Rings, that's self-explanatory. Terminator 2, which is interesting, you rewatch that a lot. Is that better than Terminator 1?
I do like Terminator 1 as well. I like Terminator 2 a little bit more, but in terms of its surface properties.
Do you think Skynet is at all a possibility?
Yes.
Like the actual sort of autonomous weapon system kind of thing. Do you worry about that stuff?
I do worry. AI being used for war? I a hundred percent worry about it. I mean, some of these fears of AGI and how this will play out: these will be very powerful entities, probably, at some point. And so for a long time they are going to be tools in the hands of humans. People talk about alignment of AGI and how to make... the problem is, even humans are not aligned. So how this will be used and what this is going to look like is, yeah, it's troubling.
Do you think it will happen slowly enough that we'll be able to, as a human civilization, think through the problems?
Yes. That's my hope: that it happens slowly enough, and in an open enough way, where a lot of people can see and participate in it, and just figure out how to deal with this transition. I think it's just going to be interesting. I draw a lot of inspiration from nuclear weapons, because I sure thought it would be fucked once they developed nuclear weapons. But it's almost like, when the systems are not so dangerous that they destroy human civilization, we deploy them and learn the lessons. And if it's too dangerous, we might still deploy it,
but you very quickly learn not to use them. And so there will be this balance achieved.
Humans are very clever as a species. It's interesting. We exploit the resources as much as we can, but we avoid destroying ourselves, it seems like.
Well, I don't know about that, actually. I hope it continues. I mean, I'm definitely concerned about nuclear weapons and so on, not just as a result of the recent conflict; even before that. That's probably my number one concern for humanity.
So if humanity destroys itself, or destroys 90% of people, that would be because of nukes?
I think so. And it's not even about the full destruction, to me. It's bad enough if we reset society. That would be terrible. It would be really bad. And I can't believe we're so close to it. It's so crazy to me.
It feels like we might be a few tweets away from something like that.
Yep. Basically, it's extremely unnerving, and it has been for me for a long time.
It seems unstable: that world leaders, just having a bad mood, can take one step towards a bad direction, and it escalates.
Yeah. And because of a collection of bad moods, it can escalate without being able to stop.
Yeah. It's just a huge amount of power. And then also, with the proliferation, basically, I don't actually know what the good outcomes are here. So I'm definitely worried about that a lot. And then AGI is not currently there, but at some point it will more and more become something like it. The danger with AGI, even, is that I think it's slightly worse, in the sense that there are good outcomes of AGI, and then the bad outcomes are an epsilon away, a tiny bit away. And so I think capitalism and humanity and so on will drive for the positive ways of using that technology. But then, if bad outcomes are just a tiny flipped minus sign away, that's a really bad position to be in: a tiny perturbation of the system results in the destruction of the human species. It's a fine line to walk. Yeah. I think, in general, what's really weird about the dynamics of humanity, and this explosion we've talked about, is just the insane coupling afforded by technology, and just the instability of the whole dynamical system. I think it just doesn't look good, honestly.
Yes. That explosion could be destructive or constructive, and the probabilities are non-zero in both.
Yeah. I mean, I do feel like I have to try to be optimistic and so on, and I think even in this case I still am predominantly optimistic, but there's definitely...
Me too.
Do you think we'll become a multi-planetary species?
Probably yes, but I don't know if it's a dominant feature of future humanity. There might be some people on some planets and so on, but I'm not sure if it's, yeah, if it's a major player in our culture and so on. We still have to solve the drivers of self-destruction here on Earth, so just having a backup on Mars is not going to solve the problem. By the way, I love the backup on Mars. I think that's amazing. You should absolutely do that.
Yes. And I'm so thankful. Would you go to Mars?
Personally, no. I do like Earth quite a lot.
Okay. I'll go to Mars. I'll go for you. I'll tweet at you from there.
Maybe eventually I would, once it's safe enough, but I don't actually know if it's on my lifetime scale, unless I can extend it by a lot. I do think that, for example, a lot of people might disappear into virtual realities and stuff like that, and I think that could be the major thrust of the cultural development of humanity, if it survives. So it might not be... it's just really hard to work in the physical realm and go out there, and I think ultimately all your experiences are in your brain. And so it's much easier to disappear into the digital realm, and I think people will find it more compelling, easier, safer, more interesting.
So you're a little bit captivated by virtual reality, by the possible worlds, whether it's the metaverse or some other manifestation of that?
Yeah. Yeah, it's really interesting.
I'm interested, just having talked a lot to Carmack: where's the thing that's currently preventing that?
Yeah. I mean, to be clear, I think what's interesting about the future is that the variance in the human condition grows. That's the primary thing that's changing. It's not so much the mean of the distribution; it's the variance of it. So there will probably be people on Mars, and there will be people in VR, and there will be people here on Earth. There will just be so many more ways of being. And so I kind of see it as a spreading out of the human experience.
There's something about the internet that allows you to discover those little groups, and then you gravitate to... something about your biology likes that kind of world, and you find each other.
Yeah. And we'll have transhumanists, and then we'll have the Amish, and everything is just going to coexist. You know, the cool thing about it, because I've interacted with a bunch of internet communities, is they don't know about each other. You can have a very happy existence just having a very close-knit community and not knowing about each other. I mean, you even sense this just having traveled to Ukraine: they don't know so many things about America. When you travel across the world, I think you experience this too. There are certain cultures that have their own thing going on. And so you can see that happening more and more in the future: little communities.
Yeah. Yeah, I think so. That seems to be how it's going right now, and I don't see that trend really reversing. I think people are diverse, and they're able to choose their own path and existence, and I sort of celebrate that.
Will you spend much time in the metaverse, in virtual reality? Which community are you? Are you the physical reality enjoyer, or do you see yourself drawing a lot of pleasure and fulfillment from the digital world?
Yeah, I think, well, currently the virtual reality is not that compelling. I do think it can improve a lot, but I don't really know to what extent. Maybe, you know, there are actually even more exotic things you can think about, with neural links or stuff like that. So currently I kind of see myself as mostly a team human person. I love nature, I love harmony, I love people, I love humanity, I love the emotions of humanity, and I just want to be in this solarpunk little utopia.
That's my happy place.
Yes. My happy place is people I love, thinking about cool problems, surrounded by a lush, beautiful, dynamic nature, and secretly high-tech in places that count. Places that use technology to empower that love for other humans and nature.
Yeah. I think technology used very sparingly. I don't love when it gets in the way of humanity, in many ways. I like just people being humans, in a way we sort of slightly evolved and prefer, I think, just by default.
People kept asking me, because they know you love reading: are there particular books that you enjoyed, that had an impact on you, for silly or for profound reasons, that you would recommend? You mentioned The Vital Question.
Many, of course. I think in biology, as an example, The Vital Question is a good one. Anything by Nick Lane, really. Life Ascending, I would say, is potentially a bit more representative, as a summary of a lot of the things he's been writing about. I was very impacted by The Selfish Gene. I thought that was a really good book that helped me understand altruism, as an example, and where it comes from. And just realizing that the selection is on the level of genes was a huge insight for me at the time, and it sort of cleared up a lot of things for me.
What do you think about the idea that the ideas are the organisms, the memes?
Yes, love it. A hundred percent.
Are you able to walk around with that notion for a while, that there is an evolutionary kind of process with ideas as well?
There absolutely is. There are memes, just like genes, and they compete, and they live in our brains. It's beautiful.
Are we silly humans thinking that we're the organisms? Is it possible that the primary organisms are the ideas?
Yeah, I would say the ideas kind of live in the software of our civilization, in the minds, and so on.
We think, as humans, that the hardware is the fundamental thing. A human is a hardware entity. But it could be the software, right?
Yeah. Yeah, I would say there needs to be some grounding at some point to a physical reality.
Yeah. But if we clone an Andrej, the software is the thing. Isn't that the thing that makes that thing special, right?
Yeah, I guess you're right. But then cloning might be exceptionally difficult. There might be a deep integration between the software and the hardware, in ways we don't quite understand.
Well, from the ultimate point of view, what makes me special is more like the gang of genes that are riding in my chromosomes, I suppose, right? They're the replicating unit, I suppose.
No, but that's just the thing that makes you special.
Sure.
Well, the reality is, what makes you special is your ability to survive based on the software that runs on the hardware that was built by the genes. So the software is the thing that makes you survive, not the hardware.
All right. It's a little bit of both. I mean, you know, it's just like a second layer. It's a new second layer that hasn't been there before the brain. They both coexist.
But there are also layers of the software. I mean, it's an abstraction on top of abstractions.
But okay. So Selfish Gene, and Nick Lane. I would say sometimes books are not sufficient. I like to reach for textbooks sometimes. I kind of feel like books are for too general a consumption sometimes.
They're just too high up in the level of abstraction, and it's not good enough. So I like textbooks. I like The Cell. I think The Cell was pretty cool. That's also why I like the writing of Nick Lane: he's pretty willing to step one level down, and yeah, he's willing to go there. But he's also willing to be throughout the stack. So he'll go down to a lot of detail, but then he will come back up. Basically, I really appreciate that.
That's why I love college, early college, even high school: just textbooks on the basics of computer science, of mathematics, of biology, of chemistry. They condense it down. It's sufficiently general that you can understand both the philosophy and the details, but also you get homework problems, and you get to play with it as much as you would if you were programming stuff.
Yeah. And then I'm also suspicious of textbooks, honestly, because, as an example, in deep learning there's no amazing textbook, and the field is changing very quickly. I imagine the same is true in, say, synthetic biology and so on. These books like The Cell are kind of outdated. They're still high level. What is the actual real source of truth? It's people in wet labs working with cells, sequencing genomes, and, yeah, actually working with it. And I don't have that much exposure to that, or what that looks like. So I'm reading through The Cell, and it's kind of interesting, and I'm learning, but it's still not sufficient, I would say, in terms of understanding.
Well, it's a clean summarization of the mainstream narrative, but you have to learn that before you can break out towards the cutting edge.
Yeah. But what is the actual process of working with these cells, and growing them, and incubating them? It's kind of like a massive collection of cooking recipes: making sure your cells live and proliferate, and then you're sequencing them, running experiments. Just how that works, I think, is kind of the source of truth of, at the end of the day, what's really useful in terms of creating therapies and so on.
Yeah. I wonder what the AI textbooks of the future will be, because there's Artificial Intelligence: A Modern Approach. I actually haven't read the most recent version; there's been a recent edition. I also saw there's a Science of Deep Learning book. I'm waiting for textbooks that are worth recommending, worth reading. It's tricky, because it's like papers, and code, code, code.
Honestly, I find papers are quite good. I especially like the appendix of any paper as well. It's the most detail you can have. It doesn't have to be cohesive, connected to anything else. You just describe, in a very specific way, how you did the particular thing.
Yeah. Many times papers can be actually quite readable. Not always, but sometimes the introduction and the abstract are readable, even for someone outside of the field. This is not always true. Sometimes I think, unfortunately, scientists use complex terms even when it's not necessary. I think that's harmful. I think there's no reason for that. Papers sometimes are longer than they need to be, in the parts that don't matter. The appendix should be long, but the paper itself, look at Einstein, make it simple.
Yeah. But certainly I've come across papers, I would say in synthetic biology or something, that I thought were quite readable for the abstract and the introduction.
Then you're reading the rest of it and you don't fully understand, but you are getting a gist, and I think it's cool.
You give advice to folks interested in machine learning and research, but what general life advice would you give to a young person in high school or early college about how to have a career they can be proud of, or a life they can be proud of?
Yeah, I think I'm very hesitant to give general advice. I think it's really hard. Some of the stuff I've mentioned is fairly general, I think: focus on just the amount of work you're spending on a thing; compare yourself only to yourself, not to others.
That's good. I think those are fairly general. How do you pick the thing? You just have a deep interest in something, or try to find the argmax over the things that you're interested in?
Argmax at that moment, and stick with it. How do you not get distracted and switch to another thing? You can, if you like.
But if you do an argmax repeatedly every week, every month, it's a problem.
Yeah. You can low-pass filter yourself, in terms of what has consistently been true for you. I definitely see how it can be hard, but I would say you're going to work the hardest on the thing that you care about the most. Low-pass filter yourself, and really introspect: in your past, what are the things that gave you energy, and what are the things that took energy away from you? Concrete examples. Usually, from those concrete examples, sometimes patterns can emerge: I like it when things look like this, when I'm in these positions.
That's not necessarily the field, but the kind of stuff you're doing in a particular field. For you, it seems like you were energized by implementing stuff, building actual things.
Yeah. Being low level, learning, and then also communicating, so that others can go through the same realizations, and shortening that gap. Because I usually have to do way too much work to understand a thing, and then I'm like: okay, I actually think I get it. Why was it so much work? It should have been much less work. That gives me a lot of frustration, and that's why I sometimes go teach.
Aside from the teaching you're doing now, putting out videos, and aside from a potential Godfather Part II with the AGI at Tesla and beyond, what does the future of Andrej Karpathy hold? Have you figured that out yet, or no? As you see through the fog of war that is all of our future, do you start seeing silhouettes of what that possible future could look like?
The consistent thing I've always been interested in, for me at least, is AI. That's probably what I'm spending the rest of my life on, because I just care about it a lot. I actually care about many other problems as well, like, say, aging, which I basically view as a disease. I care about that as well, but I don't think it's a good idea to go after it specifically. I don't actually think that humans will be able to come up with the answer. I think the correct thing to do is to ignore those problems: you solve AI, and then use that to solve everything else. I think there's a chance that this will work. I think it's a very high chance. That's the way I'm betting, at least.
When you think about AI, are you interested in all kinds of applications, all kinds of domains, where any domain you focus on will allow you to get insights into the big problem of AGI?
Yeah. For me, it's the ultimate meta problem. I don't want to work on any one specific problem. There are too many problems.
How can you work on all problems simultaneously? You solve the meta problem, which to me is just intelligence, and how do you automate it?
Are there cool small projects, like Arxiv Sanity and so on, that you're thinking about, that the world, the ML world, can anticipate?
There are always some fun side projects. Arxiv Sanity is one. Basically, there are way too many arXiv papers; how can I organize them, and recommend papers, and so on?
I transcribed all of your podcasts.
What did you learn from that experience, from transcribing? The process of... you like consuming audiobooks and podcasts and so on; here's a process that achieves something closer to human-level performance in annotation.
Yeah. Well, I definitely was surprised that transcription with OpenAI's Whisper was working so well, compared to what I'm familiar with from Siri and a few other systems, I guess. It works so well. That's what gave me some energy to try it out, and I thought it could be fun to run it on podcasts. It's not obvious to me why Whisper is so much better compared to anything else, because I feel like there should be a lot of incentive for a lot of companies to produce transcription systems, and they've done so over a long time. Whisper is not a super exotic model. It's a transformer. It takes mel spectrograms and just outputs tokens of text. It's not crazy. The model and everything has been around for a long time. I'm not actually one hundred percent sure why...
Yeah, it's not obvious to me either. It makes me feel like I'm missing something.
I'm missing something.
Yeah, because there is a huge incentive; even Google, and so on, YouTube transcription...
Yeah. Yeah, it's unclear. But some of it is also integrating it into a bigger system: the user interface, how it's deployed, and all that kind of stuff. Maybe running it as an independent thing is much easier, like an order of magnitude easier, than deploying it into a large integrated system like YouTube transcription, or anything like meetings. Zoom has transcription that's kind of crappy. But creating an interface where it detects the different individual speakers, is able to display it in compelling ways, runs in real time, all that kind of stuff, maybe that's difficult. That's the only explanation I have, because I'm currently paying quite a bit for human transcription and human caption annotation, and it seems like there's a huge incentive to automate that.
Yeah. It's very confusing.
I mean, I don't know if you looked at some of the Whisper transcripts, but they're quite good.
They're good. Especially in tricky cases. I've seen Whisper's performance on super tricky cases, and it does incredibly well. I don't know. A podcast is pretty simple. It's high-quality audio, and you're speaking usually pretty clearly. So I don't know. I don't know what OpenAI's plans are either.
Yeah, there are always fun projects, basically. Stable Diffusion also is opening up a huge amount of experimentation, I would say, in the visual realm: generating images and videos and movies.
Videos now. That's going to be pretty crazy.
That's going to almost certainly work, and it's going to be really interesting when the cost of content creation falls to zero. You used to need a painter for a few months to paint a thing, and now it's going to be: speak to your phone to get your video.
Hollywood will start using that to generate scenes, which completely opens up... Yeah. So you can make a movie like Avatar eventually for under a million dollars?
Much less.
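For reference, the simplicity being described here is visible in the open-source whisper package itself (pip install openai-whisper). A hedged sketch; the checkpoint name and file path below are placeholders:

    import whisper

    # Load one of the released checkpoints: "tiny", "base", "small", "medium", "large".
    model = whisper.load_model("base")

    # transcribe() loads the audio, converts it to log-mel spectrograms,
    # and decodes text tokens, roughly the pipeline described above.
    result = model.transcribe("episode.mp3")  # placeholder file path
    print(result["text"])

Similarly, the Stable Diffusion experimentation mentioned here is a few lines through the Hugging Face diffusers library; a minimal sketch, assuming a CUDA GPU, with the model ID and prompt chosen only as examples:

    import torch
    from diffusers import StableDiffusionPipeline

    # Download pretrained Stable Diffusion weights and move them to the GPU.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")

    # One text prompt in, one image out.
    image = pipe("a solarpunk village, lush nature, golden hour").images[0]
    image.save("generated.png")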
Maybe just by talking to your phone. I mean, I know it sounds kind of crazy.
And then there would be some voting mechanism. Would there be a show on Netflix that's generated completely automatically?
Yeah, potentially. And what does it look like also when you can just generate it on demand, and there's an infinity of it?
Yeah. Oh man. All the synthetic art. I mean, it's humbling, because we treat ourselves as special for being able to generate art and ideas and all that kind of stuff. If that can be done in an automated way by AI...
Yeah. I think it's fascinating to me how these predictions of AI, and what it's going to look like and what it's going to be capable of, are completely inverted and wrong. The sci-fi of the '50s and '60s was just totally not right. They imagined AI as super calculating theorem provers, and we're getting things that can talk to you about emotions. They can do art. It's just weird.
Are you excited about that future? Just AIs, hybrid systems, heterogeneous systems of humans and AIs talking about emotions, Netflix and chill with an AI system, where the Netflix thing you watch is also generated by AI?
I think it's going to be interesting, for sure, and I think I'm cautiously optimistic, but it's not obvious.
Well, the sad thing is, your brain and mine developed in a time before Twitter, before the internet. So I wonder about people that are born inside of it; they might have a different experience. Maybe you and I will still resist it, and the people born now will not.
Well, I do feel like humans are extremely malleable.
Yeah. And you're probably right.
What is the meaning of life, Andrej? We talked about sort of the universe having a conversation with us humans, or with the systems we create, to try to answer... for the universe, for the creator of the universe, to notice us. We're trying to create systems that are loud enough to answer back.
I don't know if that's the meaning of life. That's, like, the meaning of life for some people. The first-level answer, I would say, is: anyone can choose their own meaning of life, because we are a conscious entity, and it's beautiful. Number one. But I do think that a deeper meaning of life, if someone is interested, is along the lines of: what the hell is all this, and why? And if you look into fundamental physics, and the quantum field theory, and the Standard Model, they're very complicated. And there are these 19 free parameters of our universe, and what's going on with all this stuff, and why is it here? And can I hack it? Can I work with it? Is there a message for me? Am I supposed to create a message? And so I think there are some fundamental answers there, but I think you can't actually really make a dent in those without more time. And so, to me, there's also a big question around just getting more time, honestly. Yeah, that's kind of what I think about quite a bit as well.
So kind of the ultimate, or at least the first, way to sneak up on the why question is to try to escape the system, the universe. And then, for that, you sort of backtrack and say: okay, that's going to take a very long time. So the why question boils down, from an engineering perspective, to: how do we extend?
Yeah, I think that's question number one, practically speaking, because you're not going to calculate the answer to the deeper questions in the time you have.
And that could be extending your own lifetime, or extending the lifetime of human civilization, for whoever wants that; many people might not want that. But for the people who do want it, I think it's probably possible. And I don't know that people fully realize this. I kind of feel like people think of death as an inevitability, but at the end of the day, this is a physical system. Some things go wrong. It makes sense why things like this happen, evolutionarily speaking. And there are most certainly interventions that mitigate it.
That would be interesting, if death is eventually looked at as a fascinating thing that used to happen to humans.
I don't think it's unlikely. I think it's likely.
And it's up to our imagination to try to predict what the world without death looks like.
Yeah, it's hard to... I think the values will completely change.
Could be. I don't really buy all these ideas that, oh, without death there's no meaning, there's nothing... I don't intuitively buy all those arguments. I think there's plenty of meaning, plenty of things to learn. They're interesting, exciting. I want to know, I want to calculate, I want to improve the condition of all the humans and organisms that are alive.
Yeah, the way we find meaning might change. There are a lot of humans, probably including myself, that find meaning in the finiteness of things. But that doesn't mean that's the only source of meaning.
Yeah. I do think many people will go with that, which I think is great. I love the idea that people can just choose their own adventure. Like, you are born as a conscious, free entity, by default, I'd like to think, and you have your unalienable rights for life.
And the pursuit of happiness.
I don't know if you have that in nature, the landscape of happiness. You can choose your own adventure, mostly.
And that's not fully true, but I still am pretty sure I'm an NPC. But an NPC can't know it's an NPC.
Hmm. There could be different degrees and levels of consciousness.
I don't think there's a more beautiful way to end it. Andrej, you're an incredible person. I'm really honored you would talk with me. Everything you've done for the machine learning world, for the AI world, to just inspire people, to educate millions of people: it's been great, and I can't wait to see what you do next. It's been an honor, man. Thank you so much for talking today.
Awesome. Thank you.
Thanks for listening to this conversation with Andrej Karpathy. To support this podcast, please check out our sponsors in the description. And now, let me leave you with some words from Samuel Karlin: the purpose of models is not to fit the data, but to sharpen the questions. Thanks for listening, and hope to see you next time.