...some kind of a crazy quantum mechanical system that somehow gives you a buffer overflow, somehow gives you a rounding error in the floating point.
Synthetic intelligences are kind of like the next stage of development.
And I don't know where it leads to.
Like at some point, I suspect the universe is some kind of a puzzle.
These synthetic AIs will uncover that puzzle and solve it.
The following is a conversation with Andrej Karpathy, previously the director of AI at Tesla, and before that at OpenAI and Stanford.
He is one of the greatest scientists, engineers, and educators in the history of artificial
intelligence.
This is the Lex Fridman podcast.
To support it, please check out our sponsors.
And now, dear friends, here's Andrej Karpathy.
What is a neural network?
And why does it seem to do such a surprisingly good job of learning?
What is a neural network?
It's a mathematical abstraction of the brain.
I would say that's how it was originally developed.
At the end of the day, it's a mathematical expression.
It's a fairly simple mathematical expression when you get down to it.
It's basically a sequence of matrix multiplies, which are really dot products mathematically, with some non-linearities thrown in.
It's a very simple mathematical expression, and it's got knobs in it.
Many knobs.
Many knobs.
These knobs are loosely related to the synapses in your brain.
They're trainable, they're modifiable.
The idea is we need to find the setting of the knobs that makes the neural net do whatever
you want it to do, like classify images and so on.
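To make that concrete, here is a minimal sketch of the kind of expression he's describing: a couple of matrix multiplies with a non-linearity in between, where the weight matrices are the "knobs." The sizes and names here are illustrative, not from the conversation.

```python
import numpy as np

# A tiny two-layer neural net: matrix multiplies (dot products) plus a
# non-linearity. W1, b1, W2, b2 are the trainable "knobs".
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(784, 128)) * 0.01, np.zeros(128)
W2, b2 = rng.normal(size=(128, 10)) * 0.01, np.zeros(10)

def forward(x):
    h = np.maximum(0, x @ W1 + b1)   # ReLU non-linearity
    return h @ W2 + b2               # class scores, e.g. for image labels

x = rng.normal(size=(1, 784))        # a flattened 28x28 "image"
print(forward(x).shape)              # (1, 10)
```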
There's not too much mystery, I would say, in it.
You might think that you don't want to endow it with too much meaning with respect to the
brain and how it works.
It's really just a complicated mathematical expression with knobs, and those knobs need
a proper setting for it to do something desirable.
Yeah, but poetry is just a collection of letters with spaces, but it can make us feel a certain
way.
In that same way, when you get a large number of knobs together, whether it's inside the
brain or inside a computer, they seem to surprise us with their power.
I think that's fair.
I'm underselling it by a lot because you definitely do get very surprising emergent behaviors
out of these neural nets when they're large enough and trained on complicated enough problems,
like say for example, the next word prediction in a massive data set from the internet.
These neural nets take on pretty surprising magical properties.
Yeah, I think it's interesting how much you can get out of even very simple mathematical
formalism.
When your brain right now is talking, is it doing next word prediction?
Or is it doing something more interesting?
Well, it's definitely some kind of a generative model that's GPT-like, and it's prompted by you.
So you're giving me a prompt and I'm kind of responding to it in a generative way.
And by yourself perhaps a little bit?
Are you adding extra prompts from your own memory inside your head?
It definitely feels like you're referencing some kind of a declarative structure of memory
and so on, and then you're putting that together with your prompt and giving away some answer.
How much of what you just said has been said by you before?
Nothing basically, right?
No, but if you actually look at all the words you've ever said in your life and you do a
search, you'll probably have said a lot of the same words in the same order before.
Yeah, could be.
I mean, I'm using phrases that are common, et cetera, but I'm remixing it into a pretty
unique sentence at the end of the day.
But you're right, definitely there's a ton of remixing.
It's like Magnus Carlsen saying, I'm rated 2,900 or whatever, which is pretty decent.
I think you're not giving enough credit to neural nets here.
What's your best intuition about this emergent behavior?
I mean, it's kind of interesting, because I'm simultaneously underselling them, but I also feel like there's an element to which I'm overselling them.
It's actually kind of incredible that you can get so much emergent, magical behavior out of them despite them being so simple mathematically.
So I think those are two surprising statements that are kind of juxtaposed together.
And I think basically what it is, is we are actually fairly good at optimizing these neural
nets.
And when you give them a hard enough problem, they are forced to learn very interesting
solutions in the optimization.
And those solutions basically have these emergent properties that are very interesting.
There's wisdom and knowledge in the knobs.
And so this representation that's in the knobs, does it make sense to you intuitively that a large number of knobs can hold a representation that captures some deep wisdom about the data it has looked at?
It's a lot of knobs.
It's a lot of knobs.
And somehow, you know, so speaking concretely, one of the neural nets that people are very
excited about right now are GPTs, which are basically just next word prediction networks.
So you consume a sequence of words from the internet and you try to predict the next word.
And once you train these on a large enough data set, you can basically prompt these neural
nets in arbitrary ways and you can ask them to solve problems and they will.
So you can just tell them, you can make it look like you're trying to solve some kind
of a mathematical problem and they will continue what they think is the solution based on what
they've seen on the internet.
And very often those solutions look remarkably consistent, potentially correct.
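As a rough sketch of what that next-word (next-token) prediction loop looks like mechanically, with greedy decoding; the `toy_model` below is a hypothetical stand-in for a trained network, not a real API.

```python
import numpy as np

def generate(model, prompt_tokens, n_new):
    """Greedy decoding: repeatedly ask the model for the most likely next token."""
    tokens = list(prompt_tokens)
    for _ in range(n_new):
        probs = model(tokens)                 # distribution over the vocabulary
        tokens.append(int(np.argmax(probs)))  # pick the most likely next token
    return tokens

# A stand-in "model": a toy that always favors previous token + 1.
def toy_model(tokens, vocab_size=100):
    probs = np.full(vocab_size, 1e-6)
    probs[(tokens[-1] + 1) % vocab_size] = 1.0
    return probs / probs.sum()

print(generate(toy_model, [5], 5))  # [5, 6, 7, 8, 9, 10]
```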
Do you still think about the brain side of it?
Neural nets being an abstraction, or a mathematical abstraction, of the brain, do you still draw wisdom from biological neural networks?
Or even the bigger question:
So you're a big fan of biology and biological computation.
What impressive thing is biology doing to you that computers are not yet?
That gap?
I would say I'm much more hesitant with the analogies to the brain than you would see, potentially, in the field.
And I kind of feel like certainly the way neural networks started is everything stemmed
from inspiration by the brain.
But at the end of the day, the artifacts that you get after training, they are arrived
at by a very different optimization process than the optimization process that gave rise
to the brain.
And so I think, I kind of think of it as a very complicated alien artifact.
It's something different.
The brain?
I'm sorry, the neural nets that we're training.
They are complicated alien artifacts.
I do not make analogies to the brain because I think the optimization process that gave
rise to it is very different from the brain.
There was no multi-agent self-play kind of setup in evolution.
It was an optimization that is basically what amounts to a compression objective on a massive
amount of data.
Okay.
So artificial neural networks are doing compression, and biological neural networks are not really doing compression?
They're an agent in a multi-agent self-play system that's been running for a very, very
long time.
That said, evolution has found that it is very useful to predict and have a predictive
model in the brain.
And so I think our brain utilizes something that looks like that as a part of it, but
it has a lot more gadgets and gizmos and value functions and ancient nuclei that are all
trying to make you survive and reproduce and everything else.
And the whole thing through embryogenesis is built from a single cell.
The code is inside the DNA, and it just builds up the entire organism, with arms and the head and legs.
And it does it pretty well.
It should not be possible.
So there's some learning going on.
There's some kind of computation going through that building process.
If you were just to look at the entirety of the history of life on Earth, what do you think is the most interesting invention?
Is it the origin of life itself?
Is it just jumping to eukaryotes?
Is it mammals?
Is it humans themselves, Homo sapiens?
The origin of intelligence or highly complex intelligence?
Or is it all just a continuation of the same kind of process?
Certainly I would say it's an extremely remarkable story that I'm only briefly learning about
recently.
All the way from, actually, you almost have to start at the formation of Earth and all of its conditions, and the entire solar system, and how everything is arranged with Jupiter and the Moon and the habitable zone and everything.
And then you have an active Earth that's turning over material, and then you start with abiogenesis and everything.
So it's all like a pretty remarkable story.
I'm not sure that I can pick like a single unique piece of it that I find most interesting.
I guess for me as an artificial intelligence researcher, it's probably the last piece.
We have lots of animals that are not building technological society, but we do.
And it seems to have happened very quickly.
It seems to have happened very recently.
And something very interesting happened there that I don't fully understand.
I almost understand everything else, I think intuitively, but I don't understand exactly
that part and how quick it was.
Both explanations would be interesting.
One is that this is just a continuation of the same kind of process.
There's nothing special about humans.
That would be very interesting: we think of ourselves as special, but it was obvious, it was already written in the code that you would have greater and greater intelligence emerging.
And then there's the other explanation: that something truly special happened, some kind of rare event, like a crazy rare event, like in Space Odyssey.
What would it be?
See, if you say the invention of fire, or, as Richard Wrangham says, the beta males deciding on a clever way to kill the alpha males by collaborating.
So just optimizing the collaboration, the multi-agent aspect of it, being really constrained on resources and trying to survive; the collaboration aspect is what created the complex intelligence.
But it seems like it's a natural algorithm, the evolutionary process.
What could possibly be a magical thing that happened, a rare event, that would say that human-level intelligence is actually a really rare thing in the universe?
Yeah, I'm hesitant to say that it is rare by the way, but it definitely seems like it's
kind of like a punctuated equilibrium where you have lots of exploration and then you
have certain leaps, sparse leaps in between.
So of course, origin of life would be one: DNA, sex, eukaryotic life, the endosymbiosis event where an archaeon ate a bacterium, just the whole thing.
And then of course, emergence of consciousness and so on.
So it seems like definitely there are sparse events where a massive amount of progress
was made, but yeah, it's kind of hard to pick one.
So you don't think humans are unique?
I've got to ask you, how many intelligent alien civilizations do you think are out there?
And is their intelligence different or similar to ours?
Yeah, I've been preoccupied with this question quite a bit recently, basically the Fermi
paradox and just thinking through.
And the reason actually that I am very interested in the origin of life is fundamentally trying
to understand how common it is that there are technological societies out there in space.
And the more I study it, the more I think that there should be quite a lot.
Why haven't we heard from them?
Because I agree with you.
It feels like I just don't see why what we did here on Earth is so difficult to do.
Yeah, and especially when you get into the details of it.
I used to think the origin of life was this magical, rare event, but then you read books like, for example, Nick Lane's The Vital Question, Life Ascending, et cetera.
And he really gets in and he really makes you believe that this is not that rare.
Basic chemistry.
You have an active Earth and you have your alkaline vents and you have lots of alkaline
waters mixing with the ocean and you have your proton gradients and you have the little
porous pockets of these alkaline vents that concentrate chemistry.
And basically as he steps through all of these little pieces, you start to understand that
actually this is not that crazy.
You could see this happen on other systems.
And he really takes you from just geology to primitive life, and he makes it feel like it's actually pretty plausible.
And also, the origin of life was actually fairly fast after the formation of Earth.
If I remember correctly, just a few hundred million years or something like that after it was basically possible, life actually arose.
So that makes me feel like that is not the constraint, that is not the limiting variable
and that life should actually be fairly common.
And then where the drop-offs are is very interesting to think about.
I currently think that there's no major drop-offs basically, and so there should be quite a
lot of life.
And basically where that brings me to then is the only way to reconcile the fact that
we haven't found anyone and so on is that we just can't, we can't see them.
We can't observe them.
Just a quick brief comment.
Nick Lane and a lot of biologists I talked to, they really seem to think that the jump
from bacteria to more complex organisms is the hardest jump.
The eukaryotic life basically.
Yeah. Which, I get it; they're much more knowledgeable than me about the intricacies of biology, but that seems crazy.
How many single-cell organisms are there?
And how much time do you have?
Surely, it's not that difficult.
And a billion years is not even that long of a time really.
Just all these bacteria under constrained resources battling it out.
I'm sure they can invent more complex things.
I don't understand; it's like moving from a hello world program to inventing a function or something like that.
Yeah.
So yeah, I'm with you.
I just feel like the origin of life, that would be my intuition, that's the hardest thing.
But if that's not the hardest thing, because it happens so quickly, then it's got to be
everywhere.
And yeah, maybe we're just too dumb to see it.
Well, it's just that we don't have really good mechanisms for seeing this life.
So, I'm not an expert, just to preface this, but just from what I think about it...
I want to meet an expert on alien intelligence and how to communicate.
I'm very suspicious of our ability to find these intelligences out there, and to find these Earths.
Radio waves, for example, are terrible: their power drops off basically as one over r squared.
I remember reading that the radio waves we are broadcasting today would not be measurable by our own devices from, was it, even one tenth of a light year away?
Basically a tiny distance, because you really need a targeted transmission of massive power, directed somewhere, for it to be picked up over long distances.
And so I just think that our ability to measure is not amazing.
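For the one-over-r-squared point, here is a quick back-of-the-envelope sketch; the transmitter power is an illustrative number, and an isotropic broadcast is assumed.

```python
import math

# Received flux from an isotropic transmitter falls off as P / (4 * pi * r^2):
# 10x the distance means 100x weaker signal.
P = 1e6                      # transmitter power in watts (illustrative)
ly = 9.46e15                 # one light-year in meters

for r_ly in (0.1, 1.0, 10.0):
    r = r_ly * ly
    flux = P / (4 * math.pi * r**2)   # watts per square meter at distance r
    print(f"{r_ly:>5} ly: {flux:.2e} W/m^2")
```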
I think there's probably other civilizations out there.
And then the big question is why they don't build von Neumann probes, and why they don't interstellar travel across the entire galaxy.
And my current answer is that interstellar travel is probably just really hard.
You have the interstellar medium.
If you want to move at close to the speed of light, you're going to be encountering bullets along the way, because even tiny hydrogen atoms and little particles of dust basically have massive kinetic energy at those speeds.
And so basically you need some kind of shielding, and you have all the cosmic radiation.
It's just like brutal out there.
It's really hard.
And so my thinking is maybe interstellar travel is just extremely hard.
And you have to go very slow.
Like, billions-of-years-to-build hard?
It feels like we're not a billion years away from doing that.
It just might be that you have to travel very slowly, potentially, through space.
Right.
As opposed to close to the speed of light.
So I'm suspicious basically of our ability to measure life and I'm suspicious of the
ability to just permeate all of space in the galaxy or across galaxies.
And that's the only way that I can currently see around it.
It's kind of mind blowing to think that there's trillions of intelligent alien civilizations
out there kind of slowly traveling through space to meet each other.
And some of them meet, some of them go to war, some of them collaborate.
Or they're all just independent.
They're all just like little pockets.
Well, statistically, if there are trillions of them, surely some of the pockets happen to be close enough to see each other.
And once you see something that is definitely complex life, like if we see something, we're probably going to be intensely, aggressively motivated to figure out what the hell that is and try to meet them.
But what would be your first instinct, at a generational level: to try to meet them, or to defend against them?
Or what would be your instinct as a president of the United States and a scientist?
I don't know which hat you prefer in this question.
Yeah, I think the question, it's really hard.
I will say like, for example, for us, we have lots of primitive life forms on earth next
to us.
We have all kinds of ants and everything else and we share space with them.
And we are hesitant to impact them; we're trying to protect them by default, because they are amazing, interesting dynamical systems that took a long time to evolve, and they are interesting and special.
And I don't know that you want to destroy that by default.
And so I like complex dynamical systems that took a lot of time to evolve, and I'd like to preserve them if I can afford to.
And I'd like to think that the same would be true about the galactic resources, and that they would think we're a kind of incredible, interesting story that took a few billion years to unravel, and you don't want to just destroy it.
I could see two aliens talking about Earth right now and saying, I'm a big fan of complex dynamical systems, so I think there's value in preserving these; and Earth basically becomes a video game they watch, or a TV show that they watch.
Yeah, I think you would need like a very good reason, I think, to destroy it.
Like why don't we destroy these ant farms and so on?
Because we're not actually like really in direct competition with them right now.
We do it accidentally and so on, but there's plenty of resources.
And so why would you destroy something that is so interesting and precious?
Well from a scientific perspective, you might probe it.
You might interact with it lightly.
You might want to learn something from it, right?
So I wonder if there could be certain phenomena that we think are physical phenomena, but are actually them interacting with us to, like, poke the finger and see what happens.
I think it should be very interesting to scientists, other alien scientists, what happened here.
And you know, what we're seeing today is a snapshot; basically, it's a result of a huge amount of computation over a billion years or something like that.
So it could have been initiated by aliens.
This could be a computer running a program.
Okay, if you had the power to do this, for sure, at least I would: I would pick an Earth-like planet that has the conditions, based on my understanding of the chemistry prerequisites for life, and I would seed it with life and run it.
Right?
Like, wouldn't you 100% do that and observe it and then protect?
I mean, that's not just a hell of a good TV show.
It's a good scientific experiment.
And it is a physical simulation, right?
Evolution, actually running it, is the most efficient way to understand computation, or to compute stuff, or to understand life and what it looks like and what branches it can take.
It does make me kind of feel weird that we're part of a science experiment, but maybe everything's a science experiment.
So does that change anything for us, if it's a science experiment?
Um, I don't know.
Two descendants of apes talking about being inside of a science experiment.
I'm suspicious of this idea of a deliberate panspermia, as you described it.
I don't see evidence of divine intervention in the historical record right now.
I do feel like the story in these books, like Nick Lane's books and so on, sort of makes sense, and it makes sense how life arose on Earth uniquely.
And yeah, I don't need to reach for more exotic explanations right now.
Sure.
But NPCs inside a video game don't observe any divine intervention either, and we might all just be NPCs running some kind of code.
Maybe eventually they will.
Currently NPCs are really dumb, but once they're running GPTs, maybe they will be like, hey, this is really suspicious.
What the hell?
So you famously tweeted: it looks like if you bombard Earth with photons for a while, you can emit a Roadster.
So if, like in The Hitchhiker's Guide to the Galaxy, we were to summarize the story of Earth; in that book, it's mostly harmless.
What do you think are all the possible stories, a paragraph long or a sentence long, that Earth could be summarized as once it's done with its computation?
If Earth is a book, there probably has to be an ending.
I mean, there's going to be an end to earth and it could end in all kinds of ways.
It can end soon.
It can end later.
What do you think are the possible stories?
Well, it's pretty incredible that these self-replicating systems basically arise from the dynamics, and then they perpetuate themselves and become more complex, and eventually become conscious and build a society.
And I kind of feel like, in some sense, it's kind of like a deterministic wave that just happens on any sufficiently well-arranged system like Earth.
And so I kind of feel like there's a certain sense of inevitability in it.
And it's really beautiful.
And it ends somehow, right?
So it's a chemically diverse environment where complex dynamical systems can evolve and become further and further complex, but then there are certain, what is it, certain terminating conditions.
Yeah, I don't know what the terminating conditions are, but definitely there's a trend line of
something and we're part of that story.
And where does it go?
So you know, we're famously described often as a biological bootloader for AIs, and that's because humans, I mean, we're an incredible biological system, and, you know, there's love and so on, but we're extremely inefficient as well.
Like, we're talking to each other through audio.
It's just kind of embarrassing, honestly, that we're manipulating, like, seven symbols serially, we're using vocal cords, it's all happening over multiple seconds.
It's just kind of embarrassing when you step down to the frequencies at which computers operate, or are able to communicate at.
So basically it does seem like synthetic intelligences are kind of like the next stage of development, and I don't know where it leads to.
At some point, I suspect the universe is some kind of a puzzle, and these synthetic AIs will uncover that puzzle and solve it.
And then what happens after, right?
Because if you just fast-forward Earth many billions of years, it's quiet, and then it's, like, turmoil: you see city lights and stuff like that.
And then what happens at the end?
Is it a poof and that's it?
Or is it a calming? Is it an explosion?
Does Earth open up like a giant, because you said Roadsters, well, does it start emitting a giant number of satellites?
Yes.
It's some kind of a crazy explosion, and we're stepping through that explosion and living day to day, and it doesn't look like it, but it actually is.
I saw a very cool animation of Earth and life on Earth, and basically nothing happens for a long time, and then in the last, like, two seconds, cities and everything appear, and low Earth orbit just gets cluttered; the whole thing happens in the last two seconds.
And you're like, this is exploding.
This is a state explosion.
Yeah, if you play it at normal speed, it'll just look like an explosion.
It's a firecracker.
We're living in a firecracker.
Where it's going to start emitting all kinds of interesting things.
Yeah.
And the explosion might actually look like a little explosion, with lights and fire and energy emitted, all that kind of stuff.
But when you look inside the details of the explosion, there's actual complexity happening, where there's, like, human life or some kind of life.
We hope it's not a destructive firecracker.
It's kind of like a constructive firecracker.
All right.
So given that, hilarious discussion aside, it is really interesting to think about what the puzzle of the universe is.
Did the creator of the universe give us a message?
Like, for example, in the book Contact by Carl Sagan, there's a message for humanity, for any civilization, in the digits of the expansion of pi in base 11, eventually.
Which is kind of an interesting thought: maybe we're supposed to be giving a message to our creator.
Maybe we're supposed to somehow create some kind of a quantum mechanical system that alerts them to our intelligent presence here.
Because if you think about it from their perspective, it's just, say, some quantum field theory, massive, like, cellular-automaton-like thing.
And like, how do you even notice that we exist?
You might not even be able to pick us up in that simulation.
And so how do you prove that you exist, that you're intelligent, and that you're part of the universe?
So it's like a Turing test for intelligence, from Earth.
Yeah.
For the creator, I mean.
Maybe this is like trying to complete the next word in a sentence; this is a complicated way of doing that.
Earth is basically sending a message back.
Yeah.
The puzzle is basically alerting the creator that we exist.
Or maybe the puzzle is to break out of the system and, you know, stick it to the creator in some way.
Basically, if you're playing a video game, you can somehow find an exploit and find a way to execute arbitrary code on the host machine.
For example, I believe someone got a game of Mario to play Pong just by exploiting it, and then basically writing code and being able to execute arbitrary code in the game.
And so maybe that's the puzzle: we should find a way to exploit it.
So I think some of these synthetic AIs will eventually find the universe to be some kind of a puzzle and then solve it in some way, and that's kind of like the endgame somehow.
Do you often think about it as a simulation?
The universe being a kind of computation that might have bugs and exploits?
Yes.
Yeah, I think so.
Well, physics is essentially... I think it's possible that physics has exploits, and we should be trying to find them: arranging some kind of a crazy quantum mechanical system that somehow gives you a buffer overflow, somehow gives you a rounding error in the floating point.
Yeah, that's right.
And more and more sophisticated exploits like that.
Those are jokes, but it could actually be very close.
Yeah.
We'll find some way to extract infinite energy.
For example, when you train reinforcement learning agents in physical simulations and you ask them to, say, run quickly on flat ground, they'll end up doing all kinds of weird things in part of that optimization, right?
They'll get on their back legs and they'll slide across the floor.
And it's because the optimization, the reinforcement learning optimization on that agent, has figured out a way to extract infinite energy from the friction forces of, basically, a poor implementation, and they found a way to generate infinite energy and just slide across the surface.
It's not what you expected; it's sort of like a perverse solution.
And so maybe we can find something like that.
Maybe we can be that little dog in this physical simulation that cracks or escapes the intended consequences of the physics that the universe came up with, and figures out some kind of shortcut to some weirdness.
Yeah.
But see, the problem with that weirdness is that once the first person discovers the weirdness, like sliding on the back legs, that's all we're going to do.
Yeah.
It very quickly becomes what everybody does.
So the paperclip maximizer is a ridiculous idea, but that very well could be it: we'll all just switch to that, because it's so fun.
Well, no person will discover it.
I think, by the way, it's going to have to be some kind of a super-intelligent AGI of, like, a third generation.
We're building the first-generation AGI, and then, you know, third generation.
Yeah.
So we're the bootloader for an AI, and that AI will be a bootloader for another AI.
Yeah.
And then there's no way for us to introspect what that might even be.
I think it's very likely that these things, say you have these AGIs, it's very likely that, for example, they will be completely inert.
I like those kinds of sci-fi books sometimes, where these things are just completely inert and don't interact with anything.
And I find that kind of beautiful, because they've probably figured out the meta-meta-game of the universe in some way.
Potentially they're doing something completely beyond our imagination, and they don't interact with simple chemical life forms.
Like, why would you do that?
So I find those kinds of ideas compelling.
What's their source of fun?
What are they doing?
What's the source of pleasure? Solving the puzzle of the universe?
So can you define what inert means?
They escape interaction?
As in, they will behave in some very strange way to us, because they're playing the meta-game.
And the meta-game is probably, say, arranging quantum mechanical systems in some very weird ways to extract infinite energy, solving the digit expansion of pi to whatever amount, building their own little fusion reactors or something crazy.
They're doing something beyond comprehension and not understandable to us, but actually brilliant under the hood.
What if quantum mechanics itself is the system, and we're just thinking it's physics?
But we're really parasites on, well, not parasites, we're not really hurting physics; we're just living on this organism, and we're trying to understand it, but really it is an organism, with a deep, deep intelligence.
Maybe physics itself is the organism that's doing the super interesting thing, and we're just one little thing sitting on top of it, trying to get energy from it.
We're just kind of like these particles in a wave that, I feel, is mostly deterministic, and takes the universe from some kind of a Big Bang to some kind of a superintelligent replicator, some kind of a stable point in the universe.
Given these laws of physics, do you think, as Einstein said, that God doesn't play dice?
Do you think it's mostly deterministic?
There's no randomness in the thing?
I think it's deterministic.
Oh, there's tons of... well, I want to be careful with randomness.
Pseudo-random?
Yeah.
I don't like random.
I think maybe the laws of physics are deterministic.
Yeah, I think they're deterministic.
You just got really uncomfortable with this question.
Do you have anxiety about whether the universe is random or not?
There's no randomness?
You said you like Good Will Hunting.
It's not your fault, Andrej.
It's not your fault, man.
So you don't like randomness?
Yeah, I think it's unsettling.
I think it's a deterministic system.
I think things that look random, like, say, the collapse of the wave function, et cetera, are actually deterministic: just entanglement and so on, and some kind of a multiverse theory, something, something.
Okay.
So why does it feel like we have free will?
If I raise this hand, I chose to do this now.
That doesn't feel like a deterministic thing; it feels like I'm making a choice.
Okay.
So it's all feelings.
It's just feelings.
Yeah.
So when an RL agent is making a choice, it's not really making a choice?
The choice was already there?
Yeah.
You're interpreting the choice, and you're creating a narrative for having made it.
Yeah.
And now we're talking about the narrative.
It's very meta.
Looking back, what is the most beautiful or surprising idea in deep learning or AI in general that you've come across?
You've seen this field explode and grow in interesting ways.
What cool ideas made you sit back and go, hmm, big or small?
Well, the one that I've been thinking about recently, the most probably, is the transformer architecture.
So basically, neural networks have had a lot of architectures that were trendy come and go for different sensory modalities, like for vision, audio, text: you would process them with different-looking neural nets.
And recently we've seen this convergence towards one architecture, the transformer.
You can feed it video, or images, or speech, or text, and it just gobbles it up, and it's kind of like a bit of a general-purpose computer that is also trainable and very efficient to run on our hardware.
And so this paper came out in 2017, I want to say: Attention Is All You Need.
Attention is all you need.
You criticized the paper title in retrospect, that it didn't foresee the bigness of the impact it was going to have.
Yeah.
I'm not sure if the authors were aware of the impact that that paper would go on to have; probably they weren't.
But I think they were aware of some of the motivations and design decisions behind the transformer, and they chose not to expand on it in that way in the paper.
And so I think they had an idea that there was more than just the surface of, oh, we're just doing translation and here's a better architecture.
You're not just doing translation.
This is like a really cool, differentiable, optimizable, efficient
computer that you've proposed.
And maybe they didn't have all of that foresight, but I think it's really
interesting.
Isn't it funny, sorry to interrupt, that the title is memeable?
They went for such a profound idea, and they went with a... I don't think anyone used that kind of title before, right?
Attention is all you need.
Yeah.
It's like a meme or something.
Yeah.
Isn't it funny that, like, maybe if it was a more serious title, it wouldn't have had the impact?
Honestly, yeah, there is an element of me that agrees with you and prefers it this way.
Yes.
If it was too grand, it would overpromise and then underdeliver, potentially.
So you want to just meme your way to greatness.
That should be a t-shirt.
So you tweeted: the transformer is a magnificent neural network architecture because it is a general-purpose differentiable computer.
It is simultaneously expressive in the forward pass, optimizable via backpropagation and gradient descent, and efficient, as a high-parallelism compute graph.
Can you discuss some of those details: expressive, optimizable, efficient?
From memory, or in general, whatever comes to your heart.
You want to have a general-purpose computer that you can train on arbitrary problems, like, say, the task of next word prediction, or detecting if there's a cat in an image or something like that.
And you want to train this computer, so you want to set its weights.
And I think there are a number of design criteria that sort of overlap in the transformer simultaneously that made it very successful.
And I think the authors were deliberately trying to make a really powerful architecture.
And so basically it's very powerful in the forward pass, because it's able to express very general computation as something that looks like message passing.
You have nodes, and they all store vectors, and these nodes get to basically look at each other's vectors, and they get to communicate.
Basically, nodes get to broadcast, hey, I'm looking for certain things, and then other nodes get to broadcast, hey, these are the things I have.
Those are the keys and the values.
So it's not just attention.
Yeah, exactly.
The transformer is much more than just the attention component.
It's got many architectural pieces that went into it: the residual connections, the way the weights are arranged, there's a multi-layer perceptron, the way the layers are stacked, and so on.
But basically there's a message passing scheme where nodes get to look at each other, decide what's interesting, and then update each other.
And so when you get into the details of it, I think it's a very expressive function, so it can express lots of different types of algorithms in the forward pass.
Not only that, but the way it's designed, with the residual connections, layer normalizations, the softmax attention, and everything, it's also optimizable.
This is a really big deal, because there are lots of computers that are powerful but that you can't optimize, or that are not easy to optimize using the techniques that we have, which is backpropagation and gradient descent.
These are first-order methods, very simple optimizers, really.
And so you also need it to be optimizable.
And then lastly, you want it to run efficiently on our hardware.
Our hardware is a massive-throughput machine, like GPUs: they prefer lots of parallelism.
So you don't want to do lots of sequential operations; you want to do a lot of operations in parallel, and the transformer is designed with that in mind as well.
And so it's designed for our hardware, and it's designed to both be very expressive in the forward pass, but also very optimizable in the backward pass.
And you said that the residual connections support a kind of ability to learn short algorithms fast and first, and then gradually extend them longer during training.
What's the idea of learning short algorithms?
Right.
Think of it this way: basically a transformer is a series of blocks, right?
And these blocks have attention and a little multi-layer perceptron.
So you go off into a block and you come back to the residual pathway, and then you go off and you come back, and then you have a number of layers arranged sequentially.
And so the way to look at it, I think, is that because of the residual pathway, in the backward pass the gradients sort of flow along it uninterrupted, because addition distributes the gradient equally to all of its branches.
So the gradient from the supervision at the top just flows directly to the first layer, and all the residual connections are arranged so that in the beginning, during initialization, they contribute nothing to the residual pathway.
So what it kind of looks like is, imagine the transformer is kind of like a Python function, like a def, and you get to do various kinds of lines of code.
Say you have a hundred-layer deep transformer; typically they would be much shorter, say 20.
So with 20 lines of code you can do something in them.
And so think of it during the optimization: basically what it looks like is, first you optimize the first line of code, and then the second line of code can kick in, and the third line of code can kick in.
And I kind of feel like, because of the residual pathway and the dynamics of the optimization, you can sort of learn a very short algorithm that gets the approximate answer, but then the other layers can sort of kick in and start to create a contribution.
And at the end of it, you're optimizing over an algorithm that is 20 lines of code.
Except these lines of code are very complex because it's an
entire block of a transformer.
You can do a lot in there.
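A sketch of the residual pathway being described, in the pre-norm style mentioned just below; `attn` and `mlp` here are stand-ins for the real sublayers.

```python
import numpy as np

def transformer_block(x, attn, mlp, norm):
    # Residual pathway: each sublayer only *adds* a contribution to x.
    # Addition distributes the gradient equally to both branches, so the
    # gradient from the loss flows straight down to the first layer.
    x = x + attn(norm(x))   # go off into the block, come back to the pathway
    x = x + mlp(norm(x))    # go off again, come back
    return x

# Demo with stand-in sublayers that contribute nothing, as at initialization:
zero = lambda x: np.zeros_like(x)
ident = lambda x: x
x = np.ones((4, 8))
print(np.allclose(transformer_block(x, zero, zero, ident), x))  # True
```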
What's really interesting is that this transformer architecture has actually been remarkably resilient: basically, the transformer that came out in 2017 is the transformer you would use today, except you reshuffle some of the layer norms.
The layer normalizations have been reshuffled to a pre-norm formulation.
And so it's been remarkably stable, but there are a lot of bells and whistles that people have attached to it to try to improve it.
I do think that basically it's a big step in simultaneously optimizing for lots of properties of a desirable neural network architecture, and I think people have been trying to change it, but it's proven remarkably resilient.
But I do think that there should be even better architectures, potentially.
But you admire the resilience here.
Yeah.
There's something profound about this architecture that leads to its resilience.
So maybe everything can be turned into a problem that transformers can solve.
Currently it definitely looks like the transformer is taking over AI, and you can feed basically arbitrary problems into it.
It's a general differentiable computer, and it's extremely powerful.
And this convergence in AI has been really interesting to watch, for me personally.
What else do you think could be discovered here about transformers?
What surprising thing, or is it at a stable place?
Is there something interesting we might discover about transformers, like aha moments, maybe having to do with memory, maybe knowledge representation, that kind of stuff?
Definitely. The zeitgeist today is just pushing, like, basically right now the zeitgeist is: do not touch the transformer, touch everything else.
Yes.
So people are scaling up the datasets, making them much, much bigger.
They're working on the evaluation, making the evaluation much, much bigger.
And they're basically keeping the architecture unchanged, and that's been the last five years of progress in AI, kind of.
What do you think about one flavor of it, which is language models?
Have you been surprised?
Has your sort of imagination been captivated by, you mentioned GPT, all the bigger and bigger language models?
And what do you think are the limits of those models, just for the task of natural language?
Basically, the way GPT is trained, right, is you just download a massive amount of text data from the internet, and you try to predict the next word in a sequence, roughly speaking.
You're predicting little word chunks, but roughly speaking, that's it.
And what's been really interesting to watch is that, basically, it's a language model, and language models have actually existed for a very long time; there are papers on language modeling from 2003 and even earlier.
Can you explain, in that case, what a language model is?
Yeah. So a language model, basically the rough idea, is just predicting the next word in a sequence, roughly speaking.
So there's a paper from, for example, Bengio and his team from 2003, where for the first time they were using a neural network to take, say, three or five words and predict the next word.
They were doing this on much smaller datasets, and the neural net was not a transformer, it was a multi-layer perceptron, but it was the first time a neural network was applied in that setting.
But even before neural networks there were language models, except they were using n-gram models.
N-gram models are just count-based models: if you take, say, two words and try to predict the third one, you just count up how many times you've seen each two-word combination and what came next, and what you predict as coming next is just what you've seen the most of in the training set.
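A minimal count-based model of the kind described; this is a bigram version, predicting the next word from one previous word, on a toy corpus.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat ran".split()

# Count, for each word, what tends to come next.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    # Predict whatever followed `word` most often in the training set.
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" (seen twice, vs "mat" once)
```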
And so language modeling has been around for a long time; neural networks have done language modeling for a long time.
So really, what's new and exciting is realizing that when you scale it up with a powerful enough neural net, a transformer, you get all these emergent properties.
Basically, what happens is, if you have a large enough dataset of text, then in the task of predicting the next word you are multitasking a huge amount of different kinds of problems.
You are multitasking understanding of, you know, chemistry, physics, human nature; lots of things are sort of clustered in that objective.
It's a very simple objective, but actually you have to understand a lot about the world to make that prediction.
You just said the U-word, understanding.
In terms of chemistry and physics and so on, what do you feel like it's doing?
Is it searching for the right context?
What is the actual process happening here?
Yeah.
So basically it gets a thousand words and it's trying to predict the thousand-and-first.
And in order to do that very, very well over the entire dataset available on the internet, you actually have to basically kind of understand the context of what's going on in there.
Yeah.
And it's a sufficiently hard problem that, if you have a powerful enough computer like a transformer, you end up with interesting solutions.
You can ask it to do all kinds of things, and it shows a lot of emergent properties, like in-context learning.
That was the big deal with GPT and the original paper when they published it: you can just sort of prompt it in various ways and ask it to do various things, and it will just kind of complete the sentence.
But in the process of just completing the sentence, it's actually solving all kinds of really interesting problems that we care about.
Do you think it's doing something like understanding, the way we use the word understanding for us humans?
I think it's doing some understanding.
In its weights, it understands, I think, a lot about the world, and it has to, in order to predict the next word in the sequence.
So it's trained on data from the internet.
What do you think about this approach, in terms of datasets, of using data from the internet?
Do you think the internet has enough structured data to teach AI about human civilization?
Yes.
So I think the internet has a huge amount of data; I'm not sure if it's a complete enough set.
I don't know that text is enough for having a sufficiently powerful AGI as an outcome.
Of course, there is audio and video and images and all that kind of stuff.
Yeah.
So text by itself, I'm a little bit suspicious about.
There's a ton of things we don't put in text, in writing, just because they're obvious to us about how the world works and the physics of it, and that things fall.
We don't put that stuff in text, because why would you? We share that understanding.
And so text is a communication medium between humans; it's not an all-encompassing medium of knowledge about the world.
But as you pointed out, we do have video, and we have images, and we have audio.
And so I think that definitely helps a lot, but we haven't trained models sufficiently across all of those modalities yet.
So I think that's what a lot of people are interested in.
But I wonder whether that shared understanding of what we might call common sense has to be learned, inferred, in order to complete the sentence correctly.
So maybe, because it's implied on the internet, the model is going to have to learn that, not by reading about it, but by inferring it in the representation.
Because common sense, I don't think we learn common sense; nobody tells us explicitly.
We just figure it all out by interacting with the world.
And here's a model reading about the way people interact with the world; it might have to infer that.
I wonder.
You briefly worked on a project called World of Bits, training an RL system to take actions on the internet, versus just consuming the internet, like we talked about.
Do you think there's a future for that kind of system, interacting with the internet to help the learning?
Yes.
I think that's probably the final frontier for a lot of these models, because, as you mentioned, when I was at OpenAI I was working on this project for a little bit.
Basically, it was the idea of giving neural networks access to a keyboard and a mouse, and the idea is, like, what could possibly go wrong?
So basically you perceive the input of the screen pixels; basically the state of the computer is sort of visualized for human consumption, in images of the web browser and stuff like that.
And then you give the neural network the ability to press keys and use the mouse, and we're trying to get it to, for example, complete bookings and interact with user interfaces.
What did you learn from that experience? What was some fun stuff?
This is a super cool idea.
Yeah.
I mean, the step from observer to actor is a super fascinating step.
Yeah.
Well, it's the universal interface in the digital realm, I would say.
And there's a universal interface in the physical realm, which in my mind is the humanoid form factor kind of thing.
We can later talk about Optimus and so on, but I feel like there's a kind of similar philosophy in some way, where the physical world is designed for the human form, and the digital world is designed for the human form of seeing the screen and using keyboard and mouse.
And so it's the universal interface that can basically command the digital infrastructure we've built up for ourselves, so it feels like a very powerful interface to command and to build on top of.
Now, to your question as to what I learned from that: it's interesting, because World of Bits was basically too early, I think, at OpenAI at the time.
This was around 2015 or so, and the zeitgeist at that time was very different in AI from the zeitgeist today.
At the time, everyone was super excited about reinforcement learning from scratch.
This was the time of the Atari paper, where neural networks were playing Atari games and beating humans in some cases, and AlphaGo and so on.
So everyone was very excited about training neural networks from scratch using reinforcement learning directly.
It turns out that reinforcement learning is an extremely inefficient way of training neural networks, because you're taking all these actions and all these observations, and you get some sparse rewards once in a while.
So you do all this stuff based on all these inputs, and once in a while you're told, you did a good thing, you did a bad thing.
And it's just an extremely hard problem; you can't learn from that.
You can burn a forest and you can sort of brute-force through it, and we saw that, I think, with Go and Dota and so on, and it does work.
But it's extremely inefficient, and not how you want to approach problems, practically speaking.
And so that's the approach that, at the time, we also took to World of Bits.
We would have an agent initialized randomly, so it would keyboard-mash and mouse-mash and try to make a booking.
And it just revealed the insanity of that approach very quickly: you have to stumble onto the correct booking in order to get the reward of having done it correctly, and you're never going to stumble onto it by chance at random.
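A toy illustration of that sparse-reward problem, with illustrative numbers: a randomly initialized agent must hit one specific action at every step to ever see a reward.

```python
import random

# Toy sparse-reward task: reward 1 only if the agent happens to pick the
# single correct action at each of `steps` steps, 0 otherwise.
def random_episode(steps=10, n_actions=20):
    return all(random.randrange(n_actions) == 0 for _ in range(steps))

episodes = 100_000
hits = sum(random_episode() for _ in range(episodes))
# Expected success rate is (1/20)**10, about 1e-13: effectively never,
# so the agent gets no learning signal at all.
print(f"{hits} rewards in {episodes} episodes")
```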
So even with a simple web interface, there are too many options?
There are just too many options, and it's too sparse of a reward signal, and you're starting from scratch at the time.
And so you don't know how to read; you don't understand pictures, images, buttons.
You don't understand what it means to make a booking.
But now what's happened is, it is time to revisit that, and OpenAI is interested in this; companies like Adept are interested in this, and so on.
And the idea is coming back, because the interface is very powerful, but now you're not training an agent from scratch: you are taking GPT as an initialization.
GPT is pre-trained on all of text, and it understands what's a booking, it understands what's a submit, it understands quite a bit more.
And so it already has those representations; they are very powerful, and that makes all the training significantly more efficient and makes the problem tractable.
Should the interaction be the way humans see it, with the buttons and the language, or should it be with the HTML, JavaScript, and CSS?
Which do you think is better?
So today, all of this interaction is mostly on the level of HTML, CSS, and so on; that's done because of computational constraints.
But I think ultimately everything is designed for human visual consumption, and so at the end of the day there's all this additional information in the layout of the webpage, what's next to what, what's a red background, all this kind of stuff, and what it looks like visually.
So I think that's the final frontier: taking in pixels and giving out keyboard and mouse commands.
But I think it's impractical still, today.
Do you worry about bots on the internet, given these ideas, given how exciting they are?
Do you worry about bots on Twitter?
Not the stupid bots that we see now, the crypto bots, but bots that might be out there that we don't see, that are interacting in interesting ways.
This kind of system feels like it should be able to pass the I'm-not-a-robot click button, whatever.
By the way, do you understand how that test works? I don't quite.
There's a checkbox or whatever that you click; it's presumably tracking mouse movement, the timing, and so on.
So exactly the kind of system we're talking about should be able to pass that.
So what do you feel about bots that are language models, plus have some ability to interact and are able to tweet and reply and so on?
Do you worry about that world?
Yeah, I think it's always been a bit of an arms race between the attack and the defense.
The attack will get stronger, but the defense will get stronger as well: our ability to detect that.
How do you defend, how do you detect, how do you know that your Karpathy account on Twitter is human?
How would you approach that?
If people claimed it wasn't, how would you defend yourself in a court of law, that I'm a human, this account is human?
At some point, I think society will evolve a little bit.
We might start digitally signing some of our correspondence, or things that we create.
Right now it's not necessary, but maybe in the future it might be.
I do think that we are going towards a world where we share the digital space with AIs.
Synthetic beings.
Yeah.
And they will get much better, and they will share our digital realm, and eventually they'll share our physical realm as well; that's much harder.
But that's kind of like the world we're going towards, and most of them will be benign, and some of them will be malicious, and it's going to be an arms race trying to detect them.
So, I mean, the worst isn't the AIs.
The worst is the AIs pretending to be human.
I don't know if it's always malicious.
There's obviously a lot of malicious applications, but it could also be, you
know, if I was an AI, I would try very hard to pretend to be human because we're
in a human world.
I wouldn't get any respect as an AI.
I want to get some love and respect.
I don't think the problem is intractable. People are thinking about proof of personhood. We might start digitally signing our stuff, and we might all end up having basically some solution for proof of personhood. It doesn't seem to me intractable. It's just something that we haven't had to do until now. But I think once the need really starts to emerge, which is soon, people will think about it much more.
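As a sketch of what "digitally signing our correspondence" could look like mechanically, here is generic Ed25519 public-key signing in Python; this is standard cryptography, not any specific proof-of-personhood scheme.

```python
# A sketch of digitally signing a post with Ed25519 (generic public-key
# signing, not any particular proof-of-personhood proposal).
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()   # held by the human
public_key = private_key.public_key()        # published, tied to an identity

message = b"This post was really written by me."
signature = private_key.sign(message)

# Anyone can check; verify() raises InvalidSignature on a forgery.
public_key.verify(signature, message)
```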
But that too will be a race, because obviously you can probably spoof or fake the proof of personhood. So you have to try to figure out how to... I mean, it's weird that we have social security numbers and passports and stuff. It seems like it's harder to fake stuff in the physical space. In the digital space, it just feels like it's going to be very tricky to outpace, because it seems to be pretty low cost to fake stuff. What are you going to do, put an AI in jail for trying to use a fake personhood proof?
You can, I mean, okay, fine, you'll put a lot of AIs in jail, but there'll be more, like exponentially more. The cost of creating a bot is very low. Unless there's some kind of way to track accurately, like you're not allowed to create any program without tying yourself to that program. Any program that runs on the internet, you'll be able to trace every single human involved with that program.
Yeah, maybe we have to start drawing those boundaries and keeping track of, okay, what are digital entities versus human entities, and what is the ownership of human entities and digital entities, something like that. I don't know, but I think I'm optimistic that this is possible. In some sense, we're currently in the worst time of it, because all these bots suddenly have become very capable, but we don't have the fences yet built up as a society. But that doesn't seem to me intractable. It's just something that we have to deal with.
It seems weird that really crappy Twitter bots are so numerous. I presume that the engineers at Twitter are very good, so what I would infer from that is that it seems like a hard problem. They're probably catching them, right? If I were to steelman the case: it's a hard problem, and there's a huge cost to false positives, to removing a post by somebody that's not a bot, which creates a very bad user experience. So they're very cautious about removing. And maybe the bots are really good at learning what gets removed and what doesn't, such that they can stay ahead of the removal process very quickly.
My impression of it, honestly, is there's a lot of low-hanging fruit. I mean, it's not subtle. That's my impression of it.
It's not subtle, but yeah, that's my impression as well. But it feels like maybe you're seeing the tip of the iceberg. Maybe the number of bots is in the trillions, and it's a constant assault of bots. I don't know, you have to steelman the case, because the bots I'm seeing are pretty obvious. I could write a few lines of code that catch these bots.
I mean, definitely there's a lot of low-hanging fruit, but I will say, I agree that if you are a sophisticated actor, you could probably create a pretty good bot right now, using tools like GPTs. It's a language model, you can generate faces that look quite good now, and you can do this at scale. So I think, yeah, it's quite plausible and it's going to be hard to defend.
There was a Google engineer who claimed that LaMDA was sentient. Do you think there's any inkling of truth to what he felt? And more importantly, to me at least, do you think language models will achieve sentience, or the illusion of sentience, soonish?
Yeah, to me it's a little bit of a canary in a coal mine kind of moment, honestly, because this engineer spoke to a chatbot at Google and became convinced that the bot is sentient. He asked it some existential, philosophical questions, and it gave reasonable answers and looked real, and so on. To me, he wasn't sufficiently trying to stress the system, I think, and exposing the truth of it as it is today. But I think this will be increasingly harder over time. So yeah, I think there'll be more and more people like that over time.
As this gets better, like forming an emotional connection to an AI?
Plausible, in my mind. I think these AIs are actually quite good at human connection, human emotion. A ton of text on the internet is about humans and connection and love and so on. So I think they have a very good understanding, in some sense, of how people speak to each other about this, and they're very capable of creating a lot of that kind of text.
There's a lot of sci-fi from the fifties and sixties that imagined AIs in a very different way. They were calculating, cold, Vulcan-like machines. That's not what we're getting today. We're getting pretty emotional AIs that are actually very competent and capable of generating plausible-sounding text with respect to all of these topics.
See, I'm really hopeful about AI systems that are like companions that help you grow, develop as a human being, help you maximize long-term happiness. But I'm also very worried about AI systems that figure out from the internet that humans get attracted to drama. And so these would just be shit-talking AIs. They'd constantly go, "Did you hear this?" They'll do gossip. They'll try to plant seeds of suspicion about other humans that you love and trust, and just kind of mess with people, because that's going to get a lot of attention. Drama: maximize drama on the path to maximizing engagement, and us humans will feed into that machine, and it'll be a giant drama shitstorm. So I'm worried about that. So the objective function really defines the way that human civilization progresses with AIs in it.
I think right now, at least today, it's not correct to really think of them as goal-seeking agents that want to do something. They have no long-term memory or anything. A good approximation of it is: you get a thousand words and you're trying to predict the thousand-and-first, and then you continue feeding it in, and you are free to prompt it in whatever way you want, in text. So you say, okay, you are a psychologist, and you are very good, and you love humans, and here's a conversation between you and another human: "Human colon something, You colon something," and then it just continues the pattern. And suddenly you're having a conversation with a fake psychologist who's trying to help you. So it's still kind of in the realm of a tool: people can prompt it in arbitrary ways, and it can create really incredible text, but it doesn't have long-term goals over long periods of time. It doesn't look that way right now.
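A minimal sketch of the persona-prompt pattern just described, with a small public model standing in for a much larger one: a persona prompt plus Human/Psychologist turns, continued purely by next-word prediction.

```python
# The model has no goals; it just continues the text pattern it is given.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = (
    "You are a psychologist. You are very good and you love humans.\n"
    "Human: I've been feeling anxious lately.\n"
    "Psychologist:"
)
print(generator(prompt, max_new_tokens=40)[0]["generated_text"])
```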
Yeah, but you can give it short-term goals that have long-term effects. So if my prompting short-term goal is to get Andrej Karpathy to respond to me on Twitter, the AI might figure out that talking shit to you would be the best way to do that, in a highly sophisticated, interesting way. And then you build up a relationship when you respond once, and then over time it gets to not be sophisticated and just talks shit. And okay, maybe you won't get to Andrej, but it might get to another celebrity, it might get to other big accounts, and then, with just that simple goal, get them to respond, maximize the probability of actual response.
Yeah. I mean, you could prompt a powerful model like this with its opinion about how to do any possible thing you're interested in. They're kind of on track to become these oracles. You could sort of think of it that way. They are oracles. Currently it's just text, but they will have calculators, they will have access to Google search, they will have all kinds of gadgets and gizmos. They will be able to operate the internet and find different information. And in some sense, that's kind of what it currently looks like in terms of the development.
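To make the "oracle with gadgets" idea concrete, here is a minimal sketch of one common tool-use pattern, not any specific product: the model emits a tool request as text, a wrapper executes it, and the observation is fed back into the prompt. The CALC dispatch format and the stand-in model are assumptions for illustration.

```python
# A toy tool-use loop; call_model is a stand-in for a real language model.
def call_model(prompt: str) -> str:
    # Pretend the model decided it needs arithmetic for this prompt.
    return "CALC: 17 * 23" if "Observation" not in prompt else "The answer is 391."

def run_tool(request: str) -> str:
    if request.startswith("CALC: "):
        return str(eval(request[6:], {"__builtins__": {}}))  # toy calculator only
    return "unknown tool"

prompt = "What is 17 * 23? Use a tool if needed."
request = call_model(prompt)                      # model emits a tool request
observation = run_tool(request)                   # wrapper executes the tool
final = call_model(prompt + "\nObservation: " + observation)
print(final)
```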
Do you think it'll be an improvement eventually over what Google is for access
to human knowledge?
Like it'll be a more effective search engine to access human knowledge.
I think there's definite scope in building a better search engine today.
And I think Google, they have all the tools, all the people, they have
everything they need, they have all the puzzle pieces, they have people training
transformers at scale, they have all the data.
It's just not obvious if they are capable as an organization to innovate on their
search engine right now.
And if they don't, someone else will.
There's absolute scope for building a significantly better search engine
built on these tools.
It's so interesting. A large company, where there's already a search infrastructure, it works, it brings in a lot of money. So where, structurally inside the company, is the motivation to pivot? To say, we're going to build a new search engine?
Yeah, that's hard.
So it's usually going to come from a startup, right?
That would be, yeah. Or some other more competent organization. So I don't know. Currently, for example, maybe Bing has another shot at it, as an example.
Microsoft Edge, as we were talking offline.
I mean, it definitely is really interesting, because search engines used to be about: okay, here's some query, here are web pages that look like the stuff that you have. But you could just directly go to the answer, and then have supporting evidence. These models, basically, they've read all the text and they've read all the web pages. And so sometimes when you see yourself going over the search results and sort of getting a sense of the average answer to whatever you're interested in, that just directly comes out. You don't have to do that work. So they have a way of distilling all that knowledge into some level of insight, basically.
Do you think of prompting as a kind of teaching and learning, like this whole process, like another layer? Because maybe that's what humans are. We already have that background model, and the world is prompting you.
Yeah, exactly. I think the way we are programming these computers now, like GPTs, is converging to how you program humans. I mean, how do I program humans via prompt? I go to people and I prompt them to do things. I prompt them for information. And so natural language prompt is how we program humans, and we're starting to program computers directly in that interface. It's pretty remarkable, honestly.
So you've spoken a lot about the idea of software 2.0. All good ideas become cliches so quickly, like the terms. It's kind of hilarious. It's like, I think Eminem once said that if he gets annoyed by a song he's written very quickly, that means it's going to be a big hit, because it's too catchy. But can you describe this idea, and how your thinking about it has evolved over the months and years since you coined it?
Yeah. Yes, I had a blog post on software 2.0, I think several years ago now. And the reason I wrote that post is because I kind of saw something remarkable happening in software development: a lot of code was being transitioned to be written not in something like C++ and so on, but in the weights of a neural net. Basically just saying that neural nets are taking over software, the realm of software, and taking on more and more tasks. And at the time, I think not many people understood this deeply enough, that this is a big deal. It's a big transition.
Neural networks were seen as one of multiple classification algorithms you might use for your data set problem on Kaggle. But this is not that. This is a change in how we program computers. And I saw neural nets as: this is going to take over. The way we program computers is going to change.
It's not going to be people writing software in C++ or something like that and directly
programming the software. It's going to be accumulating training sets and data sets and
crafting these objectives by which you train these neural nets.
And at some point, there's going to be a compilation process from the data sets and the
objective and the architecture specification into the binary, which is really just the
neural net weights and the forward pass of the neural net.
And then you can deploy that binary.
And so I was talking about that sort of transition and that's what the post is about.
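As a toy illustration of that "compilation" analogy (my sketch, not anything from the original post): the dataset, the objective, and the architecture specification go in, and the "binary" that comes out is just the trained weights plus the forward pass.

```python
# A toy software 2.0 "compilation"; all data here is random, purely for
# illustration.
import torch
import torch.nn as nn

# Architecture specification: a rough hint of what the algorithm looks like.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))

# The data set and objective, not hand-written logic, define the program.
X, y = torch.randn(256, 4), torch.randint(0, 2, (256,))
objective = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# "Compilation": optimization fills in the blanks (the knobs).
for _ in range(100):
    optimizer.zero_grad()
    objective(model(X), y).backward()
    optimizer.step()

torch.save(model.state_dict(), "binary.pt")  # the "binary" you deploy
```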
And I saw this sort of play out in a lot of fields, autopilot being one of them, but
also just simple image classification.
People thought originally, you know, in the 80s and so on that they would write the
algorithm for detecting a dog in an image.
And they had all these ideas about how the brain does it.
And first we detect corners and then we detect lines and then we stitch them up.
And they were like really going at it.
They were like thinking about how they're going to write the algorithm.
And this is not the way you build it.
And there was a smooth transition where, OK, first we thought we were going to build
everything. Then we were building the features.
So, like, HOG features and things like that, that detect these little statistical patterns from image patches. And then there was a little bit of learning on top of it, like a support vector machine, a binary classifier for cat versus dog in images, on top of the features. So we wrote the features, but we trained the last layer, sort of the classifier. And then people were like, actually, let's not even design the features, because we can't. Honestly, we're not very good at it.
So let's also learn the features.
And then you end up with basically a convolutional neural net where you're learning
most of it. You're just specifying the architecture and the architecture has tons of
fill in the blanks, which is all the knobs, and you let the optimization write most of
it. And so this transition is happening across the industry everywhere.
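A small sketch of the intermediate stage just described, using stand-in data: hand-written HOG features (the software 1.0 part) with only the final classifier learned. A convnet then replaces the hand-written half too. The arrays and labels here are random placeholders.

```python
# Hand-designed features, learned last layer: the pre-convnet pipeline.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

images = np.random.rand(20, 64, 64)      # stand-in grayscale patches
labels = np.random.randint(0, 2, 20)     # 0 = cat, 1 = dog (toy labels)

# Software 1.0 part: humans wrote the feature extractor.
features = np.array([hog(img, pixels_per_cell=(8, 8)) for img in images])

# Software 2.0 part (last layer only): the classifier is learned.
clf = LinearSVC().fit(features, labels)
```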
And suddenly we end up with a ton of code that is written in neural net weights.
And I was just pointing out that the analogy is actually pretty strong.
And we have a lot of developer environments for software 1.0, like we have IDEs, how
you work with code, how you debug code, how you run code, how do you maintain code?
We have GitHub. So I was trying to make those analogies in the new realm.
Like, what is the GitHub of software 2.0?
Turns out it's something that looks like hugging face right now.
You know, and so I think some people took it seriously and built cool companies.
And many people originally attacked the post.
It actually was not well received when I wrote it.
And I think maybe it has something to do with the title, but the post was not well
received. And I think more people sort of have been coming around to it over time.
Yeah. So you were the director of AI at Tesla, where I think this idea was really implemented at scale, which is how you have engineering teams doing software 2.0. So can you linger on that idea? I think we're in the really early stages of everything you just said, the GitHub, the IDEs. How do we build engineering teams that work in software 2.0 systems, and the data collection and the data annotation, which is all part of that software 2.0? What do you think is the task of programming in software 2.0? Is it debugging in the space of hyperparameters, or is it also debugging in the space of data?
Yeah. The way by which you program the computer and influence its algorithm is
not by writing the commands yourself.
You're changing mostly the data set.
You're changing the loss functions of like what the neural net is trying to do, how
it's trying to predict things. But basically the data sets and the architecture of
the neural net. And so in the case of the autopilot, a lot of the data sets have to
do with, for example, detection of objects and lane line markings and traffic lights
and so on. So you accumulate massive data sets of: here's an example, here's the desired label, and then here's roughly what the algorithm should look like, and that's a convolutional neural net.
So the specification of the architecture is like a hint as to what the algorithm
should roughly look like. And then the fill in the blanks process of optimization is
the training process. And then you take your neural net that was trained, it gives
all the right answers on your data set and you deploy it.
So in that case, and perhaps in all machine learning cases, there are a lot of tasks. So is formulating a task, like for a multi-headed neural network, part of the programming?
Yeah, very much so. How you break down a problem into a set of tasks.
Yeah.
On a high level, I would say, if you look at
the software running in the autopilot, I gave a number of talks on this topic. I
would say originally a lot of it was written in software 1.0. Imagine lots of C++, right? And then gradually there was a tiny neural net that was, for example, predicting, given a single image, is there a traffic light or not? Or is there a lane line marking or not?
And this neural net didn't have too much to do in the scope of the software. It was making tiny
predictions on an individual little image. And then the rest of the system stitched them up. So, okay,
we're actually, we don't have just a single camera, we have eight cameras. We actually have eight
cameras over time. And so what do you do with these predictions? How do you put them together?
How do you do the fusion of all that information? And how do you act on it? All of that was written
by humans in C++. And then we decided, okay, we don't actually want to do all of that fusion
in C++ code because we're actually not good enough to write that algorithm. We want the neural nets
to write the algorithm and we want to port all of that software into the 2.0 stack. And so then we
actually had neural nets that now take all the eight camera images simultaneously and make
predictions for all of that. And actually they don't make predictions in the space of images,
they now make predictions directly in 3D, in the three dimensions around the car. And now actually we don't manually fuse the predictions in 3D over time. We don't trust
ourselves to write that tracker. So actually we give the neural net the information over time.
So it takes these videos now and makes those predictions. And so you're sort of just like
putting more and more power into the neural net, more processing. And at the end of it, the
eventual goal is to have most of the software potentially be in the 2.0 land because it works
significantly better. Humans are just not very good at writing software basically.
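A hedged sketch of that progression, and emphatically not Tesla's actual architecture: a shared per-camera backbone, features fused across all eight cameras, and heads that predict directly in 3D rather than per-image. The shapes and head meanings below are made up for illustration.

```python
# Per-camera backbone -> fuse across cameras -> multi-headed 3D predictions.
import torch
import torch.nn as nn

class MultiCamNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, kernel_size=3, stride=2)  # shared per-camera
        self.fuse = nn.Linear(8 * 8 * 31 * 31, 128)               # fuse all 8 cameras
        self.head_objects = nn.Linear(128, 7)   # e.g. one 3D box: x,y,z,w,h,l,yaw
        self.head_lanes = nn.Linear(128, 16)    # e.g. lane geometry parameters

    def forward(self, cams):                    # cams: (batch, 8, 3, 64, 64)
        b = cams.shape[0]
        feats = self.backbone(cams.flatten(0, 1))   # run every camera image
        fused = self.fuse(feats.reshape(b, -1))     # fuse across cameras
        return self.head_objects(fused), self.head_lanes(fused)

boxes, lanes = MultiCamNet()(torch.randn(2, 8, 3, 64, 64))
```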
So the prediction is happening in this 4D land with three dimensional world over time. How do you
do annotation in that world? So data annotation, whether it's self-supervised or manual by humans
is a big part of the software 2.0 world. Right. I would say by far in the industry,
if you're talking about the industry and what is the technology of what we have available,
everything is supervised learning. So you need data sets of input, desired output,
and you need lots of it. And there are three properties of it that you need. You need it to
be very large, you need it to be accurate, no mistakes, and you need it to be diverse.
You don't want to just have a lot of correct examples of one thing. You need to really cover
the space of possibility as much as you can. And the more you can cover the space of possible inputs,
the better the algorithm will work at the end. Now, once you have really good data sets that you're
collecting, curating, and cleaning, you can train your neural net on top of that. So a lot of the
work goes into cleaning those data sets. Now, as you pointed out, the question is: how do you achieve a ton of that? If you want to basically predict in 3D, you need data in 3D to back that up. So we have eight videos coming from all the cameras of the system, and this is what they saw, and this is the truth of what actually was around. There was this car,
there was this car, this car. These are the lane line markings. This is the geometry of the road.
There was traffic light in this three-dimensional position. You need the ground truth. And so the
big question that the team was solving, of course, is how do you arrive at that ground truth? Because
once you have a million of it, and it's large, clean, and diverse, then training a neural net
on it works extremely well. And you can ship that into the car. And so there's many mechanisms by
which we collected that training data. You can always go for human annotation. You can go for
simulation as a source of ground truth. You can also go for what we call the offline tracker
that we've spoken about at the AI day and so on, which is basically an automatic reconstruction
process for taking those videos and recovering the three-dimensional reality of what was around
that car. So basically think of doing a three-dimensional reconstruction as an
offline thing, and then understanding that, okay, there's 10 seconds of video. This is what we saw.
And therefore, here's all the lane lines, cars, and so on. And then once you have that annotation,
you can train your neural net to imitate it. And how difficult is the 3D reconstruction?
It's difficult, but it can be done. So there's overlap between the cameras
and you do the reconstruction. And perhaps if there's any inaccuracy, that's caught in the annotation step?
Yes. The nice thing about the annotation is that it is
fully offline. You have infinite time. You have a chunk of one minute and you're trying to just
offline in a supercomputer somewhere, figure out where were the positions of all the cars,
all the people, and you have your full one minute of video from all the angles.
And you can run all the neural nets you want, and they can be very efficient, massive neural nets. There can be neural nets that can't even run in the car later at test time, so they can be even more powerful than what you can eventually deploy. So you can do anything you want, three-dimensional reconstruction, neural nets, anything you want, just to recover that truth, and then you supervise on that truth.
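A toy sketch of that offline-labeling idea under stated assumptions: with the whole clip available and no latency constraint, arbitrarily heavy models recover the "truth," which then supervises the network that ships in the car. Every function and value below is a stand-in, not Tesla's actual offline tracker.

```python
# Offline auto-labeling: big models, infinite time, recovered "truth".
def run_big_offline_models(clip):
    # Stand-in for 3D reconstruction plus heavyweight neural nets that
    # could never run in the car at test time.
    return {"cars_3d": [(1.0, 2.0, 0.0)], "lane_lines": ["left", "right"]}

def auto_label(clip):
    truth = run_big_offline_models(clip)     # recovered offline, not in-car
    # Every frame in the clip gets supervised against the recovered truth.
    return [(frame, truth) for frame in clip]

clip = ["frame_%d" % i for i in range(10)]   # stand-in for ~1 min of video
training_examples = auto_label(clip)
```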
What have you learned about humans doing annotation? You said no mistakes, but I assume humans have a range of things they're good at, in terms of clicking stuff on screen. How interesting is the problem of designing an annotator, where humans are accurate, enjoy it, are efficient and productive, all that kind of stuff? What are even the metrics?
Yeah. So I grew the annotation team at
Tesla from basically zero to a thousand while I was there. That was really interesting. My background
is as a PhD student, a researcher, so growing that kind of an organization was pretty crazy.
But yeah, I think it's extremely interesting and part of the design process very much behind the
autopilot as to where you use humans. Humans are very good at certain kinds of annotations.
They're very good, for example, at two-dimensional annotations of images. They're not good at
annotating cars over time in three-dimensional space, very, very hard. And so that's why we're
very careful to design the tasks that are easy to do for humans versus things that should be left to
the offline tracker. Like maybe the computer will do all the triangulation and 3D reconstruction,
but the human will say exactly these pixels of the image are a car, exactly these pixels are human.
And so co-designing the data annotation pipeline was very much
bread and butter, was what I was doing daily. Do you think there's still a lot of open problems
in that space? Just in general, annotation where the stuff the machines are good at,
machines do and the humans do what they're good at, and there's maybe some iterative process.
Right. I think to a very large extent, we went through a number of iterations and we learned a
ton about how to create these data sets. I'm not seeing big open problems. Originally when I joined,
I was really not sure how this would turn out. But by the time I left, I was much more secure and
understood the philosophy of how to create these data sets. And I was pretty comfortable with
where that was at the time. So what are strengths and limitations of cameras for the driving task
in your understanding when you formulate the driving task as a vision task with eight cameras?
You've seen most of the history of the computer vision field, when it has to do with neural networks. If you step back, what are the strengths and limitations of pixels, of using pixels to drive?
Yeah. Pixels, I think, are a beautiful sensor, a beautiful sensor, I would say. The thing is, cameras are very, very cheap and they provide a ton of information, a ton of bits. It's an extremely cheap sensor for a ton of bits. And each one of
these bits is a constraint on the state of the world. And so you get lots of megapixel images,
very cheap. And it just gives you all these constraints for understanding what's actually
out there in the world. So vision is probably the highest bandwidth sensor. It's a very high
bandwidth sensor. I love that, pixels as a constraint on the world. It's this highly complex,
high bandwidth constraint on the state of the world. And it's not just that; again, there's the real importance of it being the sensor that humans use. Therefore, everything is designed for that
sensor. The text, the writing, the flashing signs, everything is designed for vision. And so
you just find it everywhere. And so that's why that is the interface you want to be in,
talking again about these universal interfaces. And that's where we actually want to measure the
world as well and then develop software for that sensor. But there's other constraints on the state
of the world that humans use to understand the world. I mean, vision ultimately is the main one,
but we're referencing our understanding of human behavior and some common sense physics
that could be inferred from vision from a perception perspective. But it feels like
we're using some kind of reasoning to predict the world, not just the pixels.
I mean, you have a powerful prior for how the world evolves over time, et cetera. So it's
not just about the likelihood term coming up from the data itself telling you about what you are
observing, but also the prior term of where are the likely things to see and how do they likely
move and so on. And the question is how complex is the range of possibilities that might happen
in the driving task? Is it still an open problem to you, how difficult driving is, philosophically speaking? After all the time you worked on driving, do you understand how
hard driving is? Yeah, driving is really hard because it has to do with the predictions of
all these other agents and the theory of mind and what they're going to do and are they looking
at you? Where are they looking? What are they thinking? There's a lot that goes on there at the full tail of the expansion of the nines that we have to be comfortable with eventually. The final problems are of that form. I don't think those problems are very common. I think eventually they're important, but it's really in the tail end.
In the tail end, the rare edge cases. From the vision perspective, what are the toughest parts
of the vision problem of driving? Well, basically the sensor is extremely powerful,
but you still need to process that information. And so going from brightnesses of these pixel values to, hey, here's the three-dimensional world, is extremely hard. And that's what the
neural networks are fundamentally doing. And so the difficulty really is in just doing an extremely
good job of engineering the entire pipeline, the entire data engine, having the capacity to train
these neural nets, having the ability to evaluate the system and iterate on it. So I would say just
doing this in production at scale is like the hard part. It's an execution problem.
So the data engine, but also the deployment of the system such that it has low latency performance.
So it has to do all these steps. Yeah, for the neural net specifically,
just making sure everything fits into the chip on the car. And you have a finite budget of flops
that you can perform and memory bandwidth and other constraints. And you have to make sure it
flies, and you can squeeze in as much compute as you can into that tiny chip.
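As a toy worked example of that FLOPs-budget constraint (all numbers invented, not the actual chip's budget): estimate the cost of a single conv layer and compare it against a hypothetical per-frame budget.

```python
# A toy FLOPs-budget check; every number here is made up for illustration.
def conv_flops(c_in, c_out, k, h_out, w_out):
    # ~2 FLOPs (one multiply, one add) per weight per output position.
    return 2 * c_in * k * k * c_out * h_out * w_out

budget = 50e9  # hypothetical FLOPs available per frame on the car's chip
layer = conv_flops(c_in=64, c_out=128, k=3, h_out=160, w_out=120)
print(f"{layer / 1e9:.2f} GFLOPs, {layer / budget:.1%} of the frame budget")
```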
What have you learned from that process? Because maybe that's one of the bigger, like, new things coming from a research
background where there's a system that has to run under heavily constrained resources,
has to run really fast. What kind of insights have you learned from that?
Yeah, I'm not sure if there's too many insights. You're trying to create a neural net that will
fit in what you have available and you're always trying to optimize it. And we talked a lot about
it on the AI day and basically the triple backflips that the team is doing to make sure it all fits
and utilizes the engine. So I think it's extremely good engineering. And then there's all kinds of
little insights peppered in on how to do it properly.
Let's actually zoom out, because I don't think we talked about the data engine, the entirety of the layout of this idea that I think is just beautiful, with humans in the loop. Can you describe the data engine?
Yeah, the data engine is what I call the almost biological-feeling process by which you perfect the training sets for these neural networks, because most of the programming now is at the level of these data sets: making sure they're large, diverse, and clean. Basically, you have a data set that you think is
good. You train your neural net, you deploy it, and then you observe how well it's performing.
And you're trying to always increase the quality of your data set. So you're trying to catch
scenarios that are basically rare. And it is in these scenarios that the neural nets
will typically struggle in because they weren't told what to do in those rare cases in the data
set. But now you can close the loop because if you can now collect all those at scale, you can then
feed them back into the reconstruction process I described and reconstruct the truth in those cases
and add it to the data set. And so the whole thing ends up being like a staircase of improvement
of perfecting your training set. And you have to go through deployments so that you can mine
the parts that are not yet represented well in the data set. So your data set is basically imperfect.
It needs to be diverse. It has pockets that are missing, and you need to pad out those pockets. You can sort of think of it that way.
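A pseudocode sketch of the data engine loop as described, with toy stand-ins rather than any real Tesla system: train, deploy, mine the rare scenarios the model struggles on, label them offline, and fold them back into the training set.

```python
# The "staircase of improvement": each pass perfects the training set.
import random

def train(dataset):
    return {"trained_on": len(dataset)}          # stand-in "model"

def mine_rare_cases(stream):
    # Trigger logic: keep scenarios the deployed model is uncertain about.
    return [x for x in stream if x["uncertainty"] > 0.99]

def offline_label(cases):
    return [{**c, "label": "recovered truth"} for c in cases]

dataset = [{"uncertainty": 0.1, "label": "ok"}]
for step in range(3):
    model = train(dataset)
    stream = [{"uncertainty": random.random()} for _ in range(10000)]
    dataset += offline_label(mine_rare_cases(stream))  # pad out the pockets
    print(step, "dataset size:", len(dataset))
```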
What role do humans play in this? So what's this biological system, like how human bodies are made up of cells? How do you optimize the
human system? The multiple engineers collaborating, figuring out what to focus on, what to contribute,
which task to optimize in this neural network. Who is in charge of figuring out which task needs
more data? Can you speak to the hyperparameters of the human system? It really just comes down
to extremely good execution from an engineering team who knows what they're doing. They understand
intuitively the philosophical insights underlying the data engine and the process by which the
system improves and how to again, delegate the strategy of the data collection and how that
works and then just making sure it's all extremely well executed. And that's where most of the work
is not even the philosophizing or the research or the ideas of it. It's just extremely good
execution. It's so hard when you're dealing with data at that scale. So your role in the data engine
executing well on it is difficult and extremely important. Is there a priority of like a vision
board of saying like, we really need to get better at stoplights? Yeah. Like the prioritization of
tasks. Is that essentially... and that comes from the data?
That comes, to a very large extent, from what we are trying to achieve in the product roadmap, the release we're trying to get out, and the feedback from the QA team, where the system is struggling or not, the things we're
trying to improve. And the QA team gives some signal, some information in aggregate about the
performance of the system in various conditions. That's right. And then of course, all of us drive
it and we can also see it. It's really nice to work with a system that you can also experience
yourself and it drives you home. Is there some insight you can draw from your individual
experience that you just can't quite get from an aggregate statistical analysis of data? Yeah.
It's so weird, right? Yes. It's not scientific in a sense because you're just one anecdotal sample.
Yeah. I think there's a ton of value in it; it's a source of truth. It's your interaction with the system: you can see it, you can play with it, you can perturb it, you can get a sense of it, you have an intuition for it. Numbers and plots and graphs are much harder. They hide a lot.
It's like if you train a language model, a really powerful way to understand it is by interacting with it.
Yeah, 100%.
Try to build up an intuition.
Yeah. I think Elon also, he always wanted to drive the system himself. He drives a lot, I want to say almost daily. So he also sees this as a source of truth, you driving the system and it performing, and yeah.
So what do you think? Tough questions here. So Tesla last year removed radar from
the sensor suite and now just announced that it's going to remove ultrasonic sensors
relying solely on vision, so camera only. Does that make the perception problem harder or easier?
I would almost reframe the question in some way. So the thing is, basically, you would think that additional sensors...
By the way, can I just interrupt?
Go ahead.
I wonder if a language model will ever do that if you prompt it: "Let me reframe your question." That would be epic.
That's a wrong prompt, sorry.
It's a little bit of a wrong question, because basically you would think that these sensors are an asset to you. Yeah. But if
you fully consider the entire product in its entirety, these sensors are actually potentially
liability because these sensors aren't free. They don't just appear on your car. You need
suddenly you need to have an entire supply chain. You have people procuring it. There can be
problems with them. They may need replacement. They are part of the manufacturing process. They
can hold back the line in production. You need to source them. You need to maintain them. You have
to have teams that write the firmware, all of it. And then you also have to incorporate them,
fuse them into the system in some way. And so it actually like bloats a lot of it. And I think
Elon is really good at simplify, simplify. Best part is no part. And he always tries to throw away
things that are not essential because he understands the entropy in organizations and in the approach.
And I think in this case, the cost is high and you're not potentially seeing it if you're just a
computer vision engineer. And I'm just trying to improve my network and is it more useful or less
useful? How useful is it? And the thing is once you consider the full cost of a sensor, it actually
is potentially a liability. And you need to be really sure that it's giving you extremely useful
information. In this case, we looked at using it or not using it, and the delta was not massive.
And so it's not useful. Is it also bloat in the data engine? Like having more sensors? Is it
distraction? And these sensors, you know, they can change over time. For example, you can have one
type of say radar, you can have other type of radar. They change over time. Now you suddenly
need to worry about it. Now suddenly you have a column in your SQLite telling you, oh, what
sensor type was it? And they all have different distributions. And then they can, they just,
they contribute noise and entropy into everything, and they bloat stuff. And also, organizationally, it has been really fascinating to me that it can be very distracting. If all you want to get to work is vision, all the resources are on it, and you're building out a data engine, and you're actually making forward progress, because that is the sensor with the most bandwidth, the most constraints on the world, and you're investing fully into that, and you can make that extremely good. You only have a finite amount of spend, of focus, across the different facets of the system.
And this kind of reminds me of Rich Sutton's bitter lesson.
It just seems like simplifying the system, in the long run, and of course you don't know what the long run is, seems to always be the right solution.
Yes. In that case it was for RL, but it seems to apply generally across all systems that do computation.
Yeah. So
what do you think about the lidar as a crutch debate? The battle between point clouds and pixels.
Yeah. I think this debate is always like slightly confusing to me because it seems like the actual
debate should be about like, do you have the fleet or not? That's like the really important
thing about whether you can achieve a really good functioning of an AI system at this scale. So data
collection systems. Yeah. Do you have a fleet or not is significantly more important, whether you
have lidar or not. It's just another sensor. And yeah, I think similar to the radar discussion,
basically, I don't think it offers extra information. It's extremely costly.
It has all kinds of problems. You have to worry about it. You have to calibrate it,
et cetera. It creates bloat and entropy. You have to be really sure that you need this sensor.
In this case, I basically don't think you need it. And I think honestly, I will make a stronger
statement. I think the others, some of the other companies that are using it are probably going
to drop it. Yeah. So you have to consider the sensor in the full, in considering, can you build
a big fleet that collects a lot of data? And can you integrate that sensor with that data and that
sensor into a data engine that's able to quickly find different parts of the data that then
continuously improves whatever the model that you're using? Yeah. Another way to look at it is like
vision is necessary in the sense that the world is designed for human visual consumption. So you
need vision. It's necessary. And then also it is sufficient because it has all the information that
you need for driving, and humans obviously use vision to drive. So it's both necessary and
sufficient. So you want to focus resources and you have to be really sure if you're going to
bring in other sensors. You could add sensors to infinity. At some point, you need to draw the line.
And I think in this case, you have to really consider the full cost of any one sensor.
That you're adopting and do you really need it? And I think the answer in this case is no.
So what do you think about the idea that the other companies are forming high resolution maps
and constraining heavily the geographic regions in which they operate? Is that approach not in your
view, not going to scale over time to the entirety of the United States? I think as you mentioned,
they pre-map all the environments and they need to refresh the map. And they have a perfect
centimeter level accuracy map of everywhere they're going to drive. It's crazy. We've been
talking about the autonomy actually changing the world. We're talking about the deployment
on a global scale of autonomous systems for transportation. And if you need to maintain
a centimeter accurate map for Earth or for many cities and keep them updated, it's a huge
dependency that you're taking on. Huge dependency. It's a massive, massive dependency. And now you
need to ask yourself, do you really need it? And humans don't need it. So it's very useful to have
a low level map of like, okay, the connectivity of your road. You know that there's a fork coming up.
When you drive an environment, you have that high level understanding. It's like a small Google map
and Tesla uses Google Maps-like, similar-resolution information in the system, but it will not pre-map environments to centimeter-level accuracy. It's a crutch. It's a distraction.
It costs entropy and it diffuses the team. It dilutes the team. And you're not focusing
on what's actually necessary, which is the computer vision problem. What did you learn
about machine learning, about engineering, about life, about yourself as one human being
from working with Elon Musk? I think the most I've learned is about how to sort of run organizations
efficiently and how to create efficient organizations and how to fight entropy in an organization.
So human engineering in the fight against entropy. Yeah. I think Elon is a very efficient warrior
in the fight against entropy in organizations. What does entropy in an organization look like?
It's process. It's process and inefficiencies in the form of meetings and that kind of stuff.
Yeah. Meetings. He hates meetings. He keeps telling people to skip meetings if they're not useful.
He basically runs the world's biggest startups, I would say. Tesla, SpaceX are the world's biggest
startups. Tesla actually has multiple startups. I think it's better to look at it that way.
And so I think he's extremely good at that. And yeah, he has a very good intuition for
streamlining processes, making everything efficient. Best part is no part, simplifying, focusing,
and just kind of removing barriers, moving very quickly, making big moves.
All of this is very startupy sort of seeming things, but at scale.
So strong drive to simplify. From your perspective, I mean, that also probably applies to just
designing systems and machine learning and otherwise. Like simplify, simplify.
Yes. What do you think is the secret to maintaining the startup culture in a company that grows?
Can you introspect that?
I do think you need someone in a powerful position with a big hammer like Elon, who's like
the cheerleader for that idea and ruthlessly pursues it. If no one has a big enough hammer,
everything turns into committees, democracy within the company, process, talking to stakeholders,
decision making, just everything just crumbles. If you have a big person who's also really smart
and has a big hammer, things move quickly. So you said your favorite scene in Interstellar
is the intense docking scene, with the AI and Cooper talking, saying, "Cooper, what are you doing? Docking. It's not possible." "No, it's necessary." Such a good line. By the way, just so many questions there. The AI in that scene, presumably, is supposed to be able to compute a lot more than the human, and it's saying it's not possible. Why the human? I mean, that's a movie, but shouldn't the AI know much better than the human? Anyway, what do you think
is the value of setting seemingly impossible goals? It seems like something that you have taken on, that Elon espouses: the initial intuition of the community might say this is very difficult, and then you take it on anyway with a crazy deadline. Just from a human engineering perspective, have you seen the value of that?
I wouldn't say that setting impossible goals exactly is a good idea, but I think setting very
ambitious goals is a good idea. I think there's what I call sub-linear scaling of difficulty,
which means that 10x problems are not 10x hard. Usually 10x harder problem is like 2 or 3x harder
to execute on. If you want to improve a system by 10%, it costs some amount of work. If you want to
10x improve the system, it doesn't cost 100x amount of work. It's because you fundamentally
change the approach. If you start with that constraint, then some approaches are obviously
dumb and not going to work. It forces you to reevaluate. I think it's a very interesting way
of approaching problem solving. It requires a weird kind of thinking. Going back to your PhD days, how do you figure out which ideas in the machine learning community are solvable?
Yes.
What is that? There's the cliche of first-principles thinking, but it requires you to basically ignore what the community is saying, because doesn't a community in science usually draw lines around what is and isn't possible? It's very hard to break out of that without going crazy.
I think a good example here is the deep learning revolution in some sense because you could
be in computer vision at that time during the deep learning revolution of 2012 and so on.
You could be improving a computer vision stack by 10% or you can just be saying,
actually, all of this is useless. How do I get 10x better computer vision? Well, it's probably not by tuning a HOG feature detector. I need a different approach. I need something that is
scalable. Going back to Richard Sutton's understanding the philosophy of the bitter lesson
and then being like, actually I need much more scalable system like a neural network
that in principle works and then having some deep believers that can actually
execute on that mission and make it work. That's the 10x solution.
What do you think is the timeline to solve the problem of autonomous driving?
That's still in part an open question.
Yeah. I think the tough thing with timelines of self-driving obviously is that no one has created
self-driving. It's not like, "What do you think is the timeline to build this bridge? Well, we've built a million bridges before; here's how long that takes." No one has built autonomy. It's
not obvious. Some parts turn out to be much easier than others. It's really hard to forecast. You do
your best based on trend lines and so on and based on intuition, but that's why fundamentally it's
just really hard to forecast this. Even still being inside of it, it's hard to do. Yes. Some
things turn out to be much harder and some things turn out to be much easier. Do you try to avoid
making forecasts? Because Elon doesn't avoid them, right? Heads of car companies in the past have
not avoided it either. Ford and other places have made predictions that we're going to solve level-four driving by 2020, 2021, whatever. They're all kind of backtracking on that prediction.
Are you, as an AI person, do you for yourself privately make predictions or do they get in
the way of your actual ability to think about a thing? Yeah, I would say what's easy to say is
that this problem is tractable and that's an easy prediction to make. It's tractable. It's going to
work. Yes. It's just really hard. Some things turn out to be harder and some things turn out to be
easier. It definitely feels tractable and it feels like at least the team at Tesla,
which is what I saw internally, is definitely on track to that. How do you form a strong
representation that allows you to make a prediction about tractability? You're the leader of a lot of
humans. You have to say this is actually possible. How do you build up that intuition? It doesn't
have to even be driving. It could be other tasks. What difficult tasks did you work on in your life? Classification, achieving a certain, like on ImageNet, a certain level of superhuman performance?
Yeah, expert intuition. It's just intuition. It's belief.
So just thinking about it long enough, studying, looking at sample data, like you said, driving.
My intuition is really flawed on this. I don't have a good intuition about tractability.
It could be anything. It could be solvable. The driving task could be
simplified into something quite trivial. The solution to the problem would be quite trivial.
At scale, more and more cars driving perfectly might make the problem much easier. The more
cars you have driving, people learn how to drive correctly, not correctly, but in a way that's more
optimal for a heterogeneous system of autonomous and semi-autonomous and manually driven cars.
That could change stuff. Then again, also I've spent a ridiculous number of hours just staring
at pedestrians crossing streets, thinking about humans. It feels like the way we use our eye
contact, it sends really strong signals. There's certain quirks and edge cases of behavior. Of
course, a lot of the fatalities that happen have to do with drunk driving and both on the
pedestrian side and the driver side. There's that problem of driving at night and all that kind of.
It's like the space of possible solutions to autonomous driving includes so many human factor
issues that it's almost impossible to predict. There could be super clean, nice solutions.
I would say, definitely, to use a game analogy, there's some fog of war, but you definitely also see the frontier of improvement. You can measure historically how
much you've made progress. I think, for example, at least what I've seen in roughly five years at
Tesla, when I joined, it barely kept lane on the highway. I think going up from Palo Alto to SF
was like three or four interventions. Anytime the road would do anything geometrically or turn too
much, it would just not work. Going from that to a pretty competent system in five years and seeing
what happens also under the hood and what the scale of which the team is operating now with
respect to data and compute and everything else is just massive progress. You're climbing a mountain
and it's fog, but you're making a lot of progress. It's fog. You're making progress and you see what
the next directions are and you're looking at some of the remaining challenges and they're not
perturbing you and they're not changing your philosophy and you're not contorting yourself.
You're like, actually, these are the things that we still need to do. Yeah, the fundamental
components of solving the problem seem to be there from the data engine to the compute to the
compute on the car to the compute for the training, all that kind of stuff.
So you've done over the years, you've been at Tesla, you've done a lot of amazing
breakthrough ideas and engineering, all of it from the data engine to the human side, all of it.
Can you speak to why you chose to leave Tesla?
Basically, as I described that run, I think over time during those five years, I've gotten myself into a bit of a managerial position.
Most of my days were meetings and growing the organization and making decisions about high
level strategic decisions about the team and what it should be working on and so on. It's like a
corporate executive role and I can do it. I think I'm okay at it, but it's not fundamentally what I
enjoy. I think when I joined, there was no computer vision team because Tesla was just going from the
transition of using Mobileye, a third party vendor for all of its computer vision, to having to
build its computer vision system. So when I showed up, there were two people training deep neural
networks, and they were training them on a computer at their desk. They were doing some kind of basic
classification task. Yeah. And so I kind of grew that into what I think is a fairly respectable
deep learning team, a massive compute cluster, a very good data annotation organization.
And I was very happy with where that was. It became quite autonomous. And so I kind of
stepped away, and I'm very excited to do much more technical things again, and kind of refocus on AGI.
What was that soul-searching like? Because you took a little time off to think. Like, how many mushrooms did you take? No, I'm just kidding. I mean, what was going through your mind?
The human lifetime is finite.
Yeah.
You did a few incredible things here. You're one of the best teachers of AI in the world. You're one of the best, and I mean that in the best possible way. You're one of the best tinkerers in the AI world, meaning understanding the fundamentals of how something works by building it from scratch, playing with the basic intuitions. It's like Einstein, Feynman, were all really good at this kind of stuff: a small example of a thing, to play with it, to try to understand it. So that, and obviously now
with Tesla, you helped build a team of machine learning engineers and a system that actually accomplishes something in the real world. So given all that, what was the soul-searching
like? Well, it was hard because obviously I love the company a lot and I love Elon, I love Tesla.
It was always hard to leave. I love the team basically. But yeah, I think actually I will be
potentially interested in revisiting it, maybe coming back at some point, working on Optimus, working on AGI at Tesla. I think Tesla is going to do incredible things. It's basically a massive, large-scale robotics kind of company with a ton of in-house talent for doing really incredible things. And I think humanoid robots are going to be amazing.
I think autonomous transportation is going to be amazing. All this is happening at Tesla. So I
think it's just a really amazing organization. So being part of it and helping it along, I think
was very, basically I enjoyed that a lot. Yeah, it was basically difficult for those reasons because
I love the company. But I'm happy to potentially at some point come back for Act 2. But I felt
like at this stage, I built the team, it felt autonomous and I became a manager and I wanted
to do a lot more technical stuff. I wanted to learn stuff. I wanted to teach stuff. And I just
kind of felt like it was a good time for a change of pace a little bit. What do you think is
the best movie sequel of all time, speaking of part two? Because most of them suck. Movie sequels?
Movie sequels, yeah. And you tweet about movies. So just in a tiny tangent,
what's a favorite movie sequel?
The Godfather Part II.
Are you a fan of The Godfather? Because you didn't even tweet or mention The Godfather.
Yeah, I don't love that movie. I know it has a huge following.
We're going to edit that out. We're going to edit out the hate towards The Godfather.
How dare you disrespect...
I think I will make a strong statement. I don't know why, but I basically don't like any movie before 1995. Something like that.
Didn't you mention Terminator 2?
Okay, okay. Terminator 2 was a little bit later, 1990.
No, I think Terminator 2 was in the 80s.
And I like Terminator 1 as well. So, okay, so a few exceptions, but by and large,
for some reason, I don't like movies before 1995 or something. They feel very slow. The camera is
like zoomed out. It's boring. It's kind of naive. It's kind of weird.
And also Terminator was very much ahead of its time.
Yes. And the Godfather, there's like no AGI.
I mean, but Good Will Hunting was one of the movies you mentioned, and that doesn't have any AGI either. I guess it has mathematics.
Yeah. I guess occasionally I do enjoy movies that don't feature...
Or like Anchorman.
Anchorman is so good.
I don't understand, speaking of AGI, why Will Ferrell is so funny.
It doesn't make sense. It doesn't compute. There's just something about him.
And he's a singular human because you don't get that many comedies these days. And I wonder if
it has to do about the culture or the machine of Hollywood, or does it have to do with just
we got lucky with certain people in comedy. It came together because he is a singular human.
Yeah. I like his movies.
That was a ridiculous tangent. I apologize. But you mentioned humanoid robots. So what do you
think about Optimus, about Tesla Bot? Do you think we'll have robots in the factory and in the home
in 10, 20, 30, 40, 50 years? Yeah. I think it's a very hard project.
I think it's going to take a while. But who else is going to build humanoid robots at scale?
And I think it is a very good form factor to go after because like I mentioned,
the world is designed for humanoid form factor. These things would be able to operate our machines.
They would be able to sit down in chairs, potentially even drive cars. Basically,
the world is designed for humans. That's the form factor you want to invest into and make work over
time. I think there's another school of thought, which is, okay, pick a problem and design a robot
for it. But actually designing a robot and getting a whole data engine and everything behind it to
work is an incredibly hard problem. So it makes sense to go after general interfaces
that, okay, are not perfect for any one given task, but actually have the generality
of, just with a prompt in English, being able to do something across many tasks. And so I think it makes a lot
of sense to go after a general interface in the physical world. And I think it's a very
difficult project. I think it's going to take time. But I see no other company that can execute on
that vision. I think it's going to be amazing. Basically physical labor. If you think transportation
is a large market, try physical labor. It's insane. But it's not just physical labor. To me,
the thing that's also exciting is social robotics. So the relationship we'll have on different levels
with those robots. That's why I was really excited to see Optimus. People have criticized me
for the excitement. But I've worked with a lot of research labs that do humanoid and legged robots:
Boston Dynamics, Unitree. There are a lot of companies that do legged robots.
But the elegance of the movement is a tiny, tiny part of the big picture. The two
big exciting things to me about Tesla doing humanoid or any legged robots are, first, integrating
into the data engine: the actual intelligence for the perception and
the control and the planning and all that kind of stuff, integrating into the fleet that you
mentioned. And then, speaking of fleet, the second thing is the mass manufacturing. Just knowing
culturally how to drive towards a simple robot that's cheap to produce at scale and doing that well,
having experience to do that well, that changes everything. That's a very different culture
and style than Boston Dynamics, whose robots, by the way, are impressive in just the way they move.
It'll be a very long time before Tesla can achieve the smoothness of movement,
but that's not what it's about. It's about the entirety of the system, like we talked about,
the data engine and the fleet. That's super exciting. Even the initial models. But that,
too, was really surprising that in a few months you can get a prototype.
The reason that happened very quickly is, as you alluded to, there's a ton of copy paste from
what's happening on the Autopilot. A lot. The amount of expertise that came out of the woodwork
at Tesla for building the humanoid robot was incredible to see. Basically, Elon said at one
point, we're doing this. And then next day, basically, all these CAD models started to appear.
People talk about the supply chain and manufacturing. People showed up with
screwdrivers and everything the other day and started to put together the body. I was like,
whoa. All these people exist at Tesla. Fundamentally, building a car is actually
not that different from building a robot. That is true, not just for the hardware pieces. Also,
let's not forget hardware, not just for a demo, but manufacturing of that hardware at scale.
It is a whole different thing. But for software as well, basically, this robot currently thinks
it's a car. It's going to have a midlife crisis at some point. It thinks it's a car. Some of the
earlier demos, actually, we were talking about potentially doing them outside in the parking lot
because that's where all of the computer vision was working out of the box instead of inside.
All the operating system, everything just copy pastes. Computer vision mostly copy pastes. You
have to retrain the neural nets, but the approach and everything and data engine and offline
trackers and the way we go about the occupancy tracker and so on, everything copy pastes. You
just need to retrain the neural nets. Then the planning control, of course, has to change quite
a bit. But there's a ton of copy paste from what's happening at Tesla. If you were to go
with the goal of like, okay, let's build a million humanoid robots and you're not Tesla,
that's a lot to ask. If you're Tesla, it's actually like, it's not that crazy.
Yes. Then the follow-up question is then how difficult, just like with driving,
how difficult is the manipulation task such that it can have an impact at scale? I think
depending on the context, the really nice thing about robotics is that, unless you do
manufacturing and that kind of stuff, there's more room for error. Driving is so safety critical
and also time critical. A robot is allowed to move slower, which is nice.
Yes. I think it's going to take a long time, but the way you want to structure the development is
you need to say, okay, it's going to take a long time. How can I set up the product development
roadmap so that I'm making revenue along the way? I'm not setting myself up for a zero one
loss function where it doesn't work until it works. You don't want to be in that position.
You want to make it useful almost immediately, and then you want to slowly deploy it
and at scale. And you want to set up your data engine, your improvement loops, the telemetry,
the evaluation, the harness and everything. And you want to improve the product over time
incrementally and you're making revenue along the way. That's extremely important because otherwise
you cannot build them; these large undertakings just don't make sense economically.
And also from the point of view of the team working on it, they need the dopamine along the way.
They can't just run on a promise about this being useful someday, that it's going to change the
world in 10 years when it works. That is not where you want to be. You want to be in a place
like I think Autopilot is today where it's offering increased safety and convenience of driving today.
People pay for it. People like it. People will purchase it. And then you also have the greater
mission that you're working towards. And you see that. So the dopamine for the team,
that was a source of happiness and satisfaction. Yes, 100%. You're deploying this. People like it.
People drive it. People pay for it. They care about it. There's all these YouTube videos.
Your grandma drives it. She gives you feedback. People like it. People engage with it. You engage
with it. Huge. Do people that drive Teslas recognize you and give you love? Like, hey, thanks for this
nice feature that it's doing. Yeah, I think the tricky thing is that some people really love you.
Some people, unfortunately, even though you're working on something that you think is extremely valuable,
useful, etc., some people do hate you. There's a lot of people who like me and the team and
the whole project. And I think Tesla drivers, in many cases, they're not the haters. Yeah, that
actually makes me sad about humans, or the current ways that humans interact. I think that's actually
fixable. I think humans want to be good to each other. I think Twitter and social media are part
of the mechanism that somehow makes the negativity more viral than it deserves, that
disproportionately adds a viral boost to the negativity. But I wish people would just suppress
some of the jealousy, some of the ego, and just get excited for others.
And then there's a karma aspect to that. You get excited for others, they'll get excited for you.
Same thing in academia. If you're not careful, there is a dynamical system there.
If you think in silos and get jealous of somebody else being successful, that actually,
perhaps counterintuitively, leads to less productivity of you as a community and you
individually. I feel like if you keep celebrating others, that actually makes you more successful.
Yeah. I think people haven't, depending on the industry, haven't quite learned that yet.
Some people are also very negative and very vocal. They're very prominently featured,
but actually there's a ton of people who are cheerleaders, but they're silent cheerleaders.
And when you talk to people just in the world, they will tell you, it's amazing, it's great.
Especially people who understand how difficult it is to get this stuff working. People who have
built products and makers, entrepreneurs, making this work and changing something
is incredibly hard. Those people are more likely to cheerlead you.
Well, one of the things that makes me sad is some folks in the robotics community
don't do the cheerleading and they should because they know how difficult it is. Well,
they actually sometimes don't know how difficult it is to create a product at scale and
actually deploy it in the real world. A lot of the development of robots and AI systems is done on
very specific small benchmarks as opposed to real world conditions.
Yes. Yeah. I think it's really hard to work on robotics in an academic setting.
Or AI systems that apply in the real world. You've criticized, but also flourished with and loved for a time,
the famed ImageNet data set. And you've recently had some words of criticism: that the
academic ML research community still gives a little too much love to ImageNet and
those kinds of benchmarks. Can you speak to the strengths and weaknesses of data sets
used in machine learning research? Actually, I don't know that I recall
a specific instance where I was unhappy or criticizing ImageNet. I think ImageNet has
been extremely valuable. It was basically a benchmark that allowed the deep learning community
to demonstrate that deep neural networks actually work. There's a massive value in that.
I think ImageNet was useful, but basically it's become a bit of an MNIST at this point.
MNIST is like little 28 by 28 grayscale digits. It's a joke data set that everyone just crushes.
There's still papers written on MNIST though, right?
Maybe they shouldn't.
Strong papers. Like papers that focus on how do we learn with a small amount of data, that kind of
stuff. Yeah. I could see that being helpful, but not in mainline computer vision research anymore,
of course. I think the way I've heard you somewhere, maybe I'm just imagining things,
but I think you said ImageNet was a huge contribution to the community for a long time,
and now it's time to move past those kinds of... Well, ImageNet has been crushed.
The error rates are... Yeah, we're getting like 90% accuracy in 1000-way classification prediction,
and I've seen those images, and that's really high. That's really good. If I remember correctly,
the top five error rate is now like 1% or something.
Given your experience with a gigantic real world data set, would you like to see benchmarks move
in certain directions that the research community could use?
Unfortunately, I don't think academics currently have the next ImageNet.
I think we've crushed MNIST. We've basically crushed ImageNet, and there's no next big
benchmark that the entire community rallies behind and uses for further development of these
networks. Yeah. What do you think it takes for a data set to captivate the imagination of everybody,
where they all get behind it? It would also need a leader, right? Yeah. Somebody with popularity.
Yeah. Why did ImageNet take off? Or is it just the accident of history?
It was the right amount of difficult and simple,
and interesting enough, and it was just the right time for that kind of a data set.
Question from Reddit. What are your thoughts on the role that synthetic data and game engines
will play in the future of neural net model development? I think as neural nets converge
to humans, the value of simulation to neural nets will be similar to the value of simulation to
humans. So people use simulation because they can learn something in that kind of a system
without having to actually experience it. But are you referring to the simulation we do in our head?
No, sorry, simulation. I mean like video games or other forms of simulation for various professionals.
So let me push back on that because maybe there's simulation that we do in our heads.
Like, simulate if I do this, what do I think will happen?
Okay. That's like internal simulation. Yeah. Internal. Isn't that what we're doing,
simulating before we act? Oh yeah. But that's independent from the use of simulation in
the sense of like computer games or using simulation for training set creation or-
Is it independent or is it just loosely correlated? Because like, isn't that useful to do like
counterfactual or like edge case simulation to like, you know, what happens if there's a nuclear war?
What happens if there's, you know, like those kinds of things?
Yeah, that's a different simulation from like Unreal Engine. That's how I interpreted the question.
Ah, so like simulation of the average case.
What do you mean by Unreal Engine? So simulating a world, the physics of that world,
why is that different? Like, because you also can add behavior to that world
and you could try all kinds of stuff, right? You could throw all kinds of weird things into it.
Unreal Engine is not just about simulating, I mean, I guess it is about simulating the physics
of the world. It's also doing something with that. Yeah. The graphics, the physics, and the
agents that you put into the environment and stuff like that. Yeah. See,
I feel like you said that it's not that important, I guess, for the future of AI development.
Is that correct to interpret it that way? I think humans use simulators
and they find them useful. And so computers will use simulators and find them
useful. Okay. So you're saying... I don't use simulators very often. I play a video
game every once in a while, but I don't think I derive any wisdom about my own existence from
those video games. It's a momentary escape from reality versus a source of wisdom about reality.
So I think that's a very polite way of saying simulation is not that useful.
Yeah, maybe not. I don't see it as like a fundamental, really important part of like
training neural nets currently. But I think as neural nets become more and more powerful,
I think you will need fewer examples to train additional behaviors. And simulation is, of course,
there's a domain gap in a simulation that it's not the real world, it's slightly something different.
But with a powerful enough neural net, the domain gap can be bigger, I think,
because the neural net will sort of understand that even though it's not the real world, it has
all this high level structure that it's supposed to be learning from. So the neural net will actually,
yeah, it will be able to leverage the synthetic data better by closing the gap,
by understanding in which ways this is not real data.
Exactly.
Right, I'll do better questions next time. That was a question, but I'm just kidding. All right.
So is it possible, do you think, speaking of MNIST, to construct neural nets and training
processes that require very little data? So we've been talking about huge data sets like
the internet for training. I mean, one way to say that is, like you said, like the querying itself
is another level of training, I guess, and that requires a little data. But do you see any value
in doing research and kind of going down the direction of can we use very little data to train,
to construct a knowledge base?
100%. I just think like at some point you need a massive data set. And then when you pre-train
your massive neural net and get something that is like a GPT or something, then you're able to be
very efficient at training any arbitrary new task. So a lot of these GPTs, you can do tasks like
sentiment analysis or translation or so on just by being prompted with very few examples. Here's the
kind of thing I want you to do. Here's an input sentence, here's the translation into German.
Input sentence, translation to German. Input sentence blank, and the neural net will complete
the translation to German just by looking at sort of the example you've provided. And so that's an
example of few-shot learning in the activations of the neural net instead of the
weights of the neural net. And so I think basically just like humans, neural nets will become very
data efficient at learning any other new task. But at some point you need a massive data set to
pre-train your network to get that. And probably we humans have something like that.
Do we have something like that? Do we have a passive, in-the-background, model-constructing
thing that just runs all the time in a self-supervised way? We're not conscious of it.
I think humans definitely, I mean, obviously we learn a lot during our lifespan, but also we have
a ton of hardware that helps us at initialization coming from sort of evolution. And so I think
that's also a really big component. A lot of people in the field, I think they just talk about
the amount of seconds that a person has lived, pretending that this is
a tabula rasa, sort of like a zero initialization of a neural net. And it's not. You can look at a
lot of animals, like for example zebras: zebras get born and they see and they can run. There's
zero training data in their lifespan. They can just do that. So somehow, I have no idea how, evolution
has found a way to encode these algorithms and these neural net initializations that are extremely
good into ATCGs. And I have no idea how this works, but apparently it's possible because
here's a proof by existence. There's something magical about going from a single cell to an
organism that is born to the first few years of life. I kind of like the idea that the reason we
don't remember anything about the first few years of our life is that it's a really painful process.
Like it's a very difficult, challenging training process, intellectually.
And maybe, yeah, I mean, why don't we remember any of that? There might be some crazy
training going on and that maybe that's the background model training that is very painful.
And so it's best for the system once it's trained not to remember how it's constructed.
I think it's just like the hardware for long-term memory is just not fully developed.
I kind of feel like the first few years of infants is not actually like learning,
it's brain maturing. We're born premature. There's a theory along those lines: because of the
birth canal and the size of the brain, we're born premature, and then the first few
years we're just, the brain is maturing and then there's some learning eventually.
That's my current view on it. What do you think, do you think neural nets can have long-term memory
that approaches something like humans'? Does there need to be another meta
architecture on top to add something like a knowledge base that learns facts about the world
and all that kind of stuff? Yes, but I don't know to what extent it will be explicitly constructed.
It might take unintuitive forms where you are telling the GPT like, hey, you have a declarative
memory bank to which you can store and retrieve data from. And whenever you encounter some
information that you find useful, just save it to your memory bank. And here's an example of
something you have retrieved, and here's how you save to it, and here's how you load from it. You just say load,
whatever; you teach it in text, in English, and then it might learn to use a memory bank from that.
Oh, so the neural net is the architecture for the background model, the base thing,
and then everything else is just on top of it. That's pretty easy to do.
It's not just text, right? You're giving it gadgets and gizmos. So you're teaching some kind
of a special language by which it can save arbitrary information and retrieve it at a later
time. And you're telling it about these special tokens and how to arrange them to use these
interfaces. It's like, hey, you can use a calculator. Here's how you use it. Just do
53 plus 41 equals. And when equals is there, a calculator will actually read out the answer
and you don't have to calculate it yourself. And you just tell it in English, this might actually
work.
Do you think, in that sense, Gato is interesting, the DeepMind system, in that it's not
just language, but actually throws it all in the same pile: images, actions, all that kind
of stuff? That's basically what we're moving towards. Yeah, I think so. So Gato is very much a
kitchen sink approach to reinforcement learning in lots of different environments with a single
fixed transformer model, right? I think it's a very early result in that realm, but I think,
yeah, it's along the lines of what I think things will eventually look like.
So this is the early days of a system that eventually will look like this from a
Rich Sutton perspective. Yeah, I'm not a super huge fan of all these interfaces that
look very different. I would want everything to be normalized into the same API. So for example,
screen pixels, very same API, instead of having different world environments that have very
different physics and joint configurations and appearances and whatever, and you're having some
kind of special tokens for different games that you can plug in. I'd rather just normalize everything
to a single interface so it looks the same to the neural net, if that makes sense. So it's all
going to be pixel-based pong in the end. I think so. Okay. Let me ask you about your own personal
life. A lot of people want to know you're one of the most productive and brilliant people
in the history of AI. What does a productive day in the life of Andrej Karpathy look like?
What time do you wake up? Because I imagine some kind of dance between the average productive day
and a perfect productive day. So the perfect productive day is the thing we strive towards,
and the average is what it converges to, given all the mistakes and human eventualities and so on.
So what time do you wake up? Are you a morning person? I'm not a morning person. I'm a night
owl for sure. Is it stable or not? It's semi-stable, like eight or nine or something like that.
During my PhD, it was even later, I used to go to sleep usually at 3 a.m. I think the a.m. hours
are precious and very interesting time to work because everyone is asleep.
At 8 a.m. or 7 a.m., the East Coast is awake. So there's already activity, there's already some
text messages, whatever, there's stuff happening. You can go on some news website and there's stuff
happening. It's distracting. At 3 a.m., everything is totally quiet. And so you're not going to be
bothered and you have solid chunks of time to do work. So I like those periods, night owl by
default. And then I think like productive time, basically, what I like to do is you need to build
some momentum on the problem without too much distraction. And you need to load your RAM,
your working memory with that problem. And then you need to be obsessed with it when you're taking
shower, when you're falling asleep. You need to be obsessed with the problem and it's fully in
your memory and you're ready to wake up and work on it right there. So is this on the temporal
scale of a single day or a couple of days, a week, a month? So I can't talk about one day,
basically, in isolation because it's a whole process. When I want to get productive in the
problem, I feel like I need a span of a few days where I can really get in on that problem. And I
don't want to be interrupted. And I'm going to just be completely obsessed with that problem.
And that's when I do most of my good work. You've done a bunch of cool little projects
in a very short amount of time very quickly. So that requires you just focusing on it.
Yeah, basically, I need to load my working memory with the problem. And I need to be productive
because there's always a huge fixed cost to approaching any problem. I was struggling with
this, for example, at Tesla because I wanted to work on a small side project. But okay, you first need to
figure out, okay, I need to SSH into my cluster. I need to bring up a VS code editor so I can work
on this. I run into some stupid error because of some reason. You're not at a point where you can
be just productive right away. You are facing barriers. And so it's about really removing all
of those barriers so you're able to get into the problem and you have the full problem loaded in
your memory. And somehow avoiding distractions of all different forms, like news stories, emails,
but also distractions from other interesting projects that you previously worked on or
currently working on and so on. You just want to really focus your mind. And I mean, I can take
some time off for distractions and in between, but I think it can't be too much. Most of your day is
sort of spent on that problem. And then I drink coffee, I have my morning routine, I look at some
news, Twitter, Hacker News, Wall Street Journal, et cetera. It's great. So basically, you wake up,
you have some coffee. Are you trying to get to work as quickly as possible? Are you taking this diet
of what the hell is happening in the world first? I do find it interesting to know about the world.
I don't know that it's useful or good, but it is part of my routine right now. So I do read through
a bunch of news articles and I want to be informed. And I'm suspicious of it. I'm suspicious of the
practice, but currently that's where I am. Oh, you mean suspicious about the positive effect
of that practice on your productivity and your wellbeing? My wellbeing psychologically, yeah.
And also on your ability to deeply understand the world because there's a bunch of sources of
information. You're not really focused on deeply integrating. Yeah, it's a little distracting.
In terms of a perfectly productive day, for how long of a stretch of time in one session do you
try to work and focus on a thing? A couple of hours, is it one hour, is it 30 minutes, is it
10 minutes? I can probably go a small few hours and then I need some breaks in between for food
and stuff. Yeah, but I think it's still really hard to accumulate hours. I was using a tracker
that told me exactly how much time I spent coding any one day. And even on a very productive day,
I still spent only like six or eight hours. And it's just because there's so much padding,
commute, talking to people, food, et cetera. There's like the cost of life, just living
and sustaining and homeostasis and just maintaining yourself as a human is very high.
And there seems to be a desire within the human mind to participate in society that creates that
padding. Because the most productive days I've ever had is just completely from start to finish
is tuning out everything and just sitting there. And then you could do more than six and eight
hours. Is there some wisdom about what gives you strength to do tough days of long focus?
Yeah, just like whenever I get obsessed about a problem, something just needs to work,
something just needs to exist. It needs to exist. So you're able to deal with bugs and programming
issues and technical issues and design decisions that turn out to be the wrong ones. You're able
to think through all of that, given that you want the thing to exist. Yeah, it needs to exist. And
then I think, to me, also a big factor is: are other humans going to appreciate it? Are they going
to like it? That's a big part of my motivation. If I'm helping humans and they seem happy,
they say nice things, they tweet about it or whatever, that gives me pleasure because I'm
doing something useful. So you do see yourself sharing it with the world. Whether it's on GitHub
or through a blog post or through videos. Yeah, I was thinking about it. Suppose I did all these
things but did not share them. I don't think I would have the same amount of motivation that
I can build up. You enjoy the feeling of other people gaining value and happiness from the stuff
you've created. Yeah. What about diet? I saw you played with intermittent fasting. Do you fast?
Does that help? I played with everything.
Of the things you've played with, what's been most beneficial to your ability to mentally focus
on a thing and just mental productivity and happiness? You still fast? Yeah, I still fast,
but I do intermittent fasting. But really what it means at the end of the day is I skip breakfast.
So I do 18:6 roughly by default when I'm in my steady state. If I'm traveling or doing something
else, I will break the rules. But in my steady state, I do 18:6. So I eat only from 12 to 6.
Not a hard rule and I break it often, but that's my default. And then yeah, I've done a bunch of
random experiments. For the most part right now, where I've been for the last year and a half,
I want to say, is I'm plant-based or plant-forward. I heard plant-forward. It sounds better.
What does that mean exactly? I don't actually know what the difference is,
but it sounds better in my mind. But it just means I prefer plant-based food.
Raw or cooked? I prefer cooked and plant-based.
So plant-based, forgive me, I don't actually know how wide the category of plant-based is.
Well, plant-based just means that you're not militant about it and you can flex.
You just prefer to eat plants and you're not trying to influence other people.
And if you come to someone's house party and they serve you a steak that they're really proud of,
you will eat it. That's beautiful. I'm on the flip side of that, but I'm very sort of flexible.
Have you tried doing one meal a day? I have accidentally, not consistently,
but I've accidentally had that. I don't like it. I think it makes me feel not good. It's too much,
too much of a hit. Yeah.
And so currently I have about two meals a day, 12 and six.
I do that nonstop. I'm doing it now. I do one meal a day.
It's interesting. It's an interesting feeling. Have you ever fasted longer than a day?
Yeah, I've done a bunch of water fasts because I was curious what happens.
Anything interesting? Yeah, I would say so. I mean,
what's interesting is that you're hungry for two days and then starting day three or so,
you're not hungry. It's such a weird feeling because you haven't eaten in a few days and
you're not hungry. Isn't that weird?
It's really weird. One of the many weird things about human biology:
it figures something out. It finds another source of energy or something like that,
or relaxes the system. I don't know how that works.
The body is like, you're hungry, you're hungry. And then it just gives up. It's like,
okay, I guess we're fasting now. There's nothing. And then it just focuses on trying to make you
not hungry and not feel the damage of that and trying to give you some space to figure out the
food situation. Are you still to this day most productive at night?
I would say I am, but it is really hard to maintain my PhD schedule,
especially when I was working at Tesla and so on. It's a non-starter.
But even now, people want to meet for various events. Society lives in a certain period of time
and you sort of have to work with that.
It's hard to do a social thing and then after that return and do work.
Yeah. It's just really hard.
That's why I try when I do social things, I try not to do too much drinking so I can return
and continue doing work. But at Tesla, or any company,
is there a convergence towards a schedule? Or is there more? Is that how humans behave
when they collaborate? I need to learn about this. Do they try to keep a consistent schedule
where you're all awake at the same time? I do try to create a routine and I try to
create a steady state in which I'm comfortable in. I have a morning routine, I have a day routine,
I try to keep things to a steady state and things are predictable. And then your body just
sticks to that. And if you try to stress that a little too much, like
when you're traveling and you're dealing with jet lag, you're not able to really ascend
to where you need to go. Yeah. That's what you're doing with the habits and stuff.
What are your thoughts on work-life balance throughout a human lifetime?
So Tesla in part was known for pushing people to their limits in terms of what they're able to do,
in terms of what they're trying to do, in terms of how much they work, all that kind of stuff.
Yeah. I will say Tesla gets a little too much bad rep for this because what's happening is Tesla,
it's a bursty environment. So I would say the baseline, my only point of reference is Google,
where I've interned three times and I saw what it's like inside Google and DeepMind. I would
say the baseline is higher than that, but then there's a punctuated equilibrium where once in
a while there's a fire and people work really hard. And so it's spiky and bursty and then all
the stories get collected. About the bursts. And then it gives the appearance of total insanity,
but actually it's just a bit more intense environment and there are fires and sprints.
And so I think, definitely, I would say it's a more intense environment than something
you would get at Google. But forget all of that; just in your own personal life,
what do you think about the happiness of a human being? A brilliant person like yourself,
about finding a balance between work and life? Or is there even such a thing, is that not a good thought experiment?
Yeah, I think balance is good, but I also love to have sprints that are out of distribution.
And that's when I think I've been pretty creative as well. Sprints out of distribution means that
most of the time you have a quote unquote balance. I have balance most of the time.
I like being obsessed with something once in a while. Once in a while is what? Once a week,
once a month, once a year? Yeah, probably like say once a month or something. Yeah.
And that's when we get a new GitHub repo from you. Yeah, that's when you really care
about a problem. It must exist. This will be awesome. You're obsessed with it. And now you
can't just do it on that day. You need to pay the fixed cost of getting into the groove. And then
you need to stay there for a while and then society will come and they will try to mess with you and
they will try to distract you. Yeah. The worst thing is a person who's like, I just need five
minutes of your time. Yeah. The cost of that is not five minutes and society needs to change how
it thinks about it. Just five minutes of your time. Right. It's never just one minute. Just
30 seconds. Just a quick thing. What's the big deal? Why are you being so... Yeah, no.
What's your computer setup? What's like the perfect... Are you somebody that's flexible
to no matter what? Laptop, four screens. Yeah. Or do you prefer a certain setup that you're most
productive? I guess the one that I'm familiar with is one large screen, 27 inch, and my laptop
on the side. What operating system? I do Macs. That's my primary. For all tasks? I would say
OS X, but when you're working on deep learning, everything is Linux. You're SSH'd into a cluster
and you're working remotely. But what about the actual development? Like they're using the IDE?
I think a good way is you just run VS code, my favorite editor right now, on your Mac,
but you have a remote folder through SSH. The actual files that you're manipulating
are on the cluster somewhere else. What's the best IDE? VS code. What else do people... I use
Emacs still. That's cool. It may be cool. I don't know if it's maximum productivity.
What do you recommend in terms of editors? You worked a lot of software engineers. Editors for
Python, C++, machine learning applications. I think the current answer is VS code. Currently,
I believe that's the best IDE. It's got a huge amount of extensions. It has GitHub Copilot
integration, which I think is very valuable. What do you think about the Copilot integration? I
was actually... I got to talk a bunch with Guido van Rossum, who's the creator of Python, and he loves
Copilot. He programs a lot with it. Do you? Yeah, I use Copilot. I love it. It's free for me,
but I would pay for it. Yeah, I think it's very good. The utility that I found with it was...
I would say there's a learning curve, and you need to figure out when it's helpful and when to pay
attention to its outputs and when it's not going to be helpful, where you should not pay attention
to it. Because if you're just reading its suggestions all the time, it's not a good way of interacting
with it. But I think I was able to mold myself to it. I find it's very helpful, number one,
for copy paste and replacing some parts. When the pattern is clear, it's really good at completing
the pattern. And number two, sometimes it suggests APIs that I'm not aware of. It tells you about
something that you didn't know. And that's an opportunity to discover and use it again.
It's an opportunity to... I would never take Copilot code as given. I almost always
copy paste it into a Google search and see what this function is doing. And then you're like,
oh, it's actually exactly what I need. Thank you, Copilot. So you learn something. It's in part a
search engine, and in part about getting the exact syntax correct. It's that NP-hard thing:
once you see it, you know it's correct, but generating it yourself is a struggle. You can verify
efficiently, but you can't generate efficiently. And Copilot really, I mean, it's autopilot for
programming, right? And currently it's doing the lane following, which is like the simple copy
paste, and sometimes suggestions. But over time, it's going to become more and more autonomous.
And so the same thing will play out in not just coding, but actually across many,
many different things probably. Coding is an important one, right? Like writing programs.
How do you see the future of that developing? The program synthesis, like being able to write
programs that are more and more complicated. Because right now it's human supervised in
interesting ways. It feels like the transition will be very painful.
My mental model for it is the same thing will happen as with the autopilot. So currently
it's doing lane following, it's doing some simple stuff. And eventually it'll be doing autonomy and
people will have to intervene less and less. And there could be testing mechanisms.
Like if it writes a function and that function looks pretty damn correct, how do you know
it's correct? Because you're getting lazier and lazier as a programmer; your ability to catch
little bugs atrophies. But I guess it won't make little mistakes.
No, it will. Copilot will make subtle off-by-one bugs. It has done that to me.
But do you think future systems will? Or is the off-by-one actually a fundamental
challenge of programming? In that case, it wasn't fundamental. And I think things can improve, but
yeah, I think humans have to supervise. I am nervous about people not supervising what comes out
and what happens to, for example, the proliferation of bugs in all of our systems.
I'm nervous about that, but I think there will probably be some other copilots for bug finding
and stuff like that at some point. Cause there'll be a lot more automation for it.
It's like a copilot that generates a compiler, one that does a linter, one that does
a type checker. It's a committee of GPTs, sort of. And then there'll be like a manager
for the committee. And then there'll be somebody that says a new version of this is needed. We need
to regenerate it. Yeah. You run 10 GPTs forward and they give 50 suggestions. Another
one looks at them and picks a few that it likes. A bug-finding one looks at them and says, this is
probably a bug. They get re-ranked by some other thing. And then a final ensemble GPT comes in.
It's like, okay, given everything you guys have told me, this is probably the next token.
The feeling is the number of programmers in the world has been growing and growing very quickly.
Do you think it's possible that it'll actually level out and drop to like a very low number
with this kind of world? Cause then you'll be doing software 2.0 programming.
And you'll be doing this kind of generation of copilot type systems programming,
but you won't be doing the old school software 1.0 program.
I don't currently think that they're just going to replace human programmers.
I'm so hesitant saying stuff like this, right?
"This is going to be replaced in five years." I don't know. It's just going to show where
our thinking was at the time. Cause I agree with you, but I think we might be very surprised. What's your sense of
where we stand with language models? Does it feel like the beginning or the middle or the end?
The beginning, a hundred percent. I think the big question in my mind is for sure GPT will be able
to program quite well, competently and so on. How do you steer the system? You still have to provide
some guidance to what you actually are looking for. And so how do you steer it? And how do you
talk to it? How do you audit it and verify that what is done is correct? And how do you work with
this? And it's not just an AI problem, but as much a UI/UX problem. So it's beautiful fertile ground for
so much interesting work, for a VS Code++ where it's not just human programming anymore.
It's amazing. Yeah. So you're interacting with the system. So not just one prompt,
but it's iterative prompting. You're trying to figure out having a conversation with the system.
Yeah. That actually, I mean, to me, that's super exciting to have a conversation with the program
I'm writing. Yeah. Maybe at some point you're just conversing with it. It's like, okay, here's what I
want to do. Actually this variable, maybe it's not even as low level as a variable. You can also
imagine like, can you translate this to C++ and back to Python? Yeah, that already kind of
exists in some form. No, but just doing it as part of the programming experience. Like, I think I'd like
to write this function in C++, or you just keep switching between different
languages because of the different syntax. Maybe I want to convert this into a functional
language. And so you get to become multilingual as a programmer and dance back and forth
efficiently. Yeah. I mean, I think the UI/UX of it though is still very hard to think through
because it's not just about writing code on a page. You have an entire developer environment.
You have a bunch of hardware on it. You have some environment variables. You have some scripts
that are running in a cron job. There's a lot going on to working with computers: how
do these systems set up environment flags and work across multiple machines and set up screen
sessions and automate different processes. How all that works and is auditable by humans and
so on is a massive question. Oh man. You've built arxiv-sanity. What is arXiv,
and what is the future of academic research publishing that you would like to see?
So arXiv is this preprint server. If you have a paper, you can submit it for
publication to journals or conferences and then wait six months and then maybe get a decision,
pass or fail, or you can just upload it to arXiv and then people can tweet about it three minutes
later, and then everyone sees it, everyone reads it, and everyone can profit from it in their own
way. And you can cite it and it has an official look to it. It feels
like a publication process. It feels different than if you just put it in a blog post.
Oh yeah. Yeah. I mean, it's a paper and usually the bar is higher for something that you would
expect on arXiv as opposed to something you would see in a blog post. Well, the culture
created the bar, because you could probably post a pretty crappy paper on arXiv.
Yes. So what does that make you feel about peer review?
Rigorous peer review by two, three experts versus the peer review of the community
right as it's written. Yeah. Basically, I think the community is very well able to peer review
things very quickly on Twitter. And I think maybe it just has something to do with the AI machine
learning field specifically. I feel like things are more easily auditable, and the
verification is potentially easier than the verification somewhere else. So it's kind of
like, you can think of these scientific publications as little blockchains where
everyone's building on each other's work and citing each other. And you sort of have AI,
which is kind of this much faster and looser blockchain, where any one individual
entry is very cheap to make. And then you have other fields where maybe that
model doesn't make as much sense. And so I think in AI, at least, things are pretty easily
verifiable. And so that's why when people upload papers that are a really good idea and so on,
people can try it out like the next day and they can be the final arbiter of whether it works or
not on their problem. And the whole thing just moves significantly faster. So I kind of feel like
academia still has a place. Sorry, this conference journal process still has a place,
but it lags behind, I think. And it's a bit more, maybe, a higher quality
process. But it's not the place where you will discover cutting edge work anymore.
Yeah. It used to be the case when I was starting my PhD, that you go to conferences and journals
and you discuss all the latest research. Now, when you go to a conference or journal, no
one discusses anything that's there because it's already three generations old and irrelevant.
Yeah. Which makes me sad about DeepMind, for example, where they still
publish in Nature and these big prestigious venues. I mean, there's still value, I suppose, to the prestige
that comes with these big venues, but the result is that they'll announce some breakthrough
performance and it will take like a year to actually publish the details. I mean,
if those details were published immediately, it would inspire the community
to move in certain directions. Yeah, it would speed up the rest of the community,
but I don't know to what extent that's part of their objective function also.
That's true. So it's not just the prestige; a little bit of the delay is part of it.
Yeah, DeepMind specifically has been working in the regime of having
a slightly higher quality process, with higher latency, and publishing those papers that way.
Another question from Reddit. Do you, or have you, suffered from imposter syndrome? Being the director
of AI at Tesla, being this person, when you were at Stanford, where the world looks at you
as the expert in AI, teaching the world about machine learning. When I was leaving
Tesla after five years, I had spent a ton of time in meeting rooms. You know, I would read
papers. In the beginning, when I joined Tesla, I was writing code, and then I was writing less
and less code and I was reading code and then I was reading less and less code. And so this is just
a natural progression that happens, I think. And definitely, I would say, near the tail end,
that's when it starts to hit you a bit more that you're supposed to be an expert,
but actually the source of truth is the code that people are writing, the GitHub and
the actual code itself. And you're not as familiar with that as you used to be.
And so I would say maybe there's some insecurity there.
Yeah, that's actually pretty profound, that a lot of the insecurity has to do with not writing the
code, cause in the computer science space, that is the truth right there.
The code is the source of truth; the papers and everything else are a high level summary.
Yeah, just a high level summary, but at the end of the day, you have to read code.
It's impossible to translate all that code into actual paper form. So when
things come out, especially when they have source code available, that's my favorite place
to go. So like I said, you're one of the greatest teachers of machine learning and AI ever, from
CS231n to today. What advice would you give to beginners interested in getting into machine
learning? Beginners are often focused on what to do. And I think the focus should be more on
how much you do. So I am kind of a believer, on a high level, in this 10,000 hours concept,
where you just have to pick the things where you can spend time, that you care about and
you're interested in. You literally have to put in 10,000 hours of work. It doesn't even
matter as much where you put it. You'll iterate and you'll improve and you'll
waste some time. I don't know if there's a better way; you need to put in 10,000 hours. But I think
it's actually really nice because I feel like there's some sense of determinism about
being an expert at a thing. You can literally pick an arbitrary thing,
and I think if you spend 10,000 hours of deliberate effort and work, you actually will become an
expert at it. And so I think it's kind of a nice thought. And so basically I would
focus more on: are you spending 10,000 hours? That's what I would focus on. And then thinking
about what kind of mechanisms maximize your likelihood of getting to 10,000 hours, which
for us silly humans means probably forming a daily habit of like every single day,
actually doing the thing, whatever helps you. So I do think, to a large extent, it's a psychological
problem for yourself. One other thing that I think is helpful for the psychology
of it is: many times people compare themselves to others in the area. I think this is very harmful.
Only compare yourself to you from some time ago, say a year ago. Are you better than you were a
year ago? This is the only way to think. And then you can see your progress and
it's very motivating. That's so interesting, that focus on the quantity of hours. Cause I think a
lot of people in the beginner stage, but actually throughout, get paralyzed by
the choice: which one do I pick, this path or this path? They'll literally get paralyzed
by which IDE to use. Well, they're worried. Yeah. They'll worry about all these things,
but the thing is, you will waste time doing something wrong. You will eventually
figure out it's not right. You will accumulate scar tissue, and next time you'll grow stronger
because next time you'll have the scar tissue and you'll have learned from it. And next time you
come to a similar situation, you'll be like, oh, I messed up before. I've spent a lot of time working
on things that never materialized into anything. And I have all that scar tissue and I have some
intuitions about what was useful, what wasn't useful, how things turned out. So all those
mistakes were not dead work, you know? So I just think you should just focus on
working. What have you done? What have you done last week? That's a good question actually to
ask for a lot of things, not just machine learning. It's a good way to cut the,
I forgot what term we use, but the fluff, the blubber, whatever,
the inefficiencies in life. What do you love about teaching? You seem to find yourself
often drawn to teaching. You're very good at it, but you're also drawn to it.
I mean, I don't think I love teaching. I love happy humans, and happy humans like when I teach.
I wouldn't say I hate teaching; I tolerate teaching, but it's not the act of teaching
that I like. It's that, you know, I have something I'm actually okay at.
I'm okay at teaching and people appreciate it a lot. So I'm just happy to try to be
helpful. But teaching itself is not like the most... I mean, it can be
really annoying, frustrating. I was working on a bunch of lectures just now. I was reminded back to
my days of CS231n and just how much work it is to create some of these materials and make them good.
The amount of iteration and thought, and you go down blind alleys, and just how much you change it.
So creating something good in terms of educational value is really hard, and it's not
fun. It's difficult. So people should definitely go watch your new stuff you put out. There are
lectures where you're actually building the thing from scratch; like you said, the code is truth. So
discussing backpropagation by building it, by looking through it, the whole thing.
So how difficult is that to prepare for? I think that's a really powerful way to teach.
Did you have to prepare for that, or are you just live thinking through it?
I will typically do, say, three takes and then I take the better take. So I do multiple
takes and I take some of the better takes and then I just build out a lecture that way.
Sometimes I have to delete 30 minutes of content because it just went down an alley that I didn't
like too much. There's a bunch of iteration, and it probably takes me somewhere around
10 hours to create one hour of content. To get one hour. It's interesting. I mean,
is it difficult to go back to the basics? Do you draw a lot of wisdom from going back to the
basics? Yeah. Going back to backpropagation, loss functions, where they come from. And one thing
I like about teaching a lot, honestly, is it definitely strengthens your understanding.
So it's not a purely altruistic activity. It's a way to learn. If you have to explain
something to someone, you realize you have gaps in knowledge. And so I even
surprised myself in those lectures. Like, oh, the result will obviously look like this, and then the
result doesn't look like it. And I'm like, okay, I thought I understood this. Yeah.
But that's why it's really cool. Literally code: you run it in the notebook and it gives you a
result and you're like, oh, wow. Yes. And actual numbers, actual input, actual code.
Yeah. It's not mathematical symbols, et cetera. The source of truth is the code. It's not slides.
It's just like, let's build it. It's beautiful. You're a rare human in that sense. What
advice would you give to researchers trying to develop and publish ideas that have a big impact
in the world of AI? So maybe undergrads, maybe early graduate students. Yep. I mean,
I would say they definitely have to be a little bit more strategic than I had to be as a
PhD student because of the way AI is evolving. It's going the way of physics, where, you know,
in physics, you used to be able to do experiments on your benchtop and everything was great and
you could make progress. And now you have to work at something like the LHC at CERN. And so AI is going
in that direction as well. So there are certain kinds of things that are just not possible to do on
the benchtop anymore. And I think that didn't used to be the case.
Do you still think that there are GAN-type papers to be written, where a very
simple idea requires just one computer to illustrate a simple example? I mean, one example
that's been very influential recently is diffusion models. Diffusion models are amazing. Diffusion
models are six years old. For the longest time, people were kind of ignoring them as far
as I can tell. And they're an amazing generative model, especially in images.
And so stable diffusion and so on, it's all diffusion based. Diffusion is new. It was
not there and came from, well, it came from Google, but a researcher could have come up with it. In
fact, some of the first... actually, no, those came from Google as well. But a researcher could
come up with that in an academic institution. Yeah. What do you find most fascinating about
diffusion models? From the societal impact to the technical architecture? What I like about
diffusion is it works so well. Was that surprising to you? The variety, almost the
novelty of the synthetic data it's generating. Yeah. So the stable diffusion images are incredible.
The speed of improvement in generating images has been insane. We went very quickly
from generating tiny digits to tiny faces and it all looked messed up. And now we have stable
diffusion and that happened very quickly. There's a lot that academia can still contribute.
you know, for example, um, flash attention is a very efficient kernel for running the attention
operation inside the transformer that came from academic environment. It's a very clever way to
structure the kernel, uh, that do the best calculation. So it doesn't materialize the
attention matrix. Um, and so there's, I think there's still like lots of things to contribute,
but you have to be just more strategic. Do you think neural networks can be made to reason?
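To make that concrete, here is a minimal numpy sketch of the tiling idea described above, for illustration only: stream over blocks of keys and values with a running, numerically stable softmax, so the full attention matrix is never materialized. The function name, shapes, and block size are assumptions for the sketch; the real FlashAttention is a fused CUDA kernel, not Python.

import numpy as np

def tiled_attention(Q, K, V, block=64):
    # Q: (n, d) queries; K, V: (m, d) keys and values.
    n, d = Q.shape
    out = np.zeros((n, d))          # running, unnormalized output
    row_max = np.full(n, -np.inf)   # running max per query row
    row_sum = np.zeros(n)           # running softmax denominator
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        scores = Q @ Kb.T / np.sqrt(d)             # only an (n, block) tile
        new_max = np.maximum(row_max, scores.max(axis=1))
        scale = np.exp(row_max - new_max)          # rescale old accumulators
        p = np.exp(scores - new_max[:, None])
        out = out * scale[:, None] + p @ Vb
        row_sum = row_sum * scale + p.sum(axis=1)
        row_max = new_max
    return out / row_sum[:, None]

The result matches a naive softmax(Q K^T / sqrt(d)) V computed all at once, but peak memory scales with the block size rather than the full sequence length.

Do you think neural networks can be made to reason?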
Uh, yes. Do you think they already reason? Yes. What's your definition of reasoning? Uh,
information processing.
So in the way that humans think through a problem and come up with novel ideas,
it feels like reasoning. Yeah. So the novelty,
I don't want to say, but out-of-distribution ideas, you think it's possible?
Yes. And I think we're seeing that already in the current neural nets. You're able to remix the
training set information into true generalization, in some sense, something that doesn't appear in a
fundamental way in the training set. Like you're doing something interesting algorithmically, you're
manipulating, you know, some symbols and you're coming up with some correct, unique answer in a
new setting. What would, uh, illustrate to you, holy shit, this thing is definitely thinking.
To me, thinking or reasoning is just information processing and generalization. And I think the
neural nets already do that today. So being able to perceive the world or perceive the,
whatever the inputs are and to make predictions based on that or actions based on that, that's,
that's reasoning. Yeah. You're giving correct answers in novel settings by manipulating
information. You've learned the correct algorithm. You're not doing just some kind of a lookup table
or a nearest neighbor search, something like that. Let me ask you about AGI. What are some
moonshot ideas you think might make significant progress towards AGI? Or maybe another way is
what are the big blockers that we're missing now? So basically I am fairly bullish on our ability to
build AGIs, basically automated systems that we can interact with that are very human-like
and we can interact with them in the digital realm or physical realm. Currently, it seems
most of the models that sort of do these sort of magical tasks are in a text realm. Um, I think,
as I mentioned, I'm suspicious that the text realm is not enough to actually build full
understanding of the world. I do actually think you need to go into pixels and understand the
physical world and how it works. So I do think that we need to extend these models to consume
images and videos and train on a lot more data that is multimodal in that way. Do you think you
need to touch the world to understand it also? Well, that's the big open question I would say
in my mind is if you also require the embodiment and the ability to sort of interact
with the world, run experiments and have data of that form, then you need to go to Optimus
or something like that. And so I would say Optimus in some way is like a hedge
on AGI, because it seems to me that it's possible that just having data from the internet is not
enough. If that is the case, then Optimus may lead to AGI, because with Optimus, to me,
there's nothing beyond Optimus. You have this humanoid form factor that can actually
do stuff in the world. You can have millions of them interacting with humans and so on. And, uh,
if that doesn't give rise to AGI at some point, like I'm not sure what will. Um, so from a
completeness perspective, I think that's the, uh, that's a really good platform, but it's a much
harder platform because, uh, you are dealing with atoms and you need to actually like build these
things and integrate them into society. So I think that path takes longer, uh, but it's much
more certain. And then there's a path of the internet and just like training these compression
models, effectively trying to compress all of the internet. And that might also give us
these agents as well. Compress the internet, but also interact with the internet. So it's not
obvious to me. In fact, I suspect you can reach AGI without ever entering the physical world.
Which is a little bit more concerning, because that results in it happening
faster. So it just feels like we're in boiling water. We won't know as it's happening.
I would like to, I'm not afraid of AGI. I'm excited about it. There's always concerns,
but I would like to know when it happens. Yeah. Or at least have hints about when it happens, like
a year from now, it will happen. That kind of thing. I just feel like in the digital realm,
it just might happen. Yeah. I think all we have available to us, because no one has built AGI
yet, so all we have available to us is: is there enough fertile ground on the periphery?
I would say yes. And we have the progress so far, which has been very rapid and, uh, there are next
steps that are available. And so I would say, uh, yeah, it's quite likely that we'll be interacting
with digital entities. How will you know that somebody has built AGI?
I think it's going to be a slow incremental transition. It's going to be product based and
focused. It's going to be GitHub Copilot getting better, and then GPTs helping you write.
And then these oracles that you can go to with mathematical problems. I think we're on
the verge of being able to ask very complex questions in chemistry, physics, math
of these oracles and have them complete solutions. So AGI to you is primarily focused on intelligence.
So consciousness doesn't enter into, uh, into it. So in my mind, consciousness is not a special
thing you will figure out and bolt on. I think it's an emergent phenomenon of a
large enough and complex enough generative model, sort of. So if you have a complex
enough world model, uh, that understands the world, then it also understands its predicament
in the world as being a language model, which to me is a form of consciousness or self-awareness.
And so in order to understand the world deeply, you probably have to integrate yourself into the
world. And in order to interact with humans and other living beings, consciousness is a very
useful tool. I think consciousness is like a modeling insight. Modeling insight. Yeah. It's a,
you have a powerful enough model of understanding the world that you actually understand that you
are an entity in it. Yeah. But there's also this, um, perhaps just the narrative we tell ourselves.
There's a, it feels like something to experience the world, the hard problem of consciousness,
but that could be just a narrative that we tell ourselves. Yeah. I don't think we'll,
yeah, I think it will emerge. I think it's going to be something very boring. Like we'll be talking
to these digital AIs, they will claim they're conscious. They will appear conscious. They will
do all the things that you would expect of other humans. And, uh, it's going to just be a stalemate.
I think there'll be a lot of actual fascinating ethical questions, like Supreme Court level
questions of whether you're allowed to turn off a conscious AI. If you're allowed to build a
conscious AI, maybe there would have to be the same kind of debate that you have around
sorry to bring up a political topic, but you know, abortion, where the deeper question
with abortion is what is life. And the deep question with AI is also what is life and what is
conscious? And I think that'll be very fascinating to bring up. It might become illegal to build
systems that are capable of such a level of intelligence that consciousness would emerge.
And therefore the capacity to suffer would emerge, and a system that says, no,
please don't kill me. Well, that's what the LaMDA chatbot already told
this Google engineer, right? Like it was talking about not wanting to die and so on.
So that might become illegal to do that. Right.
Because otherwise you might have a lot of creatures that don't want to die,
and you can just spawn an infinity of them on a cluster.
And then that might lead to horrible consequences, because then there might be a lot
of people that secretly love murder and then we'll start practicing murder on those systems.
I mean, there's just, I, to me, all of this stuff just brings a beautiful mirror to the human
condition and human nature. We'll get to explore it. And that's, like, the best of the
Supreme Court, of all the different debates we have about ideas of what it means to be human.
We get to ask those deep questions that we've been asking throughout human history.
There's always been the other in human history. Uh, we're the good guys and that's the bad guys.
And we're going to, uh, you know, throughout human history, let's murder the bad guys.
And the same will probably happen with robots. It'll be the other at first. And then we'll get
to ask questions of what does it mean to be alive? What does it mean to be conscious?
Yep. And I think there's some canary in the coal mines, even with what we have today.
And, you know, for example, there's these waifus that you can, like, work with.
And some people, like, this company is going to shut down, but this person really
loves their waifu and is trying to port it somewhere else, and it's not possible.
And I think definitely people will have feelings towards these
systems, because in some sense they are like a mirror of humanity, because they are sort of
like a big average of humanity in the way it's trained. But that average,
we can actually watch. It's nice to be able to interact with the big average of humanity
and do like a search query on it. Yeah. Yeah. It's very fascinating. And, uh, we can of course,
also like shape it. It's not just a pure average. We can mess with the training data. We can mess
with the objective. We can fine tune them in various ways. Uh, so we have some, um, you know,
impact on what those systems look like. If we achieve AGI and you could have
a conversation with her and talk about anything, maybe ask her a question, what
kind of stuff would you ask? I would have some practical questions in my mind,
like, uh, do I or my loved ones really have to die? Uh, what can we do about that?
Do you think it will answer clearly or would it answer poetically?
I would expect it to give solutions. I would expect it to be like, well, I've read all of
these textbooks and I know all these things that you've produced. And it seems to me like,
here are the experiments that I think it would be useful to run next. And here's some gene
therapies that I think would be helpful. And, uh, here are the kinds of experiments that you should
run. Okay. Let's go with this thought experiment. Okay. Imagine that mortality is actually
a prerequisite for happiness. So if we become immortal, we'll actually become deeply
unhappy, and the model is able to know that. So what is it supposed to tell you, stupid human,
about it? Yes, you can become immortal, but you will become deeply unhappy. If the AGI
system is trying to empathize with you, human, what is it supposed to tell you?
Yes, you don't have to die, but you're really not going to like it. Is it going to be
deeply honest? Like in Interstellar, what is it, the AI says humans want like 90% honesty.
Yeah. So you have to pick: how honest do I want to answer these practical questions?
Yeah. I love the AI in Interstellar, by the way. I think it's such a sidekick to the entire story,
but at the same time, it's really interesting. It's kind of limited in certain ways,
right? Yeah, it's limited. And I think that's totally fine by the way. I don't think, uh,
I think it's fine and plausible to have a limited and imperfect AGI.
Is that a feature almost? As an example, it has a fixed amount of compute on its physical
body. And it might just be that even though you can have a super amazing mega brain,
super intelligent AI, you can also have, you know, less intelligent AIs that you can deploy
in a power-efficient way. And then they're not perfect. They might make mistakes.
No, I meant more like say you had infinite compute and it's still good to make mistakes sometimes
to integrate yourself. Like, what is it, going back to Good Will Hunting,
Robin Williams' character says the human imperfections, that's the good stuff, right?
Isn't that the thing? Like, we don't want perfect. We want flaws, in part
to form connections with each other because it feels like something you can attach your feelings
to the flaws. In that same way, you want AI that's flawed? I don't know. I feel like
perfection is, but then you're saying, okay, yeah, but that's not AGI. But see, AGI would need to be
intelligent enough to give answers to humans that humans don't understand. And I think perfect is
something humans can't understand, because even science doesn't give perfect answers. There's
always gaps and mysteries, and I don't know. I don't know if humans want perfect.
Yeah. I could imagine just, um, having a conversation with this kind of oracle entity
as you'd imagine them. And, uh, yeah, maybe it can tell you about, you know, based on my analysis of
the human condition, you might not want this, and here are some of the things that might.
But every dumb human will say, yeah, yeah, yeah. Trust me. Give me the truth. I
can handle it. But that's the beauty. Like, people can choose. So, but then
it's the old marshmallow test with the kids and so on. I feel like too many people,
like can't handle the truth, probably including myself, like the deep truth of the human
condition. I don't know if I can handle it. Like, what if there's some dark stuff?
What if we are an alien science experiment and it realizes that? I mean,
this is the Matrix, you know, all over again.
I don't know. I would, what would I talk about? I don't even, yeah, I, uh, probably I will go
with the safer scientific questions at first that have nothing to do with my own personal life and
mortality, just like about physics and so on, uh, to, to build up, like, let's see where it's at,
or maybe see if it has a sense of humor. That's another question. Would it be able to?
Presumably, if it understands humans deeply, it would be able to generate humor.
Yeah. I think that's actually a wonderful benchmark almost. Like,
is it able, I think that's a really good point basically, to make you laugh. Yeah. If it's able
to be like a very effective standup comedian, that is doing something very interesting computationally.
I think being funny is extremely hard. Yeah. Because it's hard in a way, like a Turing test.
The original intent of the Turing test is hard because you have to convince humans, and there's
nothing, that's why comedians talk about this. Like, this is deeply honest,
because people can't help but laugh, and if they don't laugh, that means you're not funny.
If they laugh, it's funny. And you're showing, you need a lot of knowledge to create
humor about the human condition and so on. And then you need to be clever with it.
You mentioned a few movies. You tweeted movies that I've seen five plus times, but
I'm ready and willing to keep watching: Interstellar, Gladiator, Contact, Good Will Hunting,
The Matrix, Lord of the Rings (all three), Avatar, Fifth Element, and so on. It goes on: Terminator 2,
Mean Girls. I'm not going to ask about that. I think Mean Girls is great. What are some
that jump out to your memory that you love, and why? You mentioned The Matrix.
As a computer person, why do you love The Matrix? There are so many properties that make it
beautiful and interesting. There's all these philosophical questions, but then there were
also AGIs, and there's a simulation, and it's cool. And there's, you know,
the look of it, the feel of it, the action, the bullet time. It was just
innovating in so many ways. And then Good Will Hunting. Why do you like that one?
Yeah, I just, I really like this, uh, tortured genius sort of character who's like grappling
with whether or not he has like any responsibility or like what to do with this gift that he was
given or like how to think about the whole thing. And, uh, there's also a dance between the genius
and the, the personal, like what it means to love another human being. And there's a lot of things
there. It's just a beautiful movie. And then the fatherly figure, the mentor, in the
psychiatrist, and it really messes with you. You know, there's some movies
that just like really mess with you, uh, on a deep level. Do you relate to that movie at all?
No, it's not your fault. As I said, Lord of the Rings, that's self-explanatory. Terminator two,
which is interesting. You rewatch that a lot. Is that better than Terminator one? You like,
I do like Terminator one as well. Uh, I like Terminator two a little bit more,
but in terms of like its surface properties,
do you think Skynet is at all a possibility? Uh, yes.
Like the actual sort of autonomous weapon system kind of thing. Do you worry about that
stuff? I do worry about AI being used in war. I a hundred percent worry about it.
I mean, you know, some of these fears of AGI and how this will play out, I mean,
these will be like very powerful entities probably at some point. And so, um, for a long time,
there are going to be tools in the hands of humans. You know, people talk about
alignment of AGI and how to make it. The problem is, even humans are not aligned. So
how this will be used and what this is going to look like is, yeah, it's troubling.
Do you think it'll happen slowly enough that we'll be able to,
as a human civilization, think through the problems?
Yes. That's my hope is that it happens slowly enough and in open enough way where a lot of
people can see and participate in it. We'll just figure out how to deal with this transition. I think
it's just going to be interesting. I draw a lot of inspiration from nuclear weapons,
because I sure thought it would be fucked once they developed nuclear weapons.
But it's almost like, when the systems are not so dangerous that
they destroy human civilization, we deploy them and learn the lessons. And then,
if it's too dangerous, we might still deploy it, but you very quickly
learn not to use them. And so there'll be this balance achieved. Humans are very clever as
a species. It's interesting. We exploit the resources as much as we can, but we don't,
we avoid destroying ourselves. It seems like. Well, I don't know about that actually. I hope
it continues. Um, I mean, I'm definitely like concerned about nuclear weapons and so on,
not just as a result of the recent conflict, even before that, uh, that's probably my number
one concern for humanity. So if humanity, uh, destroys itself or destroys, you know, 90%
of people that would be because of nukes. I think so. Um, and it's not even about the full
destruction to me. It's bad enough if we reset society, that would be like terrible. It would
be really bad. And I can't believe we're like so close to it. Yeah. It's like so crazy to me.
It feels like we might be a few tweets away from something like that. Yep. Basically it's extremely
unnerving, but it has been for me for a long time. It seems unstable that world leaders
just having a bad mood can take one step towards a bad direction, and it escalates.
Yeah. And because of a collection of bad moods, it can escalate without being able to stop.
Yeah, it's just, uh, it's a huge amount of, uh, power. And then also with the proliferation,
I basically don't actually know what the good outcomes are
here. So I'm definitely worried about that a lot. And then AGI is not currently there,
but I think at some point will more and more become something like it. The danger with AGI
even is that I think it's even like slightly worse in the sense that, uh, there are good outcomes of
AGI, and then the bad outcomes are like an epsilon away, like a tiny one away. And so I think
capitalism and humanity and so on will drive for the positive ways of using that technology.
But then if bad outcomes are just a tiny, like, flip-of-a-minus-sign away, that's a really
bad position to be in. A tiny perturbation of the system results in the destruction of the human
species. So it's a fine line to walk. Yeah. I think in general, what's really weird about the
dynamics of humanity and this explosion we've talked about is just like the insane coupling
afforded by technology and, uh, just the instability of the whole dynamical system.
I think it's just, it doesn't look good, honestly. Yes. That explosion could be destructive and
constructive and the probabilities are non-zero in both. Yeah. I mean, I have to, I do feel like I
have to try to be optimistic and so on. And I think even in this case, I still am predominantly
optimistic, but there's definitely. Me too. Uh, do you think we'll become a multi-planetary species?
Probably yes, but I don't know if it's the dominant feature of future humanity. There might
be some people on some planets and so on, but I'm not sure if it's like a major
player in our culture and so on. We still have to solve the drivers of self-destruction here on Earth.
So just having a backup on Mars is not going to solve the problem. So by the way, I love the backup
on Mars. I think that's amazing. You should absolutely do that. Yes. And I'm so thankful.
Would you go to Mars? Personally, no. I do like Earth quite a lot. Okay. I'll go to Mars.
I'll go for you. I'll tweet at you from there. Maybe eventually I would once it's safe enough,
but I don't actually know if it's on my lifetime scale unless I can extend it by a lot.
I do think that, for example, a lot of people might disappear into, um, virtual realities and
stuff like that. And I think that could be the major thrust of, um, sort of the cultural
development of humanity, if it survives. So it might not be; it's just really hard to work in
the physical realm and go out there. And I think ultimately all your experiences are in your
brain. And so it's much easier to disappear into digital realm. And I think people will
find them more compelling, easier, safer, more interesting. So you're a little bit captivated
by virtual reality, by the possible worlds, whether it's the metaverse or some other
manifestation of that. Yeah. Yeah. It's really interesting. I'm interested,
just talking a lot to Carmack, where's the thing that's currently preventing that?
Yeah. I mean, to be clear, I think what's interesting about the future is,
I kind of feel like the variance in the human condition grows. That's the primary thing that's
changing. It's not as much the mean of the distribution, it's the variance of it. So
there will probably be people on Mars, and there will be people in VR, and there will be people here
on earth. It's just like, there will be so many more ways of being. And so I kind of feel like
I see it as like a spreading out of a human experience. There's something about the internet
that allows you to discover those little groups. You gravitate to them, something about your
biology likes that kind of world, and you find each other. Yeah. And we'll have transhumanists
and then we'll have the Amish and they're going to, everything is just going to coexist.
You know, the cool thing about it, because I've interacted with a bunch of internet communities,
is they don't know about each other. Like, you can have a very happy existence
just having a very close-knit community and not knowing about each other. I mean,
you even sense this just having traveled to Ukraine; they don't know
so many things about America. When you travel across the world,
I think you experience this too. There are certain cultures that
have their own thing going on. And so you can see that happening more and more
and more and more in the future. We have little communities. Yeah. Yeah. I think so. That seems
to be how it's going right now. And I don't see that trend really
reversing. I think people are diverse and they're able to choose their own path and existence.
And I sort of celebrate that. Will you spend much time in the metaverse,
in virtual reality? Which community are you? Are you the physicalist,
the physical reality enjoyer, or do you see yourself drawing a lot of pleasure and fulfillment
in the digital world? Yeah, I think, well, currently the virtual reality is not that compelling.
I do think it can improve a lot, but I don't really know to what extent maybe, you know,
there's actually like even more exotic things you can think about with like neural links or
stuff like that. So, um, currently I kind of see myself as mostly a team human person. I love
nature. I love harmony. I love people. I love humanity. I love emotions of humanity. Um, and
I just want to be in this solarpunk little utopia. That's my happy place. Yes. My happy
place is people I love, thinking about cool problems, surrounded by lush, beautiful,
dynamic nature, and secretly high tech in places that count. Places that use technology to empower
that love for other humans and nature. Yeah. I think technology used very sparingly.
I don't love when it sort of gets in the way of humanity in many ways. Uh, I like just people
being humans in a way we sort of like slightly evolved and prefer, I think just by default.
People kept asking me because they, they know you love reading. Are there particular books
that you enjoyed that had an impact on you for silly or for profound reasons that you would
recommend? You mentioned The Vital Question. Many, of course. I think in biology, as an example,
The Vital Question is a good one. Anything by Nick Lane, really. Life Ascending, I would say,
is a bit more potentially representative, as a summary of a lot of the things he's
written about. I was very impacted by The Selfish Gene. I thought that was a really good book that helped
me understand altruism as an example and where it comes from. And just realizing that, you know,
the selection is on the level of genes was a huge insight for me at the time. And it sort of like
cleared up a lot of things for me. What do you think about the idea that ideas are the
organisms, the memes? Yes, love it. A hundred percent. Are you able to walk around with that
notion for a while that, that there is an evolutionary kind of process with ideas as well?
There absolutely is. There's memes just like genes and they compete and they live in our brains.
It's beautiful. Are we silly humans thinking that we're the organisms? Is it possible that the
primary organisms are the ideas? Yeah, I would say the ideas kind of live in the software
of our civilization, in the minds and so on. We think as humans that the hardware is
the fundamental thing. A human is a hardware entity, but it could be the software, right?
Yeah. Yeah. I would say like there needs to be some grounding at some point to like a physical
reality. Yeah. But if we clone an Andrej, the software is the thing, like it's this thing that
makes that thing special, right? Yeah, I guess you're right. But then cloning might be exceptionally
difficult. Like there might be a deep integration between the software and the hardware in ways we
don't quite understand. Well, from the ultimate point of view, like what makes me special is more
like the gang of genes that are riding in my chromosomes, I suppose, right? Like
they're the replicating unit, I suppose. No, but that's just the thing that makes you special.
Sure. Well, the reality is what makes you special is your ability to survive
based on the software that runs on the hardware that was built by the genes.
So the software is the thing that makes you survive, not the hardware.
All right. It's a little bit of both. I mean, you know, it's just like a second layer. It's
a new second layer that hasn't been there before the brain. They both, they both coexist.
But there's also layers of the software. I mean, it's an abstraction on top
of abstractions. But, okay. So The Selfish Gene and Nick Lane. I would say sometimes books are
not sufficient. I like to reach for textbooks sometimes. I kind of feel like books are for
too much of a general consumption sometimes. They're too high up in
the level of abstraction, and it's not good enough. So I like textbooks. I like The Cell. I think
The Cell was pretty cool. That's why also I like the writing of Nick Lane is because he's pretty
willing to step one level down, and yeah, he's willing to go there.
But he's also willing to sort of be throughout the stack. So he'll go down to a lot of detail,
but then he will come back up. And I think he has a, yeah, basically I really appreciate that.
That's why I love college, early college, even high school: just textbooks on the basics
of computer science, of mathematics, of biology, of chemistry. They condense it down
so it's sufficiently general that you can understand both the philosophy and the details, but also
you get homework problems and you get to play with it as much as you would if you were in
programming stuff. Yeah. And then I'm also suspicious of textbooks, honestly, because
as an example in deep learning, there's no like amazing textbooks and I feel this changing very
quickly. I imagine the same is true in say synthetic biology and so on. These books like The Cell are
kind of outdated. They're still high level. Like what is the actual real source of truth? It's
people in wet labs working with cells, sequencing genomes and yeah, actually working with it. And
I don't have that much exposure to that or what that looks like. So I still don't fully,
I'm reading through The Cell and it's kind of interesting and I'm learning, but it's still not
sufficient I would say in terms of understanding. Well, it's a clean summarization of the mainstream
narrative, but you have to learn that before you break out towards the cutting edge. Yeah. But what
is the actual process of working with these cells and growing them and incubating them? And it's
kind of like massive cooking recipes of making sure your cells live and proliferate, and then
you're sequencing them, running experiments and just how that works, I think is kind of like the
source of truth of at the end of the day, what's really useful in terms of creating therapies and
so on. Yeah. I wonder what the AI textbooks of the future will be, because there's Artificial
Intelligence: A Modern Approach. I actually haven't read the recent version, if it's come out;
there's been a recent edition. I also saw there's a science of deep learning book. I'm waiting for
textbooks that are worth recommending, worth reading. It's tricky because it's like papers
and code, code, code. Honestly, I find papers are quite good. I especially like the appendix of any
paper as well. It's like the most detail you can have. It doesn't have to be cohesive or connected
to anything else. You just describe, in a very specific way, how you saw the particular thing. Yeah.
Many times papers can be actually quite readable, not always, but sometimes the introduction and
the abstract is readable even for someone outside of the field. This is not always true. Sometimes
I think, unfortunately, scientists use complex terms even when it's not necessary. I think that's
harmful. I think there's no reason for that. Papers sometimes are longer than they need to be in the
parts that don't matter. Appendix should be long, but then the paper itself, look at Einstein,
make it simple. Yeah, but certainly I've come across papers I would say in synthetic biology
or something that I thought were quite readable for the abstract and the introduction. Then you're
reading the rest of it and you don't fully understand, but you are getting a gist and I
think it's cool. What advice, you give advice to folks interested in machine learning and research,
but in general, life advice to a young person in high school, early college about how to have a
career they can be proud of or a life they can be proud of? Yeah, I think I'm very hesitant to give
general advice. I think it's really hard. I've mentioned some of the stuff I've mentioned is
fairly general, I think. Focus on just the amount of work you're spending on a thing.
Compare yourself only to yourself, not to others. That's good. I think those are fairly general.
How do you pick the thing? You just have a deep interest in something or try to find the argmax
over the things that you're interested in. Argmax at that moment and stick with it. How do you not
get distracted and switch to another thing? You can, if you like.
If you do an argmax repeatedly every week, every month, it's a problem.
Yeah, you can low pass filter yourself in terms of what has consistently been true for you.
I definitely see how it can be hard, but I would say you're going to work the hardest on the thing
that you care about the most. Low pass filter yourself and really introspect in your past,
what are the things that gave you energy and what are the things that took energy away from you?
Concrete examples. Usually from those concrete examples, sometimes patterns can emerge.
I like it when things look like this when I'm in these positions.
That's not necessarily the field, but the kind of stuff you're doing in a particular field. For you,
it seems like you were energized by implementing stuff, building actual things.
Yeah, being low level, learning, and then also communicating so that others can go through the
same realizations and shortening that gap. Because I usually have to do way too much work
to understand a thing. Then I'm like, okay, this is actually like, okay, I think I get it.
Why was it so much work? It should have been much less work. That gives me a lot of frustration,
and that's why I sometimes go teach. Aside from the teaching you're doing now,
putting out videos, aside from a potential Godfather Part II with the AGI at Tesla and beyond,
what does the future of Andrej Karpathy hold? Have you figured that out yet or no?
As you see through the fog of war, that is all of our future. Do you start seeing silhouettes of
what that possible future could look like? The consistent thing I've been always interested
in for me at least is AI. That's probably what I'm spending the rest of my life on,
because I just care about it a lot. I actually care about many other problems as well, like say
aging, which I basically view as disease. I care about that as well, but I don't think it's a good
idea to go after it specifically. I don't actually think that humans will be able to come up with the
answer. I think the correct thing to do is to ignore those problems and you solve AI and then
use that to solve everything else. I think there's a chance that this will work. I think it's a very
high chance. That's the way I'm betting at least. When you think about AI, are you interested in
all kinds of applications, all kinds of domains, and any domain you focus on will allow you to get
insights to the big problem of AGI? Yeah, for me, it's the ultimate meta problem. I don't want to
work on any one specific problem. There's too many problems. How can you work on all problems
simultaneously? You solve the meta problem, which to me is just intelligence, and how do you
automate it? Are there cool small projects, like Arxiv Sanity and so on, that you're thinking
about, that the world, the ML world, can anticipate? There's always some fun side projects.
Arxiv Sanity is one. Basically, there's way too many arxiv papers. How can I organize them
and recommend papers and so on? I transcribed all of your podcasts. What did you learn from that
experience, from the process of transcribing? You like consuming audiobooks and podcasts and so on.
Here's a process that achieves closer to human-level performance in annotation.
Yeah. Well, I definitely was surprised that transcription with OpenAI's Whisper was
working so well compared to what I'm familiar with from Siri and a few other systems, I guess.
It works so well. That's what gave me some energy to try it out. I thought it could be fun to run
on podcasts. It's not obvious to me why Whisper is so much better compared to anything else,
because I feel like there should be a lot of incentive for a lot of companies to produce
transcription systems and that they've done so over a long time. Whisper is not a super exotic
model. It's a transformer. It takes mel spectrograms and just outputs tokens of text. It's
not crazy. The model and everything has been around for a long time. I'm not actually 100%
sure why this one came out so good. Yeah, it's not obvious to me either. It makes me feel like I'm missing
something. I'm missing something. Yeah, because there is a huge incentive; even Google has YouTube
transcription. Yeah. Yeah, it's unclear, but some of it is also integrating into a bigger system.
That is the user interface, how it's deployed and all that kind of stuff. Maybe
running it as an independent thing is much easier, like an order of magnitude easier than deploying
into a large integrated system like YouTube transcription or anything like meetings. Zoom
has transcription that's kind of crappy, but creating an interface where it detects the
different individual speakers, it's able to display it in compelling ways, run it in real time,
all that kind of stuff. Maybe that's difficult. That's the only explanation I have because
I'm currently paying quite a bit for human transcription and human captions annotation.
It seems like there's a huge incentive to automate that. Yeah. It's very confusing.
I think, I mean, I don't know if you looked at some of the Whisper transcripts, but they're
quite good. They're good. Especially in tricky cases. I've seen
Whisper's performance on super tricky cases and it does incredibly well. I don't know. A podcast
is pretty simple. It's like high quality audio and you're speaking usually pretty clearly.
So I don't know. I don't know what OpenAI's plans are either.
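As a concrete aside, here's a hedged sketch of the pipeline being described, using the open-source whisper package (pip install openai-whisper): it loads a checkpoint, converts the audio to mel spectrograms internally, and decodes text tokens. The file name is a made-up example.

import whisper

# Checkpoints range from "tiny" to "large"; bigger is slower but more accurate.
model = whisper.load_model("base")
# transcribe() chunks the audio and decodes text internally.
result = model.transcribe("episode.mp3")
print(result["text"])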
Yeah. There's always like fun projects basically. Stable Diffusion also is opening up a huge amount
of experimentation, I would say in the visual realm and generating images and videos and movies.
Videos now. That's going to be pretty crazy. That's going to almost certainly work and is
going to be really interesting when the cost of content creation is going to fall to zero.
You used to need a painter for a few months to paint a thing and now it's going to be speak to
your phone to get your video. Hollywood will start using that to generate scenes,
which completely opens up. Yeah. So you can make a movie like Avatar eventually for under a million
dollars. Much less. Maybe just by talking to your phone. I mean, I know it sounds kind of crazy.
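For a sense of how low the barrier already is, here's a hedged sketch of text-to-image generation with Stable Diffusion via the Hugging Face diffusers library (pip install diffusers transformers torch). The checkpoint id is the public stable-diffusion-v1-5 release, and the prompt is a made-up example.

import torch
from diffusers import StableDiffusionPipeline

# Load the pretrained pipeline in half precision and move it to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# One line of text in, one image out.
image = pipe("a film still of a spaceship over a desert canyon").images[0]
image.save("scene.png")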
And then there'd be some voting mechanism. Would there be a show on Netflix that's
generated completely automatically? Yeah, potentially. Yeah. And what does it look
like also when you can just generate it on demand and there's infinity of it?
Yeah. Oh man. All the synthetic art. I mean, it's humbling because we treat ourselves as special
for being able to generate art and ideas and all that kind of stuff. If that can be done in an
automated way by AI. Yeah. I think it's fascinating to me how the predictions of AI, what
it's going to look like and what it's going to be capable of, are completely inverted and wrong.
The sci-fi of the 50s and 60s was just totally not right. They imagined AI as like super
calculating theorem provers, and we're getting things that can talk to you about emotions.
They can do art. It's just like weird. Are you excited about that future? Just
AIs, like hybrid systems, heterogeneous systems of humans and AIs talking about emotions,
Netflix and chill with an AI system, where the Netflix thing you watch is also generated by AI.
I think it's going to be interesting for sure. And I think I'm cautiously optimistic, but it's
not obvious. Well, the sad thing is your brain and mine developed in a time before Twitter,
before the internet. So I wonder whether people that are born inside of it might have a different
experience. Like maybe you and I will still resist it, and the people born now will not.
Well, I do feel like humans are extremely malleable. Yeah. And you're probably right.
What is the meaning of life, Andrej? We talked about sort of the universe having a conversation
with us humans or with the systems we create to try to answer for the universe,
for the creator of the universe to notice us. We're trying to create systems that are loud enough
to answer back. I don't know if that's the meaning of life. That's like meaning of life for some
people. The first level answer I would say is anyone can choose their own meaning of life
because we are conscious entities, and it's beautiful. Number one. But I do think that
a deeper meaning of life, if someone is interested, is along the lines of:
what the hell is all this, and why? And if you look into fundamental physics
and the quantum field theory and the Standard Model, they're very complicated. And
there's these 19 free parameters of our universe, and what's going on with all
this stuff and why is it here? And can I hack it? Can I work with it? Is there a message for me?
Am I supposed to create a message? And so I think there are some fundamental answers there,
but I think you can't actually really make a dent in those
without more time. And so to me there's also a big question around just getting more time, honestly.
Yeah, that's kind of like what I think about quite a bit as well.
So kind of the ultimate, or at least first way to sneak up to the why question is to try to escape
the system, the universe. And then for that, you sort of backtrack and say, okay,
that's going to take a very long time. So the why question boils down, from an engineering
perspective, to how do we extend it? Yeah, I think that's question number one, practically
speaking, because you're not going to calculate the answer to the deeper questions
in the time you have. And that could be extending your own lifetime or extending just the lifetime of
human civilization, of whoever wants to. Many people might not want that. But I think people
who do want that, I think it's probably possible. And I don't know that
people fully realize this, I kind of feel like people think of death as an inevitability. But
at the end of the day, this is a physical system, some things go wrong. It makes sense why
things like this happen, evolutionary speaking. And there's most certainly interventions that
mitigate it. That'd be interesting if death is eventually looked at as a fascinating thing
that used to happen to humans. I don't think it's unlikely. I think it's likely.
And it's up to our imagination to try to predict what the world without death looks like.
Yeah, it's hard to, I think the values will completely change.
Could be. I don't really buy all these ideas that, oh, without death, there's no meaning,
there's nothing. I don't intuitively buy all those arguments. I think there's plenty of meaning,
plenty of things to learn. They're interesting, exciting, I want to know, I want to calculate,
I want to improve the condition of all the humans and organisms that are alive.
Yeah, the way we find meaning might change. There are a lot of humans, probably including
myself, that find meaning in the finiteness of things. But that doesn't mean that's the
only source of meaning. Yeah. I do think many people will go with that, which I think
is great. I love the idea that people can just choose their own adventure. Like, you are
born as a conscious free entity by default, I'd like to think. And you have your unalienable
rights for life, liberty, and the pursuit of happiness. I don't know if you have that in nature,
in the landscape of happiness. You can choose your own adventure, mostly. And that's not,
it's not fully true. But I still am pretty sure I'm an NPC, but an NPC can't know it's an NPC.
Hmm. There could be different degrees and levels of consciousness. I don't think there's a more
beautiful way to end it. Andrej, you're an incredible person. I'm really honored you
would talk with me. Everything you've done for the machine learning world, for the AI world,
to just inspire people, to educate millions of people has been, it's been great. And I can't
wait to see what you do next. It's been an honor, man. Thank you so much for talking today.
Awesome. Thank you. Thanks for listening to this conversation
with Andrej Karpathy. To support this podcast, please check out our sponsors in the description.
And now let me leave you with some words from Samuel Karlin: The purpose of models is not to
fit the data, but to sharpen the questions. Thanks for listening and hope to see you next time.