Deep RL Course documentation

What is Reinforcement Learning?

To understand Reinforcement Learning, let’s start with the big picture.

The big picture

The idea behind Reinforcement Learning is that an agent (an AI) will learn from the environment by interacting with it (through trial and error) and receiving rewards (negative or positive) as feedback for performing actions.

Learning from interactions with the environment mirrors the way we learn naturally.

For instance, imagine putting your little brother in front of a video game he has never played, giving him a controller, and leaving him alone.

Illustration_1

Your brother will interact with the environment (the video game) by pressing the right button (action). He gets a coin: that’s a +1 reward. Because the reward is positive, he now understands that in this game he must collect the coins.

Illustration_2

But then he presses the right button again and touches an enemy. He dies, so that’s a -1 reward.

Illustration_3

By interacting with his environment through trial and error, your little brother learns that in this environment he needs to collect coins and avoid enemies.

Without any supervision, the child will get better and better at playing the game.

That’s how humans and animals learn: through interaction. Reinforcement Learning is just a computational approach to learning from actions.
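The coin-and-enemy story above can be sketched in a few lines of Python. This is only an illustrative toy, not an algorithm taken from the course: the action names, the fixed +1/-1 rewards, the running-average estimate, and the small exploration rate `epsilon` are all assumptions made for the example.

```python
import random

# Hypothetical toy game (an assumption for illustration): each action
# always leads to the same outcome, a coin (+1) or an enemy (-1).
REWARDS = {"move_to_coin": +1, "move_to_enemy": -1}


def play(episodes=200, epsilon=0.1, seed=0):
    """Learn by trial and error which action tends to be rewarded."""
    rng = random.Random(seed)
    actions = list(REWARDS)
    value = {a: 0.0 for a in actions}  # average reward seen so far per action
    count = {a: 0 for a in actions}    # how many times each action was tried

    for _ in range(episodes):
        if rng.random() < epsilon:
            # Occasionally try a random action (the "trial" part).
            action = rng.choice(actions)
        else:
            # Otherwise repeat the action that has looked best so far.
            action = max(actions, key=lambda a: value[a])

        reward = REWARDS[action]  # feedback from the environment

        # Update the running average reward for the chosen action.
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]

    return value


value = play()
best_action = max(value, key=value.get)
print(value, best_action)
```

Nobody tells the agent that coins are good and enemies are bad. Like the little brother, it only sees the rewards that follow its own actions, and its preference for the coin emerges from that feedback.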

A formal definition

We can now make a formal definition:

Reinforcement learning is a framework for solving control tasks (also called decision problems) by building agents that learn from the environment by interacting with it through trial and error and receiving rewards (positive or negative) as their only feedback.
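To make the definition concrete, here is a minimal sketch of the interaction loop it describes: the agent picks an action, and the environment answers with a reward. Everything in it is hypothetical and chosen only for illustration, including the `CoinCorridor` class, its `reset`/`step` methods, and the two hand-written policies. The sketch shows the interaction only; no learning takes place.

```python
import random


class CoinCorridor:
    """Hypothetical environment: a corridor with an enemy at the left end
    (position 0) and a coin at the right end (position length - 1)."""

    def __init__(self, length=5):
        self.length = length
        self.position = length // 2

    def reset(self):
        # Start every episode in the middle of the corridor.
        self.position = self.length // 2
        return self.position

    def step(self, action):
        # action is -1 (move left) or +1 (move right).
        self.position += action
        if self.position == self.length - 1:
            return self.position, +1, True   # reached the coin
        if self.position == 0:
            return self.position, -1, True   # touched the enemy
        return self.position, 0, False       # nothing happened yet


def run_episode(env, policy, max_steps=100):
    """Let the agent (the policy) interact with the environment once."""
    observation = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = policy(observation)                  # agent acts
        observation, reward, done = env.step(action)  # environment responds
        total_reward += reward                        # reward is the only feedback
        if done:
            break
    return total_reward


rng = random.Random(0)


def random_policy(observation):
    return rng.choice([-1, +1])


def always_right(observation):
    return +1


print(run_episode(CoinCorridor(), always_right))
print(run_episode(CoinCorridor(), random_policy))
```

A learning agent would replace the fixed policies with one that changes its behavior based on the rewards it has collected.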

But how does Reinforcement Learning work?
