Vanilla DQN – Breakout

This model is a vanilla DQN agent trained on ALE/Breakout-v5 using vectorised environments.

Environment

Property	Value
Environment	ALE/Breakout-v5
State space	4 × 84 × 84 stacked grayscale frames
Action space	4 discrete (NOOP, FIRE, RIGHT, LEFT)
Frameskip	4
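To illustrate the 4 × 84 × 84 state described in the table, here is a minimal sketch of frame stacking. The `FrameStack` class and its method names are hypothetical and not taken from this repository; it assumes each frame has already been converted to an 84 × 84 grayscale array.

```python
from collections import deque

import numpy as np


class FrameStack:
    """Keeps the k most recent 84 x 84 grayscale frames as one (k, 84, 84) array."""

    def __init__(self, k: int = 4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame: np.ndarray) -> np.ndarray:
        # At episode start, fill the stack with copies of the first frame.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return self.state()

    def push(self, frame: np.ndarray) -> np.ndarray:
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return self.state()

    def state(self) -> np.ndarray:
        # Oldest frame first, newest frame last.
        return np.stack(list(self.frames), axis=0)


stack = FrameStack()
state = stack.reset(np.zeros((84, 84), dtype=np.uint8))
state = stack.push(np.full((84, 84), 255, dtype=np.uint8))
print(state.shape)  # (4, 84, 84)
```

Stacking consecutive frames lets the network infer the ball's direction and speed, which a single frame cannot show.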

The target network both selects the greedy next action and estimates its value (there is no Double DQN style decoupling):

next_q   = q_target(next_state).max(dim=1).values   # target network picks AND evaluates the action
target_q = reward + gamma * (1 - done) * next_q     # done = 1 removes the bootstrap term at episode end
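The target computation above can be sketched as a small runnable function. This is an illustration under assumptions rather than the training code from this repository: the function name `vanilla_dqn_target`, the default `gamma=0.99`, and the linear stand-in for the target network are all hypothetical.

```python
import torch


def vanilla_dqn_target(q_target, reward, next_state, done, gamma=0.99):
    """One-step TD target where the target network both picks and scores the next action."""
    with torch.no_grad():  # no gradients flow through the target network
        next_q = q_target(next_state).max(dim=1).values  # shape: (batch,)
    return reward + gamma * (1.0 - done) * next_q


# Hypothetical stand-in for the target network: 3-dim states, 4 actions.
# Weights are zeroed so every state has Q-values [0, 1, 2, 3].
q_target = torch.nn.Linear(3, 4)
with torch.no_grad():
    q_target.weight.zero_()
    q_target.bias.copy_(torch.tensor([0.0, 1.0, 2.0, 3.0]))

reward = torch.tensor([1.0, 1.0])
done = torch.tensor([0.0, 1.0])  # second transition is terminal
next_state = torch.zeros(2, 3)

target_q = vanilla_dqn_target(q_target, reward, next_state, done, gamma=0.5)
print(target_q)  # tensor([2.5000, 1.0000])
```

The first transition bootstraps from the best next Q-value (1 + 0.5 × 3 = 2.5), while the terminal transition keeps only its reward.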

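import torch

# DQN is the network class used for training; define or import it before this point.
model = DQN()
checkpoint = torch.load('best_breakout.pt', map_location='cpu')  # load onto CPU regardless of training device
model.load_state_dict(checkpoint['model'])
model.eval()  # switch to inference mode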
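Once a model is loaded, a greedy action can be picked from its Q-values. The sketch below uses a hypothetical stand-in network, `TinyQNet`, because the `DQN` class is not shown here; its architecture and the scaling of pixels to [0, 1] are assumptions, not the author's implementation.

```python
import torch
from torch import nn


class TinyQNet(nn.Module):
    """Hypothetical stand-in for the DQN class: maps (batch, 4, 84, 84) to 4 Q-values."""

    def __init__(self, n_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(4 * 84 * 84, n_actions))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


ACTIONS = ["NOOP", "FIRE", "RIGHT", "LEFT"]  # action order from the table above

model = TinyQNet()
model.eval()

# One stacked observation as uint8 frames (random here, for illustration only).
obs = torch.randint(0, 256, (1, 4, 84, 84), dtype=torch.uint8)
with torch.no_grad():
    q_values = model(obs.float() / 255.0)  # shape: (1, 4)

action = int(q_values.argmax(dim=1).item())  # index of the highest Q-value
print(ACTIONS[action])
```

With the real checkpoint, `TinyQNet()` would be replaced by the loaded `DQN` instance, and the preprocessing would need to match whatever was used during training.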

