Optimistic exploration - Alex's notes

Optimistic exploration

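Optimistic exploration is a way of choosing actions in Q-learning. We set every initial value of our estimate $\hat{Q}$ to be very high, ideally above any return the agent could actually achieve, and we always pick the action that maximises $\hat{Q}$. Each update then pulls the estimate for a tried action down towards its true value, while untried actions keep their high initial estimates. So under uncertainty the greedy choice is an action the agent does not yet know about, and it explores without any explicit randomness.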
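To make this concrete, here is a minimal sketch of tabular Q-learning with optimistic initial values. Everything in it is an assumption made for illustration rather than part of these notes: the toy chain environment `chain_step`, the function name `q_learning_optimistic`, and the parameter values (`q_init=10.0`, `alpha=0.5`, `gamma=0.9`). The only requirement it relies on is that `q_init` is larger than any achievable return, which is at most 1 in this environment.

```python
def chain_step(s, a, n_states=4):
    """Hypothetical toy environment: a chain of states 0..n_states-1.

    Action 0 moves left (staying put at state 0), action 1 moves right.
    Reaching the last state gives reward 1 and ends the episode.
    """
    s_next = max(s - 1, 0) if a == 0 else s + 1
    done = s_next == n_states - 1
    reward = 1.0 if done else 0.0
    return s_next, reward, done


def q_learning_optimistic(step, n_states, n_actions, q_init=10.0,
                          alpha=0.5, gamma=0.9, episodes=50, max_steps=20):
    # Optimistic initialisation: every estimate starts well above
    # any return the agent can really obtain.
    q = [[q_init] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # Purely greedy choice; ties go to the lowest action index.
            a = max(range(n_actions), key=lambda i: q[s][i])
            s_next, reward, done = step(s, a)

            # Standard Q-learning target; no bootstrapping from a terminal state.
            target = reward if done else reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            visits[s][a] += 1

            if done:
                break
            s = s_next
    return q, visits


q, visits = q_learning_optimistic(chain_step, n_states=4, n_actions=2)
```

In this toy chain every non-terminal state-action pair is tried at least once even though the policy is purely greedy, because each tried action has its estimate lowered below the untouched `q_init` of its untried alternative. The row for the terminal state is never updated or read, so it simply keeps its initial value.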
