Bellman equation

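The Bellman equation is used to determine the optimal value function for a given Markov decision process with state set $S$, actions $A_s$ available in state $s$, immediate reward $R(s,a)$, transition probability $T(s,a,s')$ of reaching state $s'$ after taking action $a$ in state $s$, and discount factor $\gamma$. It defines this value function recursively as follows: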

$$ V(s) = \max_{a \in A_s} \left( R(s,a) + \gamma \sum_{s' \in S} T(s,a,s') V(s') \right) $$
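
Because $V$ appears on both sides, one common way to solve the equation numerically is value iteration: start from an arbitrary guess and apply the right-hand side repeatedly until the values stop changing. The following is a minimal sketch of that idea, not a definitive implementation. The function name `value_iteration`, the nested-list layout of `R` and `T`, the stopping tolerance, and the two-state example MDP are all assumptions made for illustration; they are not part of the definition above.

```python
def value_iteration(R, T, gamma, tol=1e-10, max_iter=10_000):
    """Approximate the fixed point of the Bellman equation.

    Assumed (hypothetical) data layout:
      R[s][a]     -- immediate reward R(s, a)
      T[s][a][s2] -- transition probability T(s, a, s')
    len(R[s]) is the number of actions available in state s (the set A_s).
    """
    n_states = len(R)
    V = [0.0] * n_states  # arbitrary initial guess
    for _ in range(max_iter):
        # Apply the right-hand side of the Bellman equation to every state.
        V_new = [
            max(
                R[s][a] + gamma * sum(T[s][a][s2] * V[s2] for s2 in range(n_states))
                for a in range(len(R[s]))
            )
            for s in range(n_states)
        ]
        # Largest change in any state's value during this sweep.
        delta = max(abs(new - old) for new, old in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    return V


# Illustrative two-state MDP (made up for this example).
# State 0: action 0 stays (reward 0), action 1 moves to state 1 (reward 1).
# State 1: action 0 stays (reward 2), action 1 moves to state 0 (reward 0).
R = [[0.0, 1.0], [2.0, 0.0]]
T = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
V = value_iteration(R, T, gamma=0.5)
print(V)
```

For this example the fixed point can be checked by hand: staying in state 1 forever gives $V(1) = 2 + 0.5\,V(1)$, so $V(1) = 4$, and moving from state 0 to state 1 gives $V(0) = 1 + 0.5 \cdot 4 = 3$. The sketch should return values close to these. Convergence relies on $\gamma < 1$.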