Bellman equation
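The Bellman equation characterizes the optimal value function of a given Markov decision process (MDP). Here $S$ is the set of states, $A_s$ is the set of actions available in state $s$, $R(s,a)$ is the immediate reward for taking action $a$ in $s$, $T(s,a,s')$ is the probability of moving from $s$ to $s'$ under $a$, and $\gamma$ is the discount factor. The equation defines the value function recursively as follows: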
$$ V(s) = \max_{a \in A_s} \left( R(s,a) + \gamma \sum_{s' \in S} T(s,a,s') V(s') \right) $$
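
One common way to solve this recursion numerically is value iteration: start from an arbitrary guess for $V$ and repeatedly apply the right-hand side of the equation until the values stop changing. The following is a minimal sketch of that idea, not a definitive implementation. The function name `value_iteration`, the dictionary-based encoding of `T` and `R`, the tolerance `tol`, and the two-state toy MDP are all assumptions made for illustration; none of them come from the text above.

```python
def value_iteration(states, actions, T, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Approximate the fixed point of the Bellman equation by repeated backups.

    states  : list of states (the set S)
    actions : dict mapping each state s to its non-empty action list (A_s)
    T       : dict mapping (s, a, s') to a transition probability;
              missing entries are treated as probability 0
    R       : dict mapping (s, a) to the immediate reward
    """
    V = {s: 0.0 for s in states}  # arbitrary initial guess
    for _ in range(max_iters):
        new_V = {}
        for s in states:
            # Right-hand side of the Bellman equation for state s.
            new_V[s] = max(
                R[(s, a)]
                + gamma * sum(T.get((s, a, s2), 0.0) * V[s2] for s2 in states)
                for a in actions[s]
            )
        # Largest change in any state's value during this sweep.
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break
    return V


# Hypothetical toy MDP with deterministic transitions.
states = ["A", "B"]
actions = {"A": ["stay", "go"], "B": ["stay"]}
T = {
    ("A", "stay", "A"): 1.0,
    ("A", "go", "B"): 1.0,
    ("B", "stay", "B"): 1.0,
}
R = {("A", "stay"): 0.0, ("A", "go"): 1.0, ("B", "stay"): 2.0}

V = value_iteration(states, actions, T, R, gamma=0.5)
print(V)
```

For this toy problem the fixed point can be checked by hand: $V(B) = 2 + 0.5\,V(B)$ gives $V(B) = 4$, and $V(A) = \max(0.5\,V(A),\ 1 + 0.5 \cdot 4) = 3$, so the printed values should be approximately 3 for `A` and 4 for `B`. The sketch assumes $\gamma < 1$, which makes the backup a contraction and guarantees that the iteration converges.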