Bellman Equations
For Markov Decision Processes (MDPs), there are four Bellman equations:
- Bellman equation for the state-value function (eq. 3.14, Sutton and Barto 2018)
- Bellman equation for the action-value function
- Bellman optimality equation for the state-value function: the special case of the Bellman equation for state-value functions applied to the optimal value function, which does not depend on any particular policy pi (p. 63, Sutton and Barto 2018)
- Bellman optimality equation for the action-value function
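In Sutton and Barto's notation, where p(s', r | s, a) is the environment dynamics and gamma the discount factor, these four equations are:

```latex
v_\pi(s) = \sum_a \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_\pi(s')\bigr]

q_\pi(s, a) = \sum_{s', r} p(s', r \mid s, a)\,\Bigl[r + \gamma \sum_{a'} \pi(a' \mid s')\, q_\pi(s', a')\Bigr]

v_*(s) = \max_a \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_*(s')\bigr]

q_*(s, a) = \sum_{s', r} p(s', r \mid s, a)\,\Bigl[r + \gamma \max_{a'} q_*(s', a')\Bigr]
```

The optimality equations differ from the policy-evaluation equations only in replacing the expectation over pi with a max over actions.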
Thoughts
- Solving MDPs via the Bellman equations assumes that we know the full environment dynamics, that we have sufficient computational resources, and that the Markov property holds (p. 66, Sutton and Barto 2018).
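To make the "full dynamics known" assumption concrete, here is a minimal value-iteration sketch that repeatedly applies the Bellman optimality backup. The 2-state, 2-action MDP (arrays `P` and `R`) is invented purely for illustration; the point is that the algorithm needs the complete tables p(s' | s, a) and expected rewards up front.

```python
import numpy as np

# Toy 2-state, 2-action MDP, invented for illustration. Value iteration
# requires these tables in full -- the "known dynamics" assumption above.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # P[s=0, a, s']: transition probabilities
    [[0.5, 0.5], [0.0, 1.0]],   # P[s=1, a, s']
])
R = np.array([
    [1.0, 0.0],                 # R[s=0, a]: expected immediate reward
    [0.0, 2.0],                 # R[s=1, a]
])
gamma = 0.9                     # discount factor

def value_iteration(P, R, gamma, tol=1e-8):
    """Iterate the Bellman optimality backup until the value function converges."""
    v = np.zeros(P.shape[0])
    while True:
        # q[s, a] = R[s, a] + gamma * sum_{s'} P[s, a, s'] * v[s']
        q = R + gamma * (P @ v)
        v_new = q.max(axis=1)   # Bellman optimality: max over actions
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, q.argmax(axis=1)
        v = v_new

v_star, policy = value_iteration(P, R, gamma)
print("v* =", v_star, "greedy policy =", policy)
```

Because the backup is a gamma-contraction, the loop is guaranteed to converge; the returned `policy` is greedy with respect to the converged value function.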