Chapter 4: Dynamic Programming

Notes on Chapter 4 of the RL book.

  • Policy Evaluation (Prediction): First we consider how to compute the state-value function v_π for an arbitrary policy π. This is called policy evaluation in the DP literature. We also refer to it as the prediction problem.

  • Here we find the value of every state under a given policy, assuming the environment's dynamics are fully known.

  • Read the book itself for policy improvement and policy iteration.
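
The policy evaluation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own code: it repeatedly applies the update v(s) ← Σ_a π(a|s) Σ_{s',r} p(s',r|s,a) [r + γ v(s')] until the values stop changing. The three-state MDP, the names `policy_evaluation`, `transitions`, and `policy`, and the values of `gamma` and `theta` are all assumptions made up for this example.

```python
def policy_evaluation(policy, transitions, gamma=0.9, theta=1e-8):
    """Iterative policy evaluation with in-place sweeps over the states.

    policy[s][a]       -> probability of taking action a in state s
    transitions[s][a]  -> list of (probability, next_state, reward) triples
    A state with no actions is treated as terminal (value fixed at 0).
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state: nothing to update
                continue
            # Expected return: average over the policy's actions,
            # then over the known environment dynamics.
            new_value = sum(
                policy[s][a]
                * sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for a, outcomes in actions.items()
            )
            delta = max(delta, abs(new_value - values[s]))
            values[s] = new_value
        if delta < theta:  # largest change in this sweep is negligible
            return values


# Hypothetical toy MDP: A and B are ordinary states, T is terminal.
transitions = {
    "A": {"left": [(1.0, "T", 0.0)], "right": [(1.0, "B", 1.0)]},
    "B": {"right": [(1.0, "T", 2.0)]},
    "T": {},
}
# Hypothetical policy: in A pick left or right with equal probability.
policy = {"A": {"left": 0.5, "right": 0.5}, "B": {"right": 1.0}}

values = policy_evaluation(policy, transitions, gamma=0.5)
print(values)
```

With these made-up numbers, v(B) = 2 and v(A) = 0.5 · 0 + 0.5 · (1 + 0.5 · 2) = 1, which can be checked by hand against the update rule. The sketch relies on the dynamics in `transitions` being fully known, which is the assumption that makes this a dynamic programming method.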
