Chapter 4: Dynamic Programming
Chapter 4 of RL book
Policy Evaluation (Prediction): First we consider how to compute the state-value function vπ for an arbitrary policy π. This is called policy evaluation in the DP literature. We also refer to it as the prediction problem.
Here we find the value function of every state under a given policy, assuming the environment dynamics are fully known.
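As a rough illustration, policy evaluation can be done iteratively by repeatedly applying the Bellman expectation backup, vπ(s) = Σ_a π(a|s) Σ_{s',r} p(s',r|s,a) [r + γ vπ(s')], until the values stop changing. The sketch below assumes a small tabular MDP stored in plain dictionaries. The function name `policy_evaluation`, the `policy` and `dynamics` layouts, and the toy states "A", "B", "T" are all hypothetical choices made for this example, not something taken from the book.

```python
def policy_evaluation(policy, dynamics, gamma=0.9, theta=1e-8):
    """Iteratively estimate v_pi for a tabular MDP (illustrative sketch).

    policy[s][a]   -> probability of taking action a in state s
    dynamics[s][a] -> list of (prob, next_state, reward) tuples
    A terminal state has an empty action dict, so its value stays 0.
    """
    values = {s: 0.0 for s in policy}
    while True:
        delta = 0.0
        for s in policy:
            # Bellman expectation backup for state s
            new_v = sum(
                pi_sa * p * (r + gamma * values[s2])
                for a, pi_sa in policy[s].items()
                for p, s2, r in dynamics[s][a]
            )
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v  # in-place update, reused within the same sweep
        if delta < theta:  # stop once no value changed by more than theta
            return values


# Hypothetical toy MDP: A -> B -> T (terminal)
policy = {
    "A": {"left": 0.5, "right": 0.5},
    "B": {"go": 1.0},
    "T": {},  # terminal state, no actions
}
dynamics = {
    "A": {"left": [(1.0, "B", 0.0)], "right": [(1.0, "B", 2.0)]},
    "B": {"go": [(1.0, "T", 2.0)]},
}

values = policy_evaluation(policy, dynamics, gamma=0.5)
print(values)
```

In this toy problem, state B is worth its single reward of 2, and state A is worth its expected immediate reward of 1 plus the discounted value of B. The in-place update is one common variant; keeping a separate copy of the old values for each sweep would converge to the same result.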
Read the book itself for policy improvement and policy iteration.