Safe Policy Iteration: A Monotonically Improving Approximate Policy Iteration Approach
Authors: Alberto Maria Metelli, Matteo Pirotta, Daniele Calandriello, Marcello Restelli
Abstract: This paper presents a study of the policy improvement step that can be usefully exploited by approximate policy-iteration algorithms. When either the policy evaluation step or the policy improvement step returns an approximated result, the sequence of policies produced by policy iteration may not […]