Meta Learning the Step Size in Policy Gradient Methods
Authors: Luca Sabbioni, Francesco Corda, Marcello Restelli

Abstract: Policy-based algorithms are among the most widely adopted techniques in model-free RL, thanks to their strong theoretical groundings and good properties in continuous action spaces. Unfortunately, these methods require precise and problem-specific hyperparameter tuning to achieve good performance and, as a consequence, they tend to struggle when […]