AAAI 2021

Policy Optimization as Online Learning with Mediator Feedback

Authors: Alberto Maria Metelli, Matteo Papini, Pierluca D’Oro, Marcello Restelli
Conference: AAAI 2021
Abstract: Policy Optimization (PO) is a widely used approach to address continuous control tasks. In this paper, we introduce the notion of mediator feedback that frames PO as an online learning problem over the […]

Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate

Authors: Mirco Mutti, Lorenzo Pratissoli, Marcello Restelli
Conference: AAAI 2021
Abstract: In a reward-free environment, what is a suitable intrinsic objective for an agent to pursue so that it can learn an optimal task-agnostic exploration policy? In this paper, we argue that the entropy […]

Newton Optimization on Helmholtz Decomposition for Continuous Games

Authors: Giorgia Ramponi, Marcello Restelli
Conference: AAAI 2021
Abstract: Many learning problems involve multiple agents optimizing different interactive functions. In these problems, the standard policy gradient algorithms fail due to the non-stationarity of the setting and the different interests of each agent. In fact, algorithms must take […]

Learning Probably Approximately Correct Maximin Strategies in Games with Infinite Strategy Spaces

Authors: Alberto Marchesi, Francesco Trovò, Nicola Gatti
Conference: AAAI 2021
Abstract: We tackle the problem of learning equilibria in simulation-based games. In such games, the players’ utility functions cannot be described analytically, as they are given through a black-box simulator that can be […]

Online Learning in Non-Cooperative Configurable Markov Decision Process

Authors: Giorgia Ramponi, Alberto Maria Metelli, Alessandro Concetti, Marcello Restelli
Conference: AAAI 2021
Abstract: In Configurable Markov Decision Processes, there are two entities: a Reinforcement Learning agent and a configurator that can modify some parameters of the environment to improve the performance of the agent. What […]