A Bayesian Approach to Robust Reinforcement Learning

Esther Derman, Daniel Mankowitz, Timothy Mann, Shie Mannor

Research output: Contribution to journal › Conference article › peer-review

Abstract

Robust Markov Decision Processes (RMDPs) are intended to ensure robustness with respect to changing or adversarial system behavior. In this framework, transitions are modeled as arbitrary elements of a known, properly structured uncertainty set, and a robust optimal policy can be derived under the worst-case scenario. In this study, we address the issue of learning in RMDPs using a Bayesian approach. We introduce the Uncertainty Robust Bellman Equation (URBE), which encourages safe exploration for adapting the uncertainty set to new observations while preserving robustness. We propose a URBE-based algorithm, DQN-URBE, that scales this method to higher-dimensional domains. Our experiments show that the derived URBE-based strategy leads to a better trade-off between less conservative solutions and robustness in the presence of model misspecification. In addition, we show that the DQN-URBE algorithm adapts significantly faster to changing dynamics online than existing robust techniques with fixed uncertainty sets.
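To illustrate the point of departure that the abstract describes (a fixed, known uncertainty set and a worst-case Bellman backup), the sketch below runs value iteration with an adversarial minimum over a small, finite set of candidate transition models. All names, dimensions, and the sampled models are illustrative assumptions; this is not the paper's URBE formulation or the DQN-URBE implementation, which instead adapts the uncertainty set from data via a Bayesian posterior.

```python
# Minimal sketch: robust value iteration with a worst-case backup over a
# finite uncertainty set of transition models. Everything here is an
# illustrative assumption, not the authors' URBE / DQN-URBE code.
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.95
rng = np.random.default_rng(0)

# Illustrative state-action rewards R[s, a].
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

def random_transition_model(rng):
    """Sample one candidate transition tensor P[s, a, s'] (rows sum to 1)."""
    P = rng.uniform(size=(n_states, n_actions, n_states))
    return P / P.sum(axis=2, keepdims=True)

# Fixed uncertainty set: a handful of plausible transition models.
uncertainty_set = [random_transition_model(rng) for _ in range(5)]

def robust_value_iteration(R, models, gamma, n_iters=500):
    """Value iteration where the adversary picks the worst model in the set."""
    V = np.zeros(n_states)
    for _ in range(n_iters):
        # Per-model Q-values: Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s'].
        Q_per_model = np.stack([R + gamma * P @ V for P in models])
        # Worst-case backup: minimize over models, then maximize over actions.
        Q_robust = Q_per_model.min(axis=0)
        V = Q_robust.max(axis=1)
    return V, Q_robust.argmax(axis=1)

V_robust, policy = robust_value_iteration(R, uncertainty_set, gamma)
print("robust values:", np.round(V_robust, 3))
print("robust policy:", policy)
```

With a fixed set like this, the resulting policy can be overly conservative when the set is misspecified; the abstract's Bayesian approach targets exactly that gap by updating the uncertainty set as new observations arrive.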

Original language: English
Pages (from-to): 648-658
Number of pages: 11
Journal: Proceedings of Machine Learning Research
Volume: 115
State: Published - 2019
Event: 35th Uncertainty in Artificial Intelligence Conference, UAI 2019 - Tel Aviv, Israel
Duration: 22 Jul 2019 - 25 Jul 2019

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
