XAI and Strategy Extraction via Reward Redistribution

Marius-Constantin Dinu, Markus Hofmarcher, Vihang P. Patil, Matthias Dorfer, Patrick M. Blies, Johannes Brandstetter, Jose A. Arjona-Medina, Sepp Hochreiter

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

In reinforcement learning, an agent interacts with an environment from which it receives rewards that are then used to learn a task. However, it is often unclear which strategies or concepts the agent has learned to solve the task. Thus, interpretability of the agent's behavior is an important aspect of practical applications, alongside the agent's performance at the task itself. Yet with the increasing complexity of both tasks and agents, interpreting the agent's behavior becomes much more difficult, which makes developing new interpretable RL agents highly important. To this end, we propose to use Align-RUDDER as an interpretability method for reinforcement learning. Align-RUDDER is based on the recently introduced RUDDER framework, which relies on contribution analysis of an LSTM model to redistribute rewards to key events. From these key events a strategy can be derived that guides the agent's decisions toward solving a given task. More importantly, the key events are in general interpretable by humans and are often sub-tasks whose solution is crucial for solving the main task. Align-RUDDER enhances the RUDDER framework with methods from multiple sequence alignment (MSA) to identify key events from demonstration trajectories. MSA needs only a few trajectories to perform well and is much better understood than deep learning models such as LSTMs. Consequently, strategies and concepts can be learned from a few expert demonstrations, where the expert can be a human or an agent trained by reinforcement learning. By substituting RUDDER's LSTM with a profile model obtained from an MSA of demonstration trajectories, we are able to interpret an agent at three stages: first, by extracting common strategies from demonstration trajectories with MSA; second, by encoding the most prevalent strategy via the MSA profile model and thereby explaining the expert's behavior; and third, by allowing the interpretation of an arbitrary agent's behavior based on its demonstration trajectories.
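
To make the redistribution idea concrete, the following is a minimal Python sketch, not the authors' implementation: it assumes trajectories have already been mapped to discrete event sequences, and it replaces the paper's MSA profile model with a simple frequency-based position-specific scoring matrix. The names build_pssm and redistribute_reward, as well as the toy event alphabet, are hypothetical and introduced purely for illustration.

    import numpy as np

    def build_pssm(demos, alphabet, pseudo=0.5):
        # Position-specific scoring matrix as a simplified stand-in for the
        # profile model that Align-RUDDER obtains from multiple sequence
        # alignment. Entry [t, i] is the log-odds of event alphabet[i] at
        # step t relative to its background frequency across all steps.
        length = max(len(d) for d in demos)
        counts = np.full((length, len(alphabet)), pseudo)  # pseudo-counts
        for demo in demos:
            for t, event in enumerate(demo):
                counts[t, alphabet.index(event)] += 1.0
        position_probs = counts / counts.sum(axis=1, keepdims=True)
        background = counts.sum(axis=0) / counts.sum()
        return np.log(position_probs / background)

    def redistribute_reward(trajectory, pssm, alphabet, episode_return):
        # Per-step reward follows the RUDDER idea: the difference of
        # profile scores of consecutive trajectory prefixes. With a PSSM
        # this difference is simply the score of the event at that step,
        # so steps matching key events receive most of the return.
        step_scores = np.array([pssm[t, alphabet.index(e)]
                                for t, e in enumerate(trajectory)])
        weights = np.maximum(step_scores, 0.0) + 1e-8  # non-negative rewards
        return episode_return * weights / weights.sum()

    # Toy usage: all demonstrations visit key events "k1" and then "k2".
    alphabet = ["a", "b", "k1", "k2"]
    demos = [["a", "k1", "b", "k2"],
             ["b", "k1", "a", "k2"],
             ["a", "k1", "a", "k2"]]
    pssm = build_pssm(demos, alphabet)
    print(redistribute_reward(["a", "k1", "b", "k2"], pssm, alphabet, 1.0))
    # Most of the return is assigned to the steps where "k1" and "k2" occur.

In the actual method, the profile comes from a multiple sequence alignment with gaps and substitution scores over clustered states, but the underlying principle, rewarding each step by the increase in alignment score of the trajectory prefix, is the same.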
Original language: English
Title of host publication: xxAI - Beyond Explainable AI: International Workshop, Held in Conjunction with ICML 2020, July 18, 2020, Vienna, Austria, Revised and Extended Papers
Editors: Andreas Holzinger, Randy Goebel, Ruth Fong, Taesup Moon, Klaus-Robert Müller, Wojciech Samek
Place of Publication: Cham
Pages: 177-205
Number of pages: 29
DOIs
Publication status: Published - 2022
Externally published: Yes

Keywords

  • Explainable AI
  • Contribution Analysis
  • Reinforcement Learning
  • Credit Assignment
  • Reward Redistribution
