XAI and Strategy Extraction via Reward Redistribution

Marius-Constantin Dinu, Markus Hofmarcher, Vihang P. Patil, Matthias Dorfer, Patrick M. Blies, Johannes Brandstetter, Jose A. Arjona-Medina, Sepp Hochreiter

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

In reinforcement learning, an agent interacts with an environment from which it receives rewards that are then used to learn a task. However, it is often unclear which strategies or concepts the agent has learned to solve the task. Thus, interpretability of the agent's behavior is an important aspect of practical applications, alongside the agent's performance at the task itself. Yet with the increasing complexity of both tasks and agents, interpreting the agent's behavior becomes much more difficult, which makes developing new interpretable RL agents highly important. To this end, we propose to use Align-RUDDER as an interpretability method for reinforcement learning. Align-RUDDER is based on the recently introduced RUDDER framework, which relies on contribution analysis of an LSTM model to redistribute rewards to key events. From these key events a strategy can be derived that guides the agent's decisions toward solving a given task. More importantly, the key events are in general interpretable by humans and are often sub-tasks whose solution is crucial for solving the main task. Align-RUDDER enhances the RUDDER framework with methods from multiple sequence alignment (MSA) to identify key events from demonstration trajectories. MSA needs only a few trajectories to perform well and is much better understood than deep learning models such as LSTMs. Consequently, strategies and concepts can be learned from a few expert demonstrations, where the expert can be a human or an agent trained by reinforcement learning. By substituting RUDDER's LSTM with a profile model obtained from an MSA of demonstration trajectories, we are able to interpret an agent at three stages: first, by extracting common strategies from demonstration trajectories with MSA; second, by encoding the most prevalent strategy via the MSA profile model and thereby explaining the expert's behavior; and third, by allowing the interpretation of an arbitrary agent's behavior based on its demonstration trajectories.
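
To make the redistribution idea concrete, the following is a minimal Python sketch, not the authors' implementation: it assumes trajectories have already been mapped to discrete event sequences, and it replaces the paper's MSA profile model with a simple frequency-based position-specific scoring matrix. The names build_pssm and redistribute_reward, as well as the toy event alphabet, are hypothetical and introduced purely for illustration.

    import numpy as np

    def build_pssm(demos, alphabet, pseudo=0.5):
        # Position-specific scoring matrix as a simplified stand-in for the
        # profile model that Align-RUDDER obtains from multiple sequence
        # alignment. Entry [t, i] is the log-odds of event alphabet[i] at
        # step t relative to its background frequency across all steps.
        length = max(len(d) for d in demos)
        counts = np.full((length, len(alphabet)), pseudo)  # pseudo-counts
        for demo in demos:
            for t, event in enumerate(demo):
                counts[t, alphabet.index(event)] += 1.0
        position_probs = counts / counts.sum(axis=1, keepdims=True)
        background = counts.sum(axis=0) / counts.sum()
        return np.log(position_probs / background)

    def redistribute_reward(trajectory, pssm, alphabet, episode_return):
        # Per-step reward follows the RUDDER idea: the difference of
        # profile scores of consecutive trajectory prefixes. With a PSSM
        # this difference is simply the score of the event at that step,
        # so steps matching key events receive most of the return.
        step_scores = np.array([pssm[t, alphabet.index(e)]
                                for t, e in enumerate(trajectory)])
        weights = np.maximum(step_scores, 0.0) + 1e-8  # non-negative rewards
        return episode_return * weights / weights.sum()

    # Toy usage: all demonstrations visit key events "k1" and then "k2".
    alphabet = ["a", "b", "k1", "k2"]
    demos = [["a", "k1", "b", "k2"],
             ["b", "k1", "a", "k2"],
             ["a", "k1", "a", "k2"]]
    pssm = build_pssm(demos, alphabet)
    print(redistribute_reward(["a", "k1", "b", "k2"], pssm, alphabet, 1.0))
    # Most of the return is assigned to the steps where "k1" and "k2" occur.

In the actual method, the profile comes from a multiple sequence alignment with gaps and substitution scores over clustered states, but the underlying principle, rewarding each step by the increase in alignment score of the trajectory prefix, is the same.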
Original language: English
Title of host publication: xxAI - Beyond Explainable AI: International Workshop, Held in Conjunction with ICML 2020, July 18, 2020, Vienna, Austria, Revised and Extended Papers
Editors: Andreas Holzinger, Randy Goebel, Ruth Fong, Taesup Moon, Klaus-Robert Müller, Wojciech Samek
Place of Publication: Cham
Pages: 177-205
Number of pages: 29
DOIs
Publication status: Published - 2022
Externally published: Yes

Keywords

  • Explainable AI
  • Contribution Analysis
  • Reinforcement Learning
  • Credit Assignment
  • Reward Redistribution
