Abstract
In practical applications, we can rarely assume full observability of a system's environment, despite such knowledge being important for determining a reactive control system's precise interaction with its environment. Therefore, we propose an approach for reinforcement learning (RL) in partially observable environments. While assuming that the environment behaves like a partially observable Markov decision process with known discrete actions, we assume no knowledge about its structure or transition probabilities.
Our approach combines Q-learning with IoAlergia, a method for learning Markov decision processes (MDPs). By learning MDP models of the environment from episodes of the RL agent, we enable RL in partially observable domains without explicit, additional memory to track previous interactions for dealing with ambiguities stemming from partial observability. We instead provide RL with additional observations in the form of abstract environment states by simulating new experiences on learned environment models to track the explored states. In our evaluation, we report on the validity of our approach and its promising performance in comparison to six state-of-the-art deep RL techniques with recurrent neural networks and fixed memory.
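The core idea the abstract describes — augmenting the RL agent's state with the abstract state of a learned environment model, instead of giving the agent explicit memory — can be illustrated roughly as follows. This is a minimal sketch, not the paper's implementation: the names `AbstractMDP`, `TMaze`, and `q_learning_with_model` are hypothetical, the deterministic tracking table merely stands in for an MDP actually learned with IoAlergia, and the environment is a toy T-maze where a cue must be remembered past an ambiguous observation.

```python
import random
from collections import defaultdict

class AbstractMDP:
    """Toy stand-in for a model learned with IoAlergia: abstract states with
    a deterministic tracking table keyed by (state, action, observation)."""
    def __init__(self, transitions, initial=0):
        self.transitions = transitions
        self.initial = initial
        self.state = initial

    def reset(self):
        self.state = self.initial
        return self.state

    def step(self, action, obs):
        # Follow the learned transition; stay in place for unseen combinations.
        self.state = self.transitions.get((self.state, action, obs), self.state)
        return self.state

class TMaze:
    """Minimal partially observable environment: a cue reveals the rewarded
    side, then the agent only sees an ambiguous 'corridor' observation."""
    actions = ['left', 'right']

    def reset(self):
        self.goal = random.choice(self.actions)
        self.t = 0
        return 'start'

    def step(self, action):
        if self.t == 0:
            self.t = 1
            return self.goal, 0.0, False      # cue observation
        if self.t == 1:
            self.t = 2
            return 'corridor', 0.0, False     # ambiguous observation
        return 'end', 1.0 if action == self.goal else 0.0, True

def q_learning_with_model(env, model, episodes=500, alpha=0.5, gamma=0.95, eps=0.1):
    """Tabular Q-learning over augmented states (observation, abstract state)."""
    q = defaultdict(float)
    for _ in range(episodes):
        obs, a_state = env.reset(), model.reset()
        done = False
        while not done:
            s = (obs, a_state)
            if random.random() < eps:
                act = random.choice(env.actions)
            else:
                act = max(env.actions, key=lambda a: q[(s, a)])
            obs, reward, done = env.step(act)
            a_state = model.step(act, obs)    # track the abstract model state
            s_next = (obs, a_state)
            target = reward if done else reward + gamma * max(
                q[(s_next, a)] for a in env.actions)
            q[(s, act)] += alpha * (target - q[(s, act)])
    return q
```

Because the model's abstract state differs depending on which cue was observed, the Q-table can distinguish the two otherwise identical 'corridor' situations — which is exactly the kind of ambiguity that plain memoryless Q-learning cannot resolve.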
| Original language | English |
| --- | --- |
| Title | 18th International Conference on integrated Formal Methods |
| Publisher | Springer |
| Volume | 14300 |
| Edition | LNCS |
| Publication status | Published - Sep. 2023 |
| Event | 18th International Conference on integrated Formal Methods (iFM) - Duration: 13 Nov. 2023 → 15 Nov. 2023, https://liacs.leidenuniv.nl/~bonsanguemm/ifm23/index.html |
Conference

| Conference | 18th International Conference on integrated Formal Methods (iFM) |
| --- | --- |
| Short title | iFM |
| Period | 13/11/23 → 15/11/23 |
| Web address | |