Abstract
Explaining and verifying the behavior of recurrent neural networks (RNNs) is an important step towards achieving confidence in machine learning. The extraction of finite state models, such as deterministic automata, has been shown to be a promising concept for analyzing RNNs. In this paper, we apply a black-box approach based on active automata learning combined with model-guided conformance testing to learn finite state machines (FSMs) from RNNs. The technique efficiently infers a formal model of an RNN classifier’s input-output behavior, regardless of its inner structure. In several experiments, we compare this approach to other state-of-the-art FSM extraction methods. By detecting imprecise generalizations in RNNs that other techniques miss, model-guided conformance testing learns FSMs that more accurately model the RNNs under examination. We demonstrate this by using this testing approach to identify counterexamples that falsify incorrect hypothesis models learned by other techniques. This implies that testing guided by learned automata can be a useful method for finding adversarial inputs, that is, inputs that are misclassified due to improper generalization.
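The idea of model-guided conformance testing can be illustrated with a minimal sketch. It is not the implementation used in the paper: the `DFA` class, the helper names, the random-suffix test strategy, and the `toy_classifier` function (a plain Python stand-in for a trained RNN) are all assumptions made for illustration. The sketch uses a hypothesis automaton to steer tests to each of its states and reports the first input on which the hypothesis and the black box disagree.

```python
import random
from collections import deque


class DFA:
    """Hypothetical hypothesis model: a complete deterministic finite automaton."""

    def __init__(self, initial, accepting, transitions):
        self.initial = initial
        self.accepting = set(accepting)
        self.transitions = transitions  # maps (state, symbol) -> next state

    def accepts(self, word):
        state = self.initial
        for symbol in word:
            state = self.transitions[(state, symbol)]
        return state in self.accepting


def access_sequences(dfa, alphabet):
    """Return a shortest input word reaching each reachable state (BFS)."""
    sequences = {dfa.initial: ()}
    queue = deque([dfa.initial])
    while queue:
        state = queue.popleft()
        for symbol in alphabet:
            successor = dfa.transitions[(state, symbol)]
            if successor not in sequences:
                sequences[successor] = sequences[state] + (symbol,)
                queue.append(successor)
    return sequences


def find_counterexample(dfa, black_box, alphabet,
                        tests_per_state=200, max_suffix_len=8, seed=0):
    """Model-guided random testing (an assumed, simplified strategy).

    Each test is an access sequence of a hypothesis state followed by a
    random suffix. The first word on which the hypothesis and the black
    box disagree is returned; None means no disagreement was found.
    """
    rng = random.Random(seed)
    for prefix in access_sequences(dfa, alphabet).values():
        for _ in range(tests_per_state):
            length = rng.randint(0, max_suffix_len)
            suffix = tuple(rng.choice(alphabet) for _ in range(length))
            word = prefix + suffix
            if dfa.accepts(word) != black_box(word):
                return word
    return None


ALPHABET = ("a", "b")


def toy_classifier(word):
    """Stand-in for an RNN: even number of 'a' and no 'bbb' substring."""
    return word.count("a") % 2 == 0 and "bbb" not in "".join(word)


# Overly coarse hypothesis: it only tracks the parity of 'a'.
hypothesis = DFA(
    initial=0,
    accepting={0},
    transitions={(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
)

counterexample = find_counterexample(hypothesis, toy_classifier, ALPHABET)
print("counterexample:", counterexample)
```

In an active learning loop, such a counterexample would be handed back to the learner to refine the hypothesis. When the learned model is correct with respect to the intended language, the same disagreement instead points to an input that the network classifies incorrectly.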
Original language | English |
---|---|
Title of host publication | IFM 2022: Integrated Formal Methods |
Subtitle of host publication | International Conference on Integrated Formal Methods |
Publication status | Published - May 2022 |
Keywords
- Verifiable machine learning
- Active automata learning
- Finite state machines
- Recurrent neural networks