Verification of RNN-Based Neural Agent-Environment Systems



slide-1
SLIDE 1

Verification of RNN-Based Neural Agent-Environment Systems

Michael Akintunde, Andreea Kevorchian, Alessio Lomuscio, Edoardo Pirovano

Imperial College London, UK

VNN 2019, Stanford, California

slide-2
SLIDE 2

This work

We introduce Recurrent Neural Agent-Environment Systems to formalise RNN-based agents interacting with an environment with non-linear dynamics. We define and study various verification problems for these systems. We define two methods to solve these verification problems. We present an implementation and report experimental results. The paper builds upon previous work (KR’18).

slide-3
SLIDE 3

Recurrent Neural Networks (RNNs)

Many approaches already exist to perform verification on single FFNNs and closed-loop systems with FFNN-based agents. RNNs, equipped with a state that evolves over time, are designed to process sequences of data

[Figure: an RNN with input x, hidden state h and output o, connected by the weight matrices W(i→h), W(h→h) and W(h→o), shown unrolled over time steps t−1, t and t+1.]

slide-4
SLIDE 4

Single-Layer Recurrent Neural Networks (RNNs)

Definition A single-layer recurrent neural network (RNN) R with h hidden units, input size i and output size o is a neural network associated with the weight matrices $W^{(i \to h)} \in \mathbb{R}^{i \times h}$, $W^{(h \to h)} \in \mathbb{R}^{h \times h}$ and $W^{(h \to o)} \in \mathbb{R}^{h \times o}$, and the two activation functions $\sigma : \mathbb{R}^h \to \mathbb{R}^h$ and $\sigma' : \mathbb{R}^o \to \mathbb{R}^o$. Here we assume the activation functions $\sigma = \sigma' = \mathrm{ReLU}$.

slide-5
SLIDE 5

Function Computed by an RNN

Definition (Function computed by RNN) For an RNN R with weight matrices $W^{(i \to h)}$, $W^{(h \to h)}$ and $W^{(h \to o)}$, let $x \in (\mathbb{R}^k)^n$ denote an input sequence of length $n$ where each element of the sequence is a vector of size $k$, with $x_t$ denoting the $t$-th vector of $x$. We define $h^x_0 = 0$ as a vector of 0s. For each time step $1 \le t \le n$, we define:

$$h^x_t = \sigma\left(W^{(h \to h)} h^x_{t-1} + W^{(i \to h)} x_t\right).$$

Then, the output of the RNN is given by

$$f(x) = \sigma'\left(W^{(h \to o)} h^x_n\right).$$
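The recurrence in this definition can be sketched in a few lines of plain Python. This is only an illustration: the toy weights are hypothetical, and the matrices are assumed to be stored row by row so that each product W v is well defined (one row per output component).

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Sequence[float]) -> Vector:
    """Apply ReLU component-wise."""
    return [max(0.0, a) for a in v]


def matvec(W: Matrix, v: Sequence[float]) -> Vector:
    """Multiply matrix W (one row per output component) with vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def rnn_output(W_ih: Matrix, W_hh: Matrix, W_ho: Matrix,
               xs: Sequence[Sequence[float]]) -> Vector:
    """Compute f(x) = ReLU(W_ho h_n), where h_0 = 0 and
    h_t = ReLU(W_hh h_{t-1} + W_ih x_t)."""
    h = [0.0] * len(W_hh)  # h_0 is the zero vector
    for x_t in xs:
        pre = [a + b for a, b in zip(matvec(W_hh, h), matvec(W_ih, x_t))]
        h = relu(pre)
    return relu(matvec(W_ho, h))


# Hypothetical toy network: input size 2, 2 hidden units, output size 1.
W_ih = [[1.0, 0.0], [0.0, 1.0]]
W_hh = [[0.5, 0.0], [0.0, 0.5]]
W_ho = [[1.0, 1.0]]
print(rnn_output(W_ih, W_hh, W_ho, [[1.0, 0.0], [0.0, 2.0]]))  # prints [2.5]
```

Here h_1 = (1, 0) and h_2 = (0.5, 2), so the single output is 0.5 + 2 = 2.5.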

slide-6
SLIDE 6

Recurrent Neural Agent-Environment Systems

Definition (RNN-AES) A Recurrent Neural Agent-Environment System (RNN-AES) is a tuple AES = (Ag, E, I) where:

Ag is a recurrent neural agent with action function $act : O^* \to Act$,

$E = (S, O, o, t_E)$ is an environment with state space $S \subseteq \mathbb{R}^m$, observation space $O \subseteq \mathbb{R}^{m'}$, observation function $o : S \to O$ and transition function $t_E : S \times Act \to S$,

$I \subseteq S$ is a set of initial states.

Paths are sequences of environment state observations determined by the transition function $t_E$ from an initial state. We assume linearly definable AES (both $t_E$ and $I$).
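As a rough illustration of how a path arises from this definition, the sketch below runs the agent-environment loop for a fixed number of steps. The toy environment and agent (`toy_obs`, `toy_act`, `toy_t_env`) are hypothetical stand-ins, not the systems from the talk. The action function receives the whole observation history, mirroring act : O* → Act.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]
Obs = Tuple[float, ...]
Action = float


def generate_path(s0: State,
                  act: Callable[[Sequence[Obs]], Action],
                  obs: Callable[[State], Obs],
                  t_env: Callable[[State, Action], State],
                  steps: int) -> List[State]:
    """Return the states visited from s0 over `steps` transitions."""
    path = [s0]
    history: List[Obs] = []  # observations seen so far (the agent's input)
    for _ in range(steps):
        history.append(obs(path[-1]))
        action = act(history)
        path.append(t_env(path[-1], action))
    return path


# Hypothetical one-dimensional system: the agent pushes the state towards 0.
def toy_obs(s: State) -> Obs:
    return s


def toy_act(history: Sequence[Obs]) -> Action:
    return -0.5 * history[-1][0]


def toy_t_env(s: State, a: Action) -> State:
    return (s[0] + a,)


print(generate_path((4.0,), toy_act, toy_obs, toy_t_env, steps=2))
# prints [(4.0,), (2.0,), (1.0,)]
```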

slide-7
SLIDE 7

Bounded Specifications

Definition (Specifications) For an environment with state space $S \subseteq \mathbb{R}^m$, we consider a fragment of LTL given by the following BNF:

$$\varphi ::= X^k C \mid C \, U^{\le k} \, C$$
$$C ::= C \lor C \mid (i) \; op \; (j) \mid (i) \; op \; x$$

where $op \in \{<, \le, =, \ne, \ge, >\}$, $i, j \in \{1, \dots, m\}$, $x \in \mathbb{R}$, $k \in \mathbb{N}$.

slide-8
SLIDE 8

Satisfaction

The satisfaction relation $\models$ is defined as follows: Definition (Satisfaction) Given a path $\rho \in \Pi$ on an RNN-AES and a formula $\varphi$:

$\rho \models (i) \; op \; (j)$ iff $\rho(0).i \; op \; \rho(0).j$ holds;
$\rho \models C_1 \lor C_2$ iff $\rho \models C_1$ or $\rho \models C_2$;
$\rho \models X^k C$ iff $\rho(k) \models C$;
$\rho \models C_1 U^{\le k} C_2$ iff there is some $i \le k$ such that $\rho(i) \models C_2$ and $\rho(j) \models C_1$ for all $0 \le j < i$.
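The satisfaction clauses can be read as a small evaluator over a single concrete path. The sketch below is one possible encoding, not the authors' implementation. The class names (`Atom`, `Or`, `Next`, `Until`) are hypothetical, state components are indexed from 1 as in the grammar, and the path is assumed to contain at least k + 1 states.

```python
import operator
from dataclasses import dataclass
from typing import Sequence, Union

OPS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">=": operator.ge, ">": operator.gt}


@dataclass(frozen=True)
class Atom:
    i: int                      # 1-based index of a state component
    op: str                     # key of OPS
    rhs: Union[int, float]      # constant x, or index j if rhs_is_index
    rhs_is_index: bool = False


@dataclass(frozen=True)
class Or:
    left: "Constraint"
    right: "Constraint"


Constraint = Union[Atom, Or]


@dataclass(frozen=True)
class Next:                     # X^k C
    k: int
    c: Constraint


@dataclass(frozen=True)
class Until:                    # C1 U^{<=k} C2
    k: int
    c1: Constraint
    c2: Constraint


def holds_at(state: Sequence[float], c: Constraint) -> bool:
    """Evaluate a constraint C on one state."""
    if isinstance(c, Or):
        return holds_at(state, c.left) or holds_at(state, c.right)
    rhs = state[c.rhs - 1] if c.rhs_is_index else c.rhs
    return OPS[c.op](state[c.i - 1], rhs)


def satisfies(path: Sequence[Sequence[float]], phi: Union[Next, Until]) -> bool:
    """Evaluate a bounded formula on a path with at least k + 1 states."""
    if isinstance(phi, Next):
        return holds_at(path[phi.k], phi.c)
    for i in range(phi.k + 1):
        if holds_at(path[i], phi.c2):
            return True         # C2 at step i, C1 at every earlier step
        if not holds_at(path[i], phi.c1):
            return False        # C1 broken before C2 was reached
    return False


path = [(4.0,), (2.0,), (1.0,)]
positive = Atom(1, ">", 0.0)
small = Atom(1, "<=", 1.0)
print(satisfies(path, Next(2, small)))             # prints True
print(satisfies(path, Until(2, positive, small)))  # prints True
print(satisfies(path, Until(1, positive, small)))  # prints False
```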

slide-9
SLIDE 9

Verification problem

We say that an agent-environment system AES satisfies a specification $\varphi$, denoted $AES \models \varphi$, if every path originating from an initial state $i \in I$ satisfies $\varphi$. This is the basis of the verification problem: Definition (Verification problem) Given an RNN-AES AES and a formula $\varphi$, determine whether $AES \models \varphi$.

slide-10
SLIDE 10

Approach: Unrolling RNNs to FFNNs

Example: How to construct an FFNN from an RNN with input sequence of length 4, input size of 2, 3 hidden units and output size 1 (single output)?

[Figure: the RNN unrolled over four time steps, with hidden states h0 to h4, inputs x1 to x4 and output o; each step applies W(h→h) and W(i→h), and the final step applies W(h→o).]

slide-11
SLIDE 11

Approach: Unrolling RNNs to FFNNs

Input on Start (IOS)

Scale input values according to the weights of W(i→h). At each time step when the input is needed, pass it unchanged to the corresponding hidden layer of the FFNN.

[Figure: FFNN constructed from an RNN with an input sequence of length 4, input size 2, 3 hidden units and output size 1; the inputs are x11, x12, x21, x22, x31, x32, x41, x42.]

slide-12
SLIDE 12

Approach: Unrolling RNNs to FFNNs

Input on Demand (IOD)

At the time step when the input term is needed, scale the input (on demand) and pass it to the corresponding hidden layer of the FFNN; otherwise propagate the term’s original value.

[Figure: FFNN constructed from an RNN with an input sequence of length 4, input size 2, 3 hidden units and output size 1; the inputs are x11, x12, x21, x22, x31, x32, x41, x42.]
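A rough sketch of the Input on Demand idea follows, using the dimensions of the running example (sequence length 4, input size 2, 3 hidden units, output size 1). It makes several assumptions that the slides do not state. Each unrolled layer takes the vector [h_{t-1}, x_t, x_{t+1}, ..., x_n]. The pass-through units carry later inputs forward unchanged, with no activation applied to them. The weights are hypothetical, and the exact construction in the paper may differ.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, v: Sequence[float]) -> Vector:
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def rnn_output(W_ih: Matrix, W_hh: Matrix, W_ho: Matrix,
               xs: Sequence[Sequence[float]]) -> Vector:
    """Reference recurrence: h_t = ReLU(W_hh h_{t-1} + W_ih x_t)."""
    h = [0.0] * len(W_hh)
    for x_t in xs:
        pre = [a + b for a, b in zip(matvec(W_hh, h), matvec(W_ih, x_t))]
        h = [max(0.0, a) for a in pre]
    return [max(0.0, a) for a in matvec(W_ho, h)]


def iod_layer(W_ih: Matrix, W_hh: Matrix, remaining: int) -> Matrix:
    """Weights of one unrolled layer.
    Input layout:  [h_{t-1}, x_t, x_{t+1}, ..., x_n]
    Output layout: [h_t, x_{t+1}, ..., x_n]
    `remaining` counts the scalars of x_{t+1..n} that are passed through."""
    h, k = len(W_hh), len(W_ih[0])
    # Hidden rows: scale x_t by W_ih only now, when it is needed.
    rows = [W_hh[r] + W_ih[r] + [0.0] * remaining for r in range(h)]
    # Pass-through rows: copy each later input unchanged.
    for p in range(remaining):
        row = [0.0] * (h + k + remaining)
        row[h + k + p] = 1.0
        rows.append(row)
    return rows


def unrolled_output(W_ih: Matrix, W_hh: Matrix, W_ho: Matrix,
                    xs: Sequence[Sequence[float]]) -> Vector:
    h, k, n = len(W_hh), len(W_ih[0]), len(xs)
    z = [0.0] * h + [a for x_t in xs for a in x_t]  # [h_0, x_1, ..., x_n]
    for t in range(1, n + 1):
        out = matvec(iod_layer(W_ih, W_hh, (n - t) * k), z)
        # ReLU on the hidden block only; pass-through units stay linear.
        z = [max(0.0, a) for a in out[:h]] + out[h:]
    return [max(0.0, a) for a in matvec(W_ho, z)]


W_ih = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
W_hh = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
W_ho = [[1.0, 1.0, 1.0]]
xs = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0], [0.5, 0.5]]
print(unrolled_output(W_ih, W_hh, W_ho, xs) == rnn_output(W_ih, W_hh, W_ho, xs))
```

Under the same assumptions, Input on Start would differ only in where the scaling by W(i→h) happens: the pass-through units would carry the already scaled inputs rather than the original ones.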

slide-13
SLIDE 13

Equivalences

Theorem For an RNN-AES AES and a specification φk, AES ⊨ φk iff IOD(AES) ⊨ φk iff IOS(AES) ⊨ φk.

Verification on bounded specifications of RNN-AES can be recast as FFNN-AES verification. See the paper for further details of the unrolling methods.

Verification for FFNN-AES is addressed in the KR’18 paper.

slide-14
SLIDE 14

MILP Encoding for ReLU-FFNN

Maganti & Lomuscio, 2017; Cheng, Nührenberg & Ruess, 2017

ReLU activation function:

$$x^{(i)}_j = \max\left(0,\; W^{(i)}_j x^{(i-1)} + b^{(i)}_j\right), \quad j = 1, \dots, |L^{(i)}|$$

Active phase: $x^{(i)}_j = W^{(i)}_j x^{(i-1)} + b^{(i)}_j$ (set $\bar{\delta}^{(i)}_j = 0$)

Inactive phase: $x^{(i)}_j = 0$ (set $\bar{\delta}^{(i)}_j = 1$)

The value of $\bar{\delta}^{(i)}_j$ forces two of the four constraints to become vacuously true, and the other two correspond exactly to the inactive/active phase of the neuron:

$$x^{(i)}_j \ge W^{(i)}_j x^{(i-1)} + b^{(i)}_j$$
$$x^{(i)}_j \le W^{(i)}_j x^{(i-1)} + b^{(i)}_j + M \bar{\delta}^{(i)}_j$$
$$x^{(i)}_j \ge 0$$
$$x^{(i)}_j \le M (1 - \bar{\delta}^{(i)}_j)$$
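The effect of the four constraints can be checked by brute force for a single neuron. The sketch below is only a sanity check, not an MILP encoding for a solver. `pre` stands for the pre-activation W x + b, the candidate grid is hypothetical, and M is assumed to be large enough to bound |pre|.

```python
from typing import Iterable, List


def bigm_constraints_hold(x: float, pre: float, delta: int, big_m: float) -> bool:
    """Check the four big-M constraints for one neuron.
    `pre` is the pre-activation W x + b and delta is the binary phase variable."""
    return (x >= pre
            and x <= pre + big_m * delta
            and x >= 0.0
            and x <= big_m * (1 - delta))


def feasible_outputs(pre: float, big_m: float,
                     candidates: Iterable[float]) -> List[float]:
    """Candidate values of x that are feasible for some delta in {0, 1}."""
    return sorted({x for x in candidates for delta in (0, 1)
                   if bigm_constraints_hold(x, pre, delta, big_m)})


grid = [i / 2 for i in range(-6, 7)]  # -3.0, -2.5, ..., 3.0
for pre in (-2.0, 0.0, 1.5):
    print(pre, feasible_outputs(pre, big_m=10.0, candidates=grid))
# prints:
# -2.0 [0.0]
# 0.0 [0.0]
# 1.5 [1.5]
```

In each case the only feasible value is max(0, pre): with delta = 0 the first two constraints pin x to the pre-activation, and with delta = 1 the last two pin x to 0.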

slide-15
SLIDE 15

Verifying RNN-AESs via MILP

Theorem The MILP $P_{FFNN}$ is feasible for $\bar{x}^{(1)} = \bar{x}$, $\bar{x}^{(m)} = \bar{y}$ iff $f_{NN}(\bar{x}) = \bar{y}$.

The verification problem can be solved via MILP by considering the linear programming problem defined on the unrolled RNN, truncated by the bound on the specification.

Theorem Verification of RNN-AESs against bounded specifications is coNP-complete.

slide-16
SLIDE 16

Verification Procedure

Goal: Take an RNN-AES $AES = (Ag_N, E, I)$ with environment $E = (S, O, o, t_E)$ and a specification $\varphi$. Return whether $\varphi$ is satisfied on the system.

For $X^k C$: For each step $n$ from 0 to $k$, add constraints corresponding to the observation function, the unrolling of length $n$ of the RNN and the transition function of the environment. Check whether $\overline{C}$ can be satisfied in any of the states possible after $k$ steps, and return the result accordingly.

slide-17
SLIDE 17

Verification Procedure

For $C_1 U^{\le k} C_2$: For each $n$ from 0 to $k$, check whether $C_2$ is always satisfied in valid paths of length $n$ that have not already had $C_2$ satisfied earlier on. If so, return True. Otherwise, continue from states not satisfying $C_2$. Check whether any of these fails to satisfy $C_1$. If so, return False. Otherwise, we are on a valid path. Continue to add the constraints corresponding to the observation function, the unrolling of length $n$ of the RNN and the transition function of the environment. Iterate to $n + 1$.

If we reach $n = k$ without returning a result, there must exist a path of length $k$ along which $C_2$ is never satisfied, and so we return False.
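The control flow of the bounded-until procedure can be mimicked on a finite set of initial states by explicit enumeration. This is a sketch under a strong simplifying assumption: the procedure in the talk reasons over all states of a linearly definable set at once via MILP constraints, whereas this version just iterates over concrete paths. The names `step`, `halve`, `positive` and `small` are hypothetical.

```python
from typing import Callable, List, Sequence

State = float
Path = List[State]


def check_until(initial_states: Sequence[State],
                step: Callable[[Path], State],
                c1: Callable[[State], bool],
                c2: Callable[[State], bool],
                k: int) -> bool:
    """Decide C1 U^{<=k} C2 over all paths from the given initial states.
    `step` maps a path so far to its next state, so it may depend on history."""
    live = [[s] for s in initial_states]  # paths where C2 has not yet held
    for n in range(k + 1):
        if all(c2(p[-1]) for p in live):
            return True                   # C2 now holds on every open path
        live = [p for p in live if not c2(p[-1])]
        if any(not c1(p[-1]) for p in live):
            return False                  # some path breaks C1 before C2
        if n < k:
            live = [p + [step(p)] for p in live]
    return False                          # some path never reaches C2 within k


# Hypothetical closed loop: the state halves at every step.
def halve(path: Path) -> State:
    return path[-1] / 2


def positive(s: State) -> bool:
    return s > 0.0


def small(s: State) -> bool:
    return s <= 1.0


print(check_until([4.0, 3.0], halve, positive, small, k=2))  # prints True
print(check_until([4.0, 3.0], halve, positive, small, k=1))  # prints False
```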

slide-18
SLIDE 18

RNSVerify

Experimental toolkit produced, solving the desired verification problems. Takes an RNN-AES and a property φ as input and produces the associated MILP problem, which is fed to Gurobi 7.5.2 to solve. If the output is False, a counterexample in the form of a trace is shown.

slide-19
SLIDE 19

Example: OpenAI Pendulum

Brockman et al., 2016

Example (Pendulum) OpenAI Gym task Pendulum-v0: System composed of a pendulum and an agent which can apply a force to the pendulum. Agent can observe the current angle $\theta$ of the pendulum ($\theta = 0$ indicates that it is perfectly vertical) and the pendulum’s angular velocity $\dot{\theta}$. Agent chooses a small torque to be applied to the pendulum at each time step. Aim: Learn how to keep the pendulum upright by applying torque at each time step.

slide-20
SLIDE 20

Example: OpenAI Pendulum

Brockman et al., 2016

slide-21
SLIDE 21

Evaluation: OpenAI Pendulum

Agent observes the angle and angular velocity and applies a torque to keep the pendulum vertical. Encoded as an RNN-AES: agent-environment system, non-linear transition function, and sequence of environment state observations.

Agent’s policy synthesised using Q-Learning on a ReLU-RNN. Environment approximated from data (since the environment is non-linear). RNSVerify found several bugs in the synthesised agent, e.g., the agent would apply the torque incorrectly in some situations.

slide-22
SLIDE 22

Verification Results

Input on Start – Evaluation on Pendulum [OpenAI, 2018]

Check the property $X^n(\theta_f > -\varepsilon)$ for different values of $n$ and $\varepsilon$ using IOS. Fix $(\theta_i, \dot{\theta}_i) \in [0, \pi/64] \times [0, 0.3]$.

n \ ε    π/10      π/30      π/50      π/70
1        0.056s    0.067s    0.011s    0.014s
2        0.052s    0.179s    0.138s    0.197s
3        0.372s    0.904s    5.794s    0.552s
4        2.578s    7.222s    0.378s    0.368s
5        20.57s    31.07s    0.748s    0.663s
6        73.97s    3.264s    31.07s    23.99s
7        54.30s    96.54s    116.8s    207.8s
8        693.2s    294.9s    239.8s    243.3s

Greyed areas denote a False result, hence an insufficiently trained system.

slide-23
SLIDE 23

Verification Results

Input on Demand – Evaluation on Pendulum [OpenAI, 2018]

Check the property $X^n(\theta_f > -\varepsilon)$ for different values of $n$ and $\varepsilon$ using IOD. Fix $(\theta_i, \dot{\theta}_i) \in [0, \pi/64] \times [0, 0.3]$.

n \ ε    π/10      π/30      π/50      π/70
1        0.004s    0.012s    0.011s    0.014s
2        0.060s    0.114s    0.244s    0.253s
3        0.247s    1.068s    6.092s    0.125s
4        2.176s    5.359s    0.182s    0.198s
5        10.04s    0.293s    0.317s    0.294s
6        13.99s    0.367s    0.357s    0.359s
7        31.93s    0.497s    0.488s    0.478s
8        0.689s    0.660s    0.696s    0.703s

Greyed areas denote a False result, hence an insufficiently trained system.

slide-24
SLIDE 24

Number of Constraints and Variables

         Input on Start        Input on Demand
n        V         C           V         C
1        273       336         273       336
2        736       736         620       766
3        1455      1806        1055      1306
4        2494      3101        1590      1971
5        3917      4876        2237      2776
6        5788      7211        3008      3736
7        8171      10186       3915      4866
8        11130     13881       4970      6181

Table: For different values of $n$, size of the constraint problem constructed by RNSVerify in terms of the number of variables (V) and constraints (C) when checking $X^n(\theta_f > -\varepsilon)$. We observe a degradation in performance with the length of the paths.

slide-25
SLIDE 25

Conclusions

Increased attention to verifiable AI. First approach to the verification of a closed-loop system composed of a neural agent based on a ReLU-RNN. Sound and complete procedure produced, effective for controllers of limited complexity. Approach is independent of the underlying solver.