Adversarial Models for Deterministic Finite Automata


Adversarial Models for Deterministic Finite Automata. K. Zhang (1), Q. Wang (2) and C. Lee Giles (1). (1) Information Sciences and Technology at Pennsylvania State University, United States. (2) School of Computer Science at McGill University, Canada.


slide-1
SLIDE 1

Adversarial Models for Deterministic Finite Automata

  • K. Zhang (1), Q. Wang (2) and C. Lee Giles (1).

(1) Information Sciences and Technology at Pennsylvania State University, United States.
(2) School of Computer Science at McGill University, Canada.

slide-2
SLIDE 2

Outline

  • Motivation
  • Background
  • Contribution
  • Main Idea
  • Main Results

Ø Adversarial Model & Transition Importance
Ø Critical Pattern & Synchronizing Word
Ø Evaluation

  • Summary and Future work
  • References
slide-3
SLIDE 3

Motivation

  • Deterministic finite automata (DFAs) have been widely used in computer language compilers and language processing systems. [1] However, our understanding of DFAs remains at a relatively coarse-grained level. It is crucial to explore their fine-grained characteristics.

  • Most prior work focuses on identifying feature-level perturbations that significantly affect a learning model, [2] and there is little research studying sensitivity from a model-level perspective.

  • The critical pattern is another important characteristic for identifying a specific DFA, and it is closely related to the synchronizing word. The bound on the length of a synchronizing word, the subject of the Černý conjecture, still remains a mystery.

slide-4
SLIDE 4

Background – DFA & Regular Grammar

Deterministic Finite Automata (DFA)

A DFA $M$ can be described by a five-tuple $\{\Sigma, Q, \delta, q_0, F\}$:

Ø $\Sigma$ is the input alphabet;
Ø $Q$ is a finite, non-empty set of states;
Ø $q_0$ represents the initial state;
Ø $F$ represents the set of accept states;
Ø $\delta$ denotes a set of deterministic production rules.

[Figure: an example DFA that accepts only binary numbers that are multiples of 3.]

Regular Grammar: Tomita Grammars [3]

Grammar   Description
1         1*
2         (10)*
3         An odd number of consecutive 1s is always followed by an even number of consecutive 0s
4         Any string not containing "000" as a substring
5         Even number of 0s and even number of 1s
6         The difference between the number of 0s and the number of 1s is a multiple of 3
7         0*1*0*1*
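The five-tuple description can be made concrete with a small sketch. The code below is an illustration only: the class name `DFA`, its field names, and the remainder-tracking construction of the multiples-of-3 automaton are assumptions made for this example and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFA:
    """Hypothetical container mirroring the five-tuple (Sigma, Q, delta, q_0, F)."""
    alphabet: frozenset   # Sigma: input alphabet
    states: frozenset     # Q: finite, non-empty set of states
    delta: dict           # delta: (state, symbol) -> next state
    start: int            # q_0: initial state
    accept: frozenset     # F: set of accept states

    def accepts(self, word: str) -> bool:
        """Run the deterministic transitions and test the final state."""
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accept


# State r means: the bits read so far, read as a binary number, have remainder r mod 3.
# Appending bit b turns the value v into 2*v + b, so the remainder becomes (2*r + b) % 3.
mult3 = DFA(
    alphabet=frozenset("01"),
    states=frozenset({0, 1, 2}),
    delta={(r, b): (2 * r + int(b)) % 3 for r in range(3) for b in "01"},
    start=0,
    accept=frozenset({0}),
)

print(mult3.accepts("110"))  # binary 6
print(mult3.accepts("111"))  # binary 7
```

In this construction the empty string is also accepted, because the initial state is an accept state.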

slide-5
SLIDE 5

Background – Adversarial Sample

Vulnerability of neural networks: the adversarial sample problem plagues most statistical and machine learning models. [2]

Affected Applications:

Ø Image recognition;
Ø Sentiment analysis;
Ø Malware analysis: manipulation of system calls;
Ø Threatening for security-critical applications, e.g. automatic driving, cyber-security, medical diagnosis, etc.

\[ \hat{x} = \arg\max_{x} L(x, f) \quad \text{s.t.} \quad D(x, x_0) < \epsilon \]

Ø $\hat{x}$: data points that can "trick" the model into making incorrect predictions;
Ø $\epsilon$: the perturbation / manipulation is often "tiny" w.r.t. a certain distance metric $D$;
Ø $L$: loss function;
Ø $f$: different models.

slide-6
SLIDE 6

Contribution

Ø Open a discussion on the model-level analysis and introduce a general scheme for adversarial models. Study the transition importance of a DFA through a model-level perturbation.
Ø Study critical patterns that can be used for identifying a specific DFA. Develop an algorithm for finding the critical patterns of a DFA by transforming this task into a DFA synchronizing problem. Provide a theoretical approach for estimating the length of any existing perfect patterns.
Ø The analysis on DFA models will help in research on the security of cyber-physical systems that are based on working DFAs, e.g., compilers, VLSI design, elevators, and ATMs.

slide-7
SLIDE 7

Main Idea

In this paper, we aim to study individual DFAs for their fine-grained characteristics, including transition importance and critical patterns.

Ø Transition importance (adversary study): In order to directly gain a better understanding of a DFA, we follow a similar approach but study the sensitivity of a DFA through model-level perturbations.
Ø Critical patterns (synchronizing word): A synchronizing word is a word in the input alphabet of the DFA which sends any state of the DFA to one and the same state. [4]

slide-8
SLIDE 8

Adversarial Model & Transition Importance

Here we propose to transform the adversarial example problem into the adversarial model problem, which considers model-level perturbations.

\[ \hat{f} = \arg\max_{|f - f_0| < \epsilon} \sum_{x \in X} L(x, f) \]

To quantitatively evaluate the difference between two sets of strings accepted by different DFAs, here we introduce the following metric:

Intersection over union (IOU)

\[ IOU(A, \hat{A}) = \frac{|X \cap \hat{X}|}{|X \cup \hat{X}|} \]

Theorem

\[ IOU(A, \hat{A}) = \left( \frac{\sum_{k=1} (\mathbf{1} \otimes p)^{T} \left( M_1 \otimes (A_1 + A_2) + M_2 \otimes (\hat{A}_1 + \hat{A}_2) \right)^{k} (\mathbf{1} \otimes q)}{\sum_{k=1} (p \otimes p)^{T} \left( A_1 \otimes \hat{A}_1 + A_2 \otimes \hat{A}_2 \right)^{k} (q \otimes q)} - 1 \right)^{-1} \]

where $p$ is the initial state vector; $q$ is the set of accept states; $M_1 = \mathrm{diag}(1, 0)$; $M_2 = \mathrm{diag}(0, 1)$.
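To make the IOU metric concrete, here is a minimal counting sketch. It does not use the Kronecker-product form of the theorem; instead it walks the product automaton of the two DFAs and counts, for each string length, how many strings are accepted by both and by at least one of them. The function name `iou`, the `(delta, start, accept)` tuple layout, the length cap `max_len`, and the two example automata are all assumptions made for illustration.

```python
def iou(dfa_a, dfa_b, max_len):
    """IOU of the strings of length 1..max_len accepted by two DFAs.

    Each DFA is a hypothetical (delta, start, accept) tuple over the same
    alphabet, with delta mapping (state, symbol) -> state.
    """
    (delta_a, start_a, acc_a), (delta_b, start_b, acc_b) = dfa_a, dfa_b
    symbols = sorted({symbol for (_, symbol) in delta_a})
    # counts[(sa, sb)] = number of strings of the current length that drive
    # the two DFAs into the state pair (sa, sb).
    counts = {(start_a, start_b): 1}
    both = either = 0
    for _ in range(max_len):
        nxt = {}
        for (sa, sb), c in counts.items():
            for symbol in symbols:
                key = (delta_a[(sa, symbol)], delta_b[(sb, symbol)])
                nxt[key] = nxt.get(key, 0) + c
        counts = nxt
        for (sa, sb), c in counts.items():
            if sa in acc_a and sb in acc_b:
                both += c      # string lies in the intersection
            if sa in acc_a or sb in acc_b:
                either += c    # string lies in the union
    return both / either if either else 1.0


# Example DFA (an assumption): binary multiples of 3, state = remainder mod 3.
delta = {(r, b): (2 * r + int(b)) % 3 for r in range(3) for b in "01"}
mult3 = (delta, 0, {0})

# Perturbed copy with a single transition substitution: (0, '0') now goes to state 1.
delta_hat = dict(delta)
delta_hat[(0, "0")] = 1
perturbed = (delta_hat, 0, {0})

print(iou(mult3, mult3, 8))
print(iou(mult3, perturbed, 8))
```

Capping the string length keeps both counts finite; how the truncation is chosen is left open here.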

slide-9
SLIDE 9

Adversarial Model & Transition Importance

The theorem directly provides a formulation of the adversarial model problem for DFA as an optimization problem. Then by introducing several additional constraints, we solve the following problem:

\[ \min_{\hat{A}_1, \hat{A}_2 \in \mathcal{U}} \frac{\sum_{k=1} (p \otimes p)^{T} \left( A_1 \otimes \hat{A}_1 + A_2 \otimes \hat{A}_2 \right)^{k} (q \otimes q)}{\sum_{k=1} (\mathbf{1} \otimes p)^{T} \left( M_1 \otimes (A_1 + A_2) + M_2 \otimes (\hat{A}_1 + \hat{A}_2) \right)^{k} (\mathbf{1} \otimes q)} \]

\[ \text{s.t.} \quad \| A_1 - \hat{A}_1 \|_F^2 + \| A_2 - \hat{A}_2 \|_F^2 = 2 \]

Constraints:

Ø The optimized matrices are transition matrices;
Ø Only allows one transition substitution to be applied to one of the transition matrices;
Ø Perturbed DFA remains strongly connected;
Ø The set of accepted states remains the same;
Ø Prevent changes to the absorbing states.
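For small DFAs the constrained problem can be explored by exhaustive search, which is sketched below. This is not the optimization procedure of the slides: it simply tries every single transition substitution, discards candidates that break the constraints, and keeps the one with the lowest truncated IOU. All function names, the length cap `max_len`, the literal reading of "strongly connected", and the example DFA are assumptions.

```python
def count_iou(delta_a, delta_b, start, accept, symbols, max_len):
    """Truncated IOU of two DFAs sharing start state and accept set."""
    counts = {(start, start): 1}
    both = either = 0
    for _ in range(max_len):
        nxt = {}
        for (sa, sb), c in counts.items():
            for s in symbols:
                key = (delta_a[(sa, s)], delta_b[(sb, s)])
                nxt[key] = nxt.get(key, 0) + c
        counts = nxt
        for (sa, sb), c in counts.items():
            if sa in accept and sb in accept:
                both += c
            if sa in accept or sb in accept:
                either += c
    return both / either if either else 1.0


def strongly_connected(delta, states, symbols):
    """True if every state reaches every other state in the transition graph."""
    def reach(src, edges):
        seen, stack = {src}, [src]
        while stack:
            u = stack.pop()
            for v in edges[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    fwd = {q: {delta[(q, s)] for s in symbols} for q in states}
    bwd = {q: {u for u in states if q in fwd[u]} for q in states}
    root = next(iter(states))
    return reach(root, fwd) == set(states) and reach(root, bwd) == set(states)


def most_important_transition(delta, states, symbols, start, accept, max_len=10):
    """Return (lowest IOU, (state, symbol, new target)) over single substitutions."""
    best = (1.0, None)
    for (q, s), old in delta.items():
        if all(delta[(q, t)] == q for t in symbols):
            continue  # absorbing state: leave its transitions unchanged
        for new in states:
            if new == old:
                continue
            perturbed = dict(delta)
            perturbed[(q, s)] = new  # exactly one transition is substituted
            if not strongly_connected(perturbed, states, symbols):
                continue
            score = count_iou(delta, perturbed, start, accept, symbols, max_len)
            if score < best[0]:
                best = (score, (q, s, new))
    return best


# Example DFA (an assumption): binary multiples of 3, state = remainder mod 3.
states, symbols = [0, 1, 2], "01"
delta = {(r, b): (2 * r + int(b)) % 3 for r in states for b in symbols}
score, change = most_important_transition(delta, states, symbols, 0, {0})
print(score, change)
```

The accept set is passed unchanged to both automata, which matches the constraint that the set of accepted states remains the same. The search costs one IOU evaluation per candidate substitution, so it is only practical for DFAs with few states.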

slide-10
SLIDE 10

Critical Pattern & Synchronizing Word

Another view to investigate the characteristics of a DFA: critical pattern

Definition (Critical Pattern)

Absolute pattern:

\[ \hat{m} = \arg\max_{m} \left| \Pr_{m \sim y}(y \in P) - \Pr_{m \sim y}(y \in N) \right| \]

Relative pattern:

\[ \hat{m} = \arg\max_{m} \left| \Pr_{y \in P}(m \sim y) - \Pr_{y \in N}(m \sim y) \right| \]

$m \sim y$ indicates that $m$ is a factor of $y$.

We will focus on the absolute pattern.

Definition (Perfect Absolute Pattern)

\[ \hat{m} = \arg\min_{m \in A} |m|, \quad \text{where } A = \left\{ m \;\middle|\; \max_{m} \left| \Pr_{m \sim y}(y \in P) - \Pr_{m \sim y}(y \in N) \right| = 1 \right\}. \]

Comparison of two patterns

           Set P   Set N
with m     a       b
w/o m      c       d

a = # strings in P with m; b = # strings in N with m; c = # strings in P w/o m; d = # strings in N w/o m.

AP: $\frac{|a - b|}{a + b}$

RP: $\left| \frac{a}{a + c} - \frac{b}{b + d} \right|$
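The two scores can be computed directly from the four counts a, b, c, d. The sketch below is illustrative only: the function name `pattern_scores`, the choice of Tomita grammar 4 (strings without "000") as the example, and the cap of length 6 on the enumerated strings are assumptions.

```python
from itertools import product


def pattern_scores(pattern, positives, negatives):
    """Return (AP, RP) for a candidate pattern over two disjoint string sets.

    'pattern in y' tests whether the pattern is a factor (contiguous
    substring) of y.
    """
    a = sum(pattern in y for y in positives)   # strings in P with m
    b = sum(pattern in y for y in negatives)   # strings in N with m
    c = len(positives) - a                     # strings in P w/o m
    d = len(negatives) - b                     # strings in N w/o m
    ap = abs(a - b) / (a + b) if a + b else 0.0
    rp = abs(a / (a + c) - b / (b + d)) if positives and negatives else 0.0
    return ap, rp


# Example sets (an assumption): all binary strings of length 1..6, split by
# Tomita grammar 4, which accepts any string not containing "000".
words = ["".join(w) for n in range(1, 7) for w in product("01", repeat=n)]
P = [w for w in words if "000" not in w]
N = [w for w in words if "000" in w]

for m in ["0", "00", "000"]:
    print(m, pattern_scores(m, P, N))
```

For this grammar the pattern "000" never occurs in P and always occurs in N, so both scores reach 1, which is the situation described by a perfect absolute pattern.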

slide-11
SLIDE 11

Critical Pattern & Synchronizing Word

Ø A perfect absolute pattern describes a substring which has minimal length among all absolute patterns and perfectly differentiates the strings from different disjoint sets;
Ø Only the polynomial and exponential classes have perfect absolute patterns; [5]
Ø An absorbing state naturally fits this synchronizing scheme. As such, we can set the absorbing state as the state to be synchronized.

Synchronizing word

  • Theorem. The length of a perfect absolute pattern of a DFA with n states is at most $n(n - 1)/2$.

  • Theorem. The length of an absolute pattern of a 5-state DFA is at most 9.

[Figure: three example DFAs over the alphabet {a, b} with 3, 4 and 5 states. Pattern: 'bab'; Pattern: 'babaab'; Pattern: 'babaabaab'.]
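Finding a shortest word that sends every state to the absorbing state can be sketched as a breadth-first search over subsets of states. This is a generic textbook-style search, not necessarily the algorithm used by the authors; the function name `shortest_sync_word` and the 3-state example DFA are hypothetical and are not claimed to be the automaton from the figure.

```python
from collections import deque


def shortest_sync_word(delta, states, symbols, target):
    """Shortest word mapping every state to `target`, or None if none exists.

    BFS over subsets: start from the full state set and apply one symbol at a
    time until the subset collapses to {target}.
    """
    start = frozenset(states)
    goal = frozenset({target})
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        subset, word = queue.popleft()
        if subset == goal:
            return word
        for s in symbols:
            nxt = frozenset(delta[(q, s)] for q in subset)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + s))
    return None


# Hypothetical 3-state DFA over {a, b}; state 3 is absorbing.
delta = {
    (1, "a"): 2, (1, "b"): 1,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 3, (3, "b"): 3,
}
word = shortest_sync_word(delta, [1, 2, 3], "ab", target=3)
print(word)  # bab
```

For this example the result has length 3, which equals the bound $n(n-1)/2$ for n = 3. The subset search can visit up to $2^n$ subsets, so it is only a sketch for small automata.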

slide-12
SLIDE 12

Evaluation

Adversarial Models

[Figure: the DFAs for grammars G3, G5 and G7.]

Grammar     Optimized value of IOU   Random value of IOU
Grammar 3   1.48e-3                  0.342
Grammar 5   0.152                    0.289
Grammar 7   0.025                    0.225

Critical Patterns (G7)

Pattern   Con.   Prob.
ab        0.6    0.674
bab       0.8    0.912
abab      1      1

Probability difference and confidence have positive correlation.

slide-13
SLIDE 13

Summary & Future Work

Summary

Ø This work extends the sample-level analysis framework proposed in prior work for feed-forward neural networks to a model-level analysis scheme, and furthermore studies the transition importance of DFA under this scheme.
Ø This work defines the critical pattern to identify an individual DFA and proposes a synchronizing algorithm to effectively find the critical pattern. Furthermore, this work provides some theoretical analysis of the minimal length of the defined critical pattern.
Ø This work will facilitate the research on the security of cyber-physical systems that are based on working DFAs.

Future Work

Understand more complex models and DFAs used in real applications.

slide-14
SLIDE 14

References

[1] Hopcroft, J. E., Motwani, R., & Ullman, J. D. (2001). Introduction to automata theory, languages, and computation. ACM SIGACT News, 32(1), 60-65.

[2] Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.

[3] Tomita, M. (1982). Learning of Construction of Finite Automata from Examples Using Hill-Climbing. RR: Regular Set Recognizer (No. CMU-CS-82-127).

[4] Rystsov, I. C. (2004). Černý's conjecture: retrospects and prospects. Proc. Worksh. Synchronizing Automata, Turku (WSA 2004).

[5] Wang, Q., Zhang, K., Ororbia II, A. G., Xing, X., Liu, X., & Giles, C. L. (2018). A Comparative Study of Rule Extraction for Recurrent Neural Networks. arXiv preprint arXiv:1801.05420.

slide-15
SLIDE 15

Q & A

Thanks!

If you are interested in more details, please contact: kuz22@psu.edu