Adversarial Models for Deterministic Finite Automata
- K. Zhang1, Q. Wang2 and C. Lee Giles1.
1 Information Sciences and Technology at Pennsylvania State University, United States. 2 School of Computer Science at McGill University, Canada.
Ø Adversarial Model & Transition Importance
Ø Critical Pattern & Synchronizing Word
Ø Evaluation
DFAs are widely used, e.g., in compilers and language processing systems. [1] However, our understanding of DFAs remains at a relatively coarse-grained level; it is crucial to explore their fine-grained characteristics.
Prior adversarial analysis is typically performed at the sample level of a learning model, [2] and there is little research studying sensitivity from a model-level perspective.
Finding critical patterns is closely related to the synchronizing word. The bound on the length of the shortest synchronizing word, the subject of the Černý conjecture, remains open.
Deterministic Finite Automata (DFA)
An example DFA accepts only binary numbers that are multiples of 3.
[Figure: state diagram of the example three-state DFA.]

A DFA M can be described by a five-tuple {Σ, R, δ, r₀, G}:
Ø Σ is the input alphabet;
Ø R is a finite, non-empty set of states;
Ø r₀ represents the initial state;
Ø G represents the set of accept states;
Ø δ denotes the set of deterministic production rules.
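As a sketch, the five-tuple can be encoded directly in code. The following is illustrative (the uppercase names standing in for Σ, R, r₀, G, δ are ours), using the multiples-of-3 example above:

```python
# Minimal sketch of the five-tuple DFA from the text; names are illustrative.
# The example DFA accepts binary strings that encode multiples of 3.

SIGMA = {"0", "1"}                      # input alphabet
R = {0, 1, 2}                           # states: remainder of the prefix mod 3
R0 = 0                                  # initial state
G = {0}                                 # accept states (remainder 0)
DELTA = {(r, a): (2 * r + int(a)) % 3   # production rules: appending bit a
         for r in R for a in SIGMA}     # sends value v to 2v + a (mod 3)

def accepts(word: str) -> bool:
    """Run the DFA on `word` and report acceptance."""
    state = R0
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in G

print([w for w in ["0", "11", "110", "101", "1001"] if accepts(w)])
```

Tracking only the remainder mod 3 keeps the state set finite, which is exactly what makes the language regular.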
Tomita Grammars (benchmark regular grammars) [3]:

Grammar 1: 1*
Grammar 2: (10)*
Grammar 3: an odd number of consecutive 1s is always followed by an even number of consecutive 0s
Grammar 4: any string not containing "000" as a substring
Grammar 5: even number of 0s and even number of 1s
Grammar 6: the difference between the number of 0s and the number of 1s is a multiple of 3
Grammar 7: 0*1*0*1*
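A hedged sketch of one of these grammars as executable code: grammar 5 (even number of 0s and even number of 1s), with the parity pair of counts serving as the DFA state (a four-state machine; the function name is ours):

```python
# Illustrative DFA for Tomita grammar 5: accept iff the string contains an
# even number of 0s AND an even number of 1s. The state is the parity pair
# (count of 0s mod 2, count of 1s mod 2); accept state is (0, 0).

def tomita5(word: str) -> bool:
    p0, p1 = 0, 0                 # parities of 0s and 1s seen so far
    for c in word:
        if c == "0":
            p0 ^= 1
        else:
            p1 ^= 1
    return (p0, p1) == (0, 0)     # both parities even

print([w for w in ["", "01", "0011", "0101", "000"] if tomita5(w)])
```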
Vulnerability of neural networks: the adversarial sample problem plagues most statistical and machine learning models. [2]
Affected Applications:
Ø Image recognition;
Ø Sentiment analysis;
Ø Malware analysis: manipulation of system calls;
Ø Threats to security-critical applications, e.g., automated driving, cyber-security, medical diagnosis, etc.

\hat{y} = \arg\max_{\hat{y}\,:\,E(y, \hat{y}) < \zeta} M(\hat{y}, g)

where \hat{y}: data points that can "trick" the model into making incorrect predictions; \zeta: the perturbation/manipulation budget, often "tiny" w.r.t. a certain distance metric E; M: loss function; g: the model under attack.
Ø Open a discussion on model-level analysis and introduce a general scheme for adversarial models; study the transition importance of a DFA through model-level perturbations.
Ø Study critical patterns that can be used for identifying a specific DFA; develop an algorithm for finding the critical patterns of a DFA by transforming this task into a DFA synchronizing problem; provide a theoretical approach for estimating the length of any existing perfect pattern.
Ø The analysis of DFA models will help research on the security of cyber-physical systems that are based on working DFAs, e.g., compilers, VLSI design, elevators, and ATMs.
In this paper, we aim to study individual DFAs for their fine-grained characteristics, including transition importance and critical patterns.

[Diagram: main idea — transition importance is studied via the adversary model; critical patterns are found via synchronizing words.]
In order to directly gain a better understanding of a DFA, we follow a similar approach but study the sensitivity of a DFA through model-level perturbations.

Definition (Synchronizing Word): a word over the input alphabet of the DFA which sends every state of the DFA to one and the same state. [4]
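The synchronizing-word definition is easy to check directly; a minimal sketch with an illustrative three-state automaton (the `delta[state][symbol]` encoding and the example machine are ours):

```python
# A word w is synchronizing when running it from every state lands in one
# and the same state. delta[state][symbol] encodes the transition function.

def is_synchronizing(delta, word):
    ends = set()
    for state in delta:
        for sym in word:
            state = delta[state][sym]
        ends.add(state)
    return len(ends) == 1          # all runs collapsed to a single state

# Toy three-state automaton: 'a' cycles the states, 'b' merges 0 into 1.
delta = {0: {"a": 1, "b": 1}, 1: {"a": 2, "b": 1}, 2: {"a": 0, "b": 2}}
print(is_synchronizing(delta, "baab"), is_synchronizing(delta, "a"))
```

Note that a word built only from permutations of the states (like 'a' alone here) can never synchronize, since permutations never merge states.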
Here we propose to transform the adversarial example problem into the adversarial model problem, which considers model-level perturbations. To quantitatively evaluate the difference between the sets of strings accepted by two different DFAs, we introduce the following metric:
J(M, \hat{M}) = \frac{|Y \cap \hat{Y}|}{|Y \cup \hat{Y}|} \qquad \text{(intersection over union, IOU)}

\hat{g} = \arg\max_{\hat{g}\,:\,|g - \hat{g}| < \zeta}\; \sum_{y \in Y} M(y, \hat{g})

Theorem. The IOU can be computed from the transition matrices of the two DFAs:

J(M, \hat{M}) = \Big(\sum_{k} (p \otimes p)\,(A_0 \otimes \hat{A}_0 + A_1 \otimes \hat{A}_1)^{k}\,(q \otimes q)\Big)\Big(\sum_{k} p\,(A_0 + A_1)^{k}\,q + \sum_{k} p\,(\hat{A}_0 + \hat{A}_1)^{k}\,q - \sum_{k} (p \otimes p)\,(A_0 \otimes \hat{A}_0 + A_1 \otimes \hat{A}_1)^{k}\,(q \otimes q)\Big)^{-1}

where p is the initial-state indicator vector, q the accept-state indicator vector, and A_0, A_1 (resp. \hat{A}_0, \hat{A}_1) the transition matrices of the original (resp. perturbed) DFA for input symbols 0 and 1. The numerator counts |Y \cap \hat{Y}| via the Kronecker-product (product) automaton; the denominator is |Y \cup \hat{Y}| = |Y| + |\hat{Y}| - |Y \cap \hat{Y}|.
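A minimal sketch of this matrix computation (names are ours; the infinite sums are truncated at a cutoff string length L, which the theorem's closed form avoids):

```python
import numpy as np

# Count accepted strings with powers of A0 + A1; count the intersection with
# the Kronecker-product automaton. p = initial-state indicator (row vector),
# q = accept-state indicator. Sums are truncated at length L.

def count_accepted(A0, A1, p, q, L):
    """sum_k p (A0 + A1)^k q = number of accepted strings of length <= L."""
    T, v, total = A0 + A1, p.astype(float), 0.0
    for _ in range(L + 1):
        total += v @ q
        v = v @ T
    return total

def iou(M, M_hat, L=12):
    A0, A1, p, q = M
    B0, B1, r, s = M_hat
    # Product automaton: both DFAs read the same string in lockstep.
    inter = count_accepted(np.kron(A0, B0), np.kron(A1, B1),
                           np.kron(p, r), np.kron(q, s), L)
    union = (count_accepted(A0, A1, p, q, L)
             + count_accepted(B0, B1, r, s, L) - inter)
    return inter / union if union else 1.0

# Transition matrices of the multiples-of-3 DFA (state = value mod 3).
A0 = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])   # reading '0': v -> 2v
A1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])   # reading '1': v -> 2v + 1
p = np.array([1, 0, 0]); q = np.array([1, 0, 0])
M = (A0, A1, p, q)
print(iou(M, M))   # identical DFAs accept the same set -> 1.0
```

Because the DFA is deterministic and complete, each row of A_0 and A_1 has exactly one 1, so p (A_0 + A_1)^k q counts accepted strings of length k exactly.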
The theorem directly provides a formulation of the adversarial model problem for a DFA as an optimization:

\min_{\hat{A}_0, \hat{A}_1 \in \mathcal{U}}\; J(M, \hat{M}) \quad \text{s.t.} \quad \|A_0 - \hat{A}_0\|_F^2 + \|A_1 - \hat{A}_1\|_F^2 = 2

where the constraint enforces exactly one transition substitution (one entry flips from 1 to 0 and another from 0 to 1).
Constraints (defining the feasible set 𝒰):
Ø The optimized matrices are valid transition matrices;
Ø Only one transition substitution may be applied to one of the transition matrices;
Ø The perturbed DFA remains strongly connected;
Ø The set of accept states remains the same;
Ø Absorbing states are left unchanged.
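A brute-force sketch of this feasible set (function names and the connectivity check are ours; the paper's actual search procedure may differ): enumerate every single-transition substitution, skip absorbing states, and keep only perturbations that stay strongly connected.

```python
import numpy as np

# Enumerate DFAs one transition substitution away (squared Frobenius
# distance 2), leaving absorbing states alone and requiring strong
# connectivity. Each candidate's IOU would then be evaluated separately.

def strongly_connected(A0, A1):
    n = A0.shape[0]
    adj = np.clip(A0 + A1, 0, 1)
    reach = sum(np.linalg.matrix_power(adj, k) for k in range(n))
    return bool(np.all(reach > 0))     # every state reaches every state

def candidates(A0, A1):
    n = A0.shape[0]
    absorbing = {i for i in range(n) if A0[i, i] == 1 and A1[i, i] == 1}
    out = []
    for which, A in ((0, A0), (1, A1)):
        for i in range(n):
            if i in absorbing:
                continue                      # never touch absorbing states
            j = int(np.argmax(A[i]))          # current target of this row
            for t in range(n):
                if t == j:
                    continue
                B = A.copy()
                B[i, j], B[i, t] = 0, 1       # one transition substitution
                pair = (B, A1) if which == 0 else (A0, B)
                if strongly_connected(*pair):
                    out.append(pair)
    return out

# Example: transition matrices of the multiples-of-3 DFA.
A0 = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])
A1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
print(len(candidates(A0, A1)))   # number of feasible single-edit perturbations
```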
For a candidate pattern m over the positive set P and the negative set N:

          with m    w/o m
Set P       a         c
Set N       b         d

a = # strings in P with m;  b = # strings in N with m;
c = # strings in P w/o m;   d = # strings in N w/o m.
Another view for investigating the characteristics of a DFA: the critical pattern.

Definition (Critical Pattern)
Absolute pattern:
m^{*} = \arg\max_{m} \big|\Pr_{m \prec y}(y \in P) - \Pr_{m \prec y}(y \in N)\big|
Relative pattern:
m^{*} = \arg\max_{m} \big|\Pr_{y \in P}(m \prec y) - \Pr_{y \in N}(m \prec y)\big|
Here m \prec y indicates that m is a factor (substring) of y.
We will focus on the absolute pattern.

Definition (Perfect Absolute Pattern)
m^{*} = \arg\min_{m \in B^{*}} |m|
where B^{*} = \{\, m : \big|\Pr_{m \prec y}(y \in P) - \Pr_{m \prec y}(y \in N)\big| = 1 \,\}.

In terms of the contingency counts:
AP = \frac{|a - b|}{a + b}, \qquad RP = \Big|\frac{a}{a + c} - \frac{b}{b + d}\Big|
Comparison of two patterns
Ø A perfect absolute pattern is a substring of minimal length among all absolute patterns that perfectly differentiates the strings of the two disjoint sets;
Ø Only grammars in the polynomial and exponential classes have perfect absolute patterns; [5]
Ø An absorbing state naturally fits the synchronizing scheme, so we set the absorbing state as the state to be synchronized.
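The AP and RP scores can be computed directly from the contingency counts a, b, c, d; a toy sketch with illustrative string sets (names are ours):

```python
# AP = |a - b| / (a + b); RP = |a/(a+c) - b/(b+d)|, where a/b count strings
# in P/N containing pattern m, and c/d count strings in P/N without it.

def ap_score(a, b, c, d):
    """Absolute pattern score."""
    return abs(a - b) / (a + b) if a + b else 0.0

def rp_score(a, b, c, d):
    """Relative pattern score."""
    return abs(a / (a + c) - b / (b + d))

def counts(pattern, P, N):
    a = sum(pattern in y for y in P); c = len(P) - a
    b = sum(pattern in y for y in N); d = len(N) - b
    return a, b, c, d

# Toy sets: 'aa' occurs only in positive strings -> a perfect absolute pattern.
P, N = ["aa", "baa", "aab"], ["ab", "ba", "bb"]
print(ap_score(*counts("aa", P, N)))   # -> 1.0
```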
Synchronizing word: for any pair of states of a synchronizing DFA with n states, the length of a shortest word merging that pair is at most n(n − 1)/2.
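A sketch of the classic greedy construction this bound suggests: repeatedly BFS a shortest word merging one pair of the remaining states (at most n(n − 1)/2 distinct pairs, hence the bound per step), until a single state remains. Function names and the example automaton are ours.

```python
from collections import deque

# delta[state][symbol] encodes the transition function.

def merge_pair(delta, s, t):
    """Shortest word sending states s and t to a common state (BFS on pairs)."""
    start = frozenset((s, t))
    seen, queue = {start}, deque([(start, "")])
    while queue:
        pair, word = queue.popleft()
        if len(pair) == 1:
            return word
        for sym in delta[s]:
            nxt = frozenset(delta[u][sym] for u in pair)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + sym))
    return None                    # pair cannot be merged: not synchronizing

def apply_word(delta, state, word):
    for sym in word:
        state = delta[state][sym]
    return state

def synchronizing_word(delta):
    states, word = set(delta), ""
    while len(states) > 1:
        s, t = sorted(states)[:2]
        w = merge_pair(delta, s, t)
        if w is None:
            return None
        word += w
        states = {apply_word(delta, u, w) for u in states}
    return word

# Toy automaton: 'a' cycles the three states, 'b' merges 0 into 1.
delta = {0: {"a": 1, "b": 1}, 1: {"a": 2, "b": 1}, 2: {"a": 0, "b": 2}}
print(synchronizing_word(delta))   # -> baab
```

The greedy concatenation is not guaranteed to be shortest overall, but it always terminates on a synchronizing automaton.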
[Figure: three example automata with 3, 4, and 5 states and their patterns: 'bab', 'babaab', and 'babaabaab'.]
Adversarial Models
[Figure: example DFAs (4 and 5 states) before and after single-transition perturbation.]
Critical Patterns
Optimized vs. random IOU on grammars 3, 5 and 7:

Grammar 3: optimized IOU 1.48e-3, random IOU 0.342
Grammar 5: optimized IOU 0.152,  random IOU 0.289
Grammar 7: optimized IOU 0.025,  random IOU 0.225

Critical patterns for Grammar 7:

Pattern 'ab':   Con. 0.6, Prob. 0.674
Pattern 'bab':  Con. 0.8, Prob. 0.912
Pattern 'abab': Con. 1,   Prob. 1

Probability difference and confidence are positively correlated.
Ø This work extends the sample-level analysis framework proposed in prior work for feed-forward neural networks to a model-level analysis scheme, and studies the transition importance of a DFA.
Ø This work defines the critical pattern for identifying an individual DFA and proposes a synchronizing algorithm that effectively finds critical patterns; it also provides a theoretical analysis of the minimal length of the defined critical pattern.
Ø This work will facilitate research on the security of cyber-physical systems that are based on working DFAs.
Future work: understand more complex models and the DFAs used in real applications.
[1] Hopcroft, J. E., Motwani, R., & Ullman, J. D. (2001). Introduction to Automata Theory, Languages, and Computation. ACM SIGACT News, 32(1), 60-65.
[2] Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
[3] Tomita, M. (1982). Learning of construction of finite automata from examples using hill-climbing. Proceedings of the Fourth Annual Conference of the Cognitive Science Society.
[4] Rystsov, I. C. (2004). Černý's conjecture: retrospects and prospects. Proc. Workshop on Synchronizing Automata (WSA 2004), Turku.
[5] Wang, Q., Zhang, K., Ororbia II, A. G., Xing, X., Liu, X., & Giles, C. L. (2018). A comparative study of rule extraction for recurrent neural networks. arXiv preprint arXiv:1801.05420.
If you are interested in more details, please contact: kuz22@psu.edu