Coverage-based Greybox Fuzzing as Markov Chain
Marcel Böhme, Van-Thuan Pham, Abhik Roychoudhury
School of Computing, NUS, Singapore

FM Update 2018
Presented by Raveendra Kumar M, Animesh Basak Chowdhury
TCS Research
July 27, 2018

Some of the slides are adapted from the author's presentation.

Introduction

Fuzz testing is an automated testing technique that uncovers software errors by executing the target program with a large number of randomly generated test inputs.

Three main approaches:
◮ Black-box fuzzing: random testing [1].
◮ White-box fuzzing: SAGE [2].
◮ Grey-box fuzzing: American Fuzzy Lop [3].

[1] Miller et al., An empirical study of Unix utilities, CACM, 1990.
[2] Godefroid et al., Automated whitebox fuzz testing, NDSS, 2008.
[3] Zalewski, http://lcamtuf.coredump.cx/afl/.

Grey-box fuzzing

Black-box fuzzing → open-loop control system.
Grey-box fuzzing → closed-loop control system.
Feedback function $H(s)$ ∼ branch-pair coverage (a branch pair is a pair of consecutive nodes in a CFG).

Feedback loop:
1. Instrument the target program $P$ to obtain the instrumented program $P'$.
2. Generate a new input $t_g$ from some $t \in T_G$.
3. Execute $P'$ with $t_g$ and monitor coverage.
4. Interesting behaviour? Yes: retain $t_g$, $T_G = T_G \cup \{t_g\}$. No: discard $t_g$.

Grey-box fuzzing – Working example

Program under test (entry node $r_m$, exit node $e_m$). Letters A to F name the branches of the CFG:

    i = 0; cb = 0
    read(fd, inp, 20)
    while (inp[i] != '\0')      true: A, false: D
        if (inp[i] == 'b')      true: B, false: C
            cb = cb + 1
        i = i + 1
    if (cb ≥ 5)                 true: E, false: F
        abort()
    return cb

Tree of test inputs (circled numbers are the Ids of retained inputs; inputs without an Id add no new coverage and are discarded):

    ① "a"
        ② "b"   → mutants "c", "bb" ④, …
        ③ "ab"  → mutants "aba" ⑤, "abb", …
        "c"

Visit counts of branch pairs ("-" = not visited):

    Id  input   AB  AC  BA  CA  BD  CD  DE  DF
    1   "a"     -   1   -   -   -   1   -   1
    2   "b"     1   -   -   -   1   -   -   1
    3   "ab"    1   1   -   1   1   -   -   1
    -   "c"     -   1   -   -   -   1   -   1
    4   "bb"    2   -   1   -   1   -   -   1
    5   "aba"   1   2   1   1   -   1   -   1
    -   "abb"   2   1   1   1   1   -   -   1

Grey-box fuzzing algorithm

Algorithm 1 Grey-box fuzzing algorithm
Require: Program P, initial non-crashing seeds I_s.
Ensure: Set of crashing inputs T_C and a tree of test inputs T_G for P.

    T_G = I_s
    Run P with I_s and observe visit counts of branch pairs.
    repeat
        t = getNextInput()                        ⊲ t ∈ T_G
        N = assignEnergy(t)
        T_m = fuzzTestInput(t, N)                 ⊲ T_m: { t_g | t_g ∈ MUTATE(t) }
        for all t_g ∈ T_m do
            S_g = run(P, t_g)
            if S_g = ⊥ then                       ⊲ Did t_g cause a crash or hang?
                T_C.add(t_g)
            else if isInterestingTestInput(t_g, S_g) then
                T_G.add(t_g)                      ⊲ Retain interesting test input
            end if
        end for
    until user interrupt received
    return (T_G, T_C)
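A minimal Python sketch of Algorithm 1, applied to the toy program of the working example, may help make the loop concrete. Several parts are assumptions made for illustration and are not taken from the slides: the three-letter alphabet, the `mutate` operator, the constant energy, the round-robin seed selection, and the rule that an input is interesting when it produces a (branch pair, visit count) combination not seen before. AFL itself groups visit counts into buckets, which this sketch does not do.

```python
import random
from collections import Counter

ALPHABET = "abc"  # assumption: tiny alphabet so the toy run finishes quickly


def run(inp):
    """Toy program P': return branch-pair visit counts, or None (⊥) on abort()."""
    branches, cb = [], 0
    for ch in inp[:20]:            # read(fd, inp, 20)
        branches.append("A")       # loop condition true
        if ch == "b":
            branches.append("B")   # inp[i] == 'b' is true
            cb += 1
        else:
            branches.append("C")   # inp[i] == 'b' is false
    branches.append("D")           # loop condition false
    if cb >= 5:
        return None                # branch E: abort()
    branches.append("F")
    return Counter(x + y for x, y in zip(branches, branches[1:]))


def is_interesting(coverage, seen):
    """True if some (branch pair, visit count) combination was never observed."""
    new = set(coverage.items()) - seen
    seen |= new
    return bool(new)


def mutate(t, rng):
    """Hypothetical mutator: overwrite one character, or append one."""
    pos = rng.randrange(len(t) + 1)
    return t[:pos] + rng.choice(ALPHABET) + t[pos + 1:]


def greybox_fuzz(seeds, rounds, rng, energy=8):
    """Algorithm 1 with a fixed number of rounds instead of a user interrupt."""
    queue, crashes, seen = list(seeds), [], set()
    for s in queue:                        # seeds are assumed non-crashing
        is_interesting(run(s), seen)
    for i in range(rounds):
        t = queue[i % len(queue)]          # getNextInput: round-robin
        for _ in range(energy):            # assignEnergy: constant N here
            t_g = mutate(t, rng)
            s_g = run(t_g)
            if s_g is None:
                crashes.append(t_g)        # T_C.add(t_g)
            elif is_interesting(s_g, seen):
                queue.append(t_g)          # T_G.add(t_g)
    return queue, crashes
```

Calling `greybox_fuzz(["a"], 50, random.Random(0))` returns the retained inputs and any crashing inputs found within the budget. Feeding the inputs "a", "b", "ab", "c", "bb", "aba", "abb" to `is_interesting` in that order retains the five inputs that carry an Id in the working example and rejects "c" and "abb".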

N = assignEnergy(t)

Let N = 100.
Let N_1 = N × a factor inversely proportional to t's execution time (ranging from 0.1 for long execution times to 3 for short execution times).
Let N_2 = N_1 × a factor based on the number of branch pairs covered by t (ranging from 0.25 for low coverage to 3 for high coverage).
Let N_3 = N_2 × a factor based on the cycle in which t was discovered and the number of times t has been fuzzed (low = 1 to high = 4).
Let N_4 = N_3 × a factor based on the depth at which t was discovered (low = 1 to high = 5).
return N_4
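The chain of multiplications can be sketched in Python as follows. The slide gives only the range of each factor, not how a factor is computed from the measurements, so this sketch takes the four factors as inputs and clamps them to the stated ranges. The function name, parameter names, and the clamping are assumptions for illustration.

```python
def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def assign_energy(time_factor, coverage_factor, cycle_factor, depth_factor, base=100):
    """Multiply the base energy by four factors, each limited to the slide's range.

    How each factor is derived from execution time, coverage, discovery cycle,
    and depth is not specified on the slide; the caller supplies them here.
    """
    n1 = base * clamp(time_factor, 0.1, 3.0)     # faster inputs get more energy
    n2 = n1 * clamp(coverage_factor, 0.25, 3.0)  # higher coverage gets more energy
    n3 = n2 * clamp(cycle_factor, 1.0, 4.0)      # discovery cycle / times fuzzed
    n4 = n3 * clamp(depth_factor, 1.0, 5.0)      # depth of discovery
    return n4
```

With all factors at 1 the energy stays at the base value of 100. The stated ranges bound the result between 100 × 0.1 × 0.25 = 2.5 and 100 × 3 × 3 × 4 × 5 = 18000.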

Problem Statement

    void crashme(char *s) {
        if (s[0] == 'b')
            if (s[1] == 'a')
                if (s[2] == 'd')
                    if (s[3] == '!')
                        abort();
    }

Listing 1: Program crashes when string s == "bad!"

Black-box fuzzing
◮ Assumption: each character takes one of $2^8$ values.
◮ Expected number of test cases required to catch the bug: $2^{32}$.

Coverage-based grey-box fuzzing (CGF)
◮ Markov chain modeling of CGF gives the expectation that $2^{12}$ tests are required to catch the crash.
◮ Current CGF algorithms do not judiciously assign energy to interesting test vectors for further fuzzing.

Objective
Tune the energy assignment scheme to be close to ideal.
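The two expectations can be derived in a few lines. This is a sketch under stated assumptions: the black-box fuzzer draws all four characters independently and uniformly from $2^8$ values, while the grey-box fuzzer mutates one of the four positions (chosen uniformly) to a uniformly random character and always continues from the seed that has matched the longest prefix of "bad!".

```latex
\begin{align*}
E[\text{tests, black-box}] &= \frac{1}{\left(2^{-8}\right)^{4}} = 2^{32},\\[4pt]
E[\text{tests, CGF}] &= 4 \cdot \frac{1}{\tfrac{1}{4}\cdot 2^{-8}}
                      = 4 \cdot 2^{10} = 2^{12}.
\end{align*}
```

Each of the four nested branches is passed with probability $\tfrac{1}{4} \cdot 2^{-8} = 2^{-10}$ per mutation, so each stage takes $2^{10}$ tests in expectation, and the four stages add up to $2^{12}$.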

Some terminologies

Branch pair tuple $BP_i$: $\langle bp_i, C_i \rangle$, where $bp_i$ is branch pair $i$ and $C_i$ is its visit count.

Path: sequence of branch pair tuples $[BP_i, BP_j, \dots]$ visited during the execution of the program $P$ on a test vector $t$.

Basic Concepts: Probabilistic Modeling

Random variable: maps possible outcomes from the sample space to a real-valued number, $X : \Omega \to \mathbb{R}$.

Conditional probability: the probability of an event happening, given partial information, $P(B \mid A) = P(B \cap A) / P(A)$.

Stochastic process: a collection of random variables indexed by time.

Discrete Time Stochastic Process (DTSP)

Sequence of random variables $X_0, X_1, X_2, \dots$, denoted by $\{X_n\}$.
Time: $n = 0, 1, 2, \dots$
State space: $m$-dimensional vector $s = (s_1, s_2, \dots, s_m)$, the set of all values that the $X_n$ can take.
Also, $X_n$ takes one of $m$ values, so $X_n \leftrightarrow s$.

Discrete Time Markov Chain (DTMC)

A DTSP is a discrete-time Markov chain (DTMC) iff
$$P[X_{n+1} = j \mid X_n = i_n, \dots, X_0 = i_0] = P[X_{n+1} = j \mid X_n = i_n] = P_{ij}(n), \quad i = i_n \quad \text{(Markovian property)}$$

Markov property: the future state is independent of the past, given that the present state is fully known/observable.

$P_{ij}(n)$: probability of transition from state $i$ to state $j$ at time $n$. This is also referred to as the one-step transition probability.

Rat Maze Problem as DTMC

Figure: A rat maze (3 × 3 grid of cells numbered 1 to 9). Allowed transitions are horizontal and vertical neighbors.

Figure: Markov chain modeling of the rat maze problem. Transition probabilities are 1/2 out of a corner cell, 1/3 out of an edge cell, and 1/4 out of the center cell.
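The transition matrix of the rat maze can be built mechanically. The sketch below assumes the rat moves to each allowed neighbor with equal probability, which is consistent with the 1/2, 1/3, and 1/4 labels in the figure; cells are numbered row by row, and the function name is chosen for illustration.

```python
from fractions import Fraction


def rat_maze_matrix(rows=3, cols=3):
    """Return P where P[i][j] is the probability of moving from cell i+1 to cell j+1."""
    n = rows * cols
    P = [[Fraction(0)] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            # Horizontal and vertical neighbors that lie inside the grid.
            nbrs = [
                (r + dr, c + dc)
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            for nr, nc in nbrs:
                P[r * cols + c][nr * cols + nc] = Fraction(1, len(nbrs))
    return P


P = rat_maze_matrix()
```

Every row of `P` sums to 1, as required of a transition matrix, and the entries do not depend on the time step, so the chain is homogeneous.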

Homogeneous DTMC

A DTMC is homogeneous iff the transition probabilities do not depend on the time $n$, i.e.
$$P[X_{n+1} = j \mid X_n = i] = P[X_1 = j \mid X_0 = i] = P_{ij}.$$

Transition matrix of a homogeneous DTMC: $P = [P_{ij}]_{i,j \in E}$, for example
$$P = \begin{pmatrix}
p_{1,1} & p_{1,2} & p_{1,3} & p_{1,4} \\
p_{2,1} & p_{2,2} & p_{2,3} & p_{2,4} \\
p_{3,1} & p_{3,2} & p_{3,3} & p_{3,4} \\
p_{4,1} & p_{4,2} & p_{4,3} & p_{4,4}
\end{pmatrix}$$

Coverage-Based Fuzzing as Homogeneous DTMC

Coverage-based greybox fuzzing can be modeled as a time-homogeneous DTMC.

State space: $S = S^+ \cup S^-$.
$S^+$: paths already explored by the seeds $T_G$.
$S^-$: paths yet to be discovered by fuzzing $t \in T_G$.

Assumption: the probability of exercising an undiscovered path $i$ by fuzzing an already generated input $t_j$ is the same as the probability of creating test input $t_j$ from test vector $t_i$, i.e. $p_{ji} = p_{ij}$.

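Coverage-based Greybox Fuzzing as Markov Chain: Example

    void crashme(char* s) {
        if (s[0] == 'b')
            if (s[1] == 'a')
                if (s[2] == 'd')
                    if (s[3] == '!')
                        abort();
    }

Defining the coverage-based fuzzer:
• Start with a seed that is a random 4-letter word.
• Given a seed, the fuzzer chooses a letter (probability $1/4$) and substitutes it (probability $2^{-8}$ per value).

Figure: Markov chain over the paths ****, b***, ba**, bad*, bad!.
Self-loop probabilities: **** has $1 - 2^{-10}$, b*** has $3/4$, ba** has $1/2 + 2^{-10}$, bad* has $1/4 + 2^{-9}$.
Each forward transition (**** → b*** → ba** → bad* → bad!) has probability $2^{-10}$; backward transitions have probability $1/4 - 2^{-10}$.

Presented by Marcel Böhme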
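The transition probabilities of this chain can be checked by exhaustive enumeration. The sketch below assumes the fuzzer picks one of the four positions uniformly and overwrites it with one of 256 byte values uniformly (possibly the same value), and that a state is the length of the prefix of "bad!" matched by the input. The representative seeds and all function names are chosen for illustration.

```python
from collections import Counter
from fractions import Fraction

TARGET = b"bad!"


def path_state(s):
    """Number of leading bytes of s that match TARGET (0 = ****, 4 = bad!)."""
    n = 0
    for got, want in zip(s, TARGET):
        if got != want:
            break
        n += 1
    return n


def transition_row(seed):
    """Exact distribution over states after one single-byte substitution of seed."""
    counts = Counter()
    for pos in range(len(seed)):
        for byte in range(256):
            mutant = seed[:pos] + bytes([byte]) + seed[pos + 1:]
            counts[path_state(mutant)] += 1
    total = len(seed) * 256
    return {state: Fraction(c, total) for state, c in counts.items()}


# One representative seed per non-crashing state of the chain.
SEEDS = [b"****", b"b***", b"ba**", b"bad*"]
ROWS = [transition_row(s) for s in SEEDS]
```

Under these assumptions `ROWS[k][k]` gives the self-loop probability of state k, `ROWS[k][k + 1]` the forward probability, and `ROWS[k][j]` for j < k the backward probabilities.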
