Stringer: Measuring the Importance of Static Data Comparisons to Detect Backdoors and Undocumented Functionality
Sam L. Thomas, Tom Chothia, Flavio D. Garcia
School of Computer Science, University of Birmingham, Birmingham, United Kingdom, B15 2TT
{s.l.thomas,t.p.chothia,f.garcia}@cs.bham.ac.uk
European Symposium on Research in Computer Security (ESORICS) 2017
Thomas, Chothia, Garcia Stringer ESORICS 2017 1 / 41

Challenge
How do we reduce the manual effort required to identify undocumented functionality and backdoors within software?

Motivation
Undocumented functionality? Backdoors?
- Authentication bypass by “magic” words.
- Hard-coded credential checks.
- Additional protocol messages that activate unexpected functionality.

Application
Focus on embedded device firmware; it’s a challenging target:
- Lots of devices, lots of firmware.
- Multiple firmware versions for each device.
- Impossible to manually analyse every firmware image.

Stringer

Objective
Identify interesting code structures and static data comparisons that lead to backdoor-like behaviour.
Lightweight analysis.

Method
1. Automatically identify static data comparison functions.
2. A metric for measuring the degree to which the branching of a binary’s functions is influenced by comparisons with static data.

Stringer
For a given binary:
1. Identify all possible static data comparison functions.
2. Label the basic blocks of all functions with the sets of static data sequences that must be matched against to reach them.
3. Using the computed sets, calculate a score for each element of static data: A = 100, B = 200, ...
4. Finally, using the scores for each item of static data, compute a score for each function: f = 300, ...

Identifying Static Data Comparison Functions

Identifying static data comparison functions
Approach based upon concrete observations:
- Analyse calls to static data comparison functions in C/C++ binaries.
- Collect properties that are common amongst them: call-sites, number of arguments, how they influence branching, ...

Motivating Example
HTTP protocol parser from the mini_httpd binary.

Call-site Properties
Argument references: at least one argument refers to the data/read-only data section.

Call-site Properties
Function arity (number of arguments passed): usually 2-3.

Call-site Properties
Branching properties: boolean comparison (i.e. matches or not).

Call-site Properties
Local call frequency: parsers use the same comparison function many times with different static data.

Data Properties
Identify static data properties (with parsers in mind).

Finding Static Data Comparisons
1. For each function, identify blocks that contain function calls.
2. Filter out those blocks where the function call does not influence branching or the comparison condition is not boolean.
3. For each argument, tag what it refers to: data section, read-only data section, or other (e.g. register).
4. Using these assignments, update the likelihood of the function being a comparison function.
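
The four steps above can be sketched roughly as follows. This is only an illustration of the filtering idea, not the authors' implementation: the `CallSite` record, its field names, the hard 2-3 argument cut-off and the "one vote per qualifying call site" tally are all assumptions made for the example, and a real tool would obtain this information from a disassembler.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CallSite:
    """One call instruction, as a disassembler might describe it (hypothetical)."""
    callee: str
    arg_kinds: tuple          # per argument: "data", "rodata" or "other"
    influences_branch: bool   # step 2: the result feeds a conditional branch
    boolean_condition: bool   # step 2: the branch tests "matches or not"


def rank_comparison_candidates(call_sites):
    """Count, per callee, the call sites that look like static data comparisons."""
    votes = Counter()
    for site in call_sites:
        # Step 2: drop calls that do not drive a boolean branch.
        if not (site.influences_branch and site.boolean_condition):
            continue
        # Arity heuristic from the slides (usually 2-3 arguments);
        # treating it as a hard filter is an assumption of this sketch.
        if not 2 <= len(site.arg_kinds) <= 3:
            continue
        # Steps 3-4: at least one argument must refer to static data.
        if any(kind in ("data", "rodata") for kind in site.arg_kinds):
            votes[site.callee] += 1
    return votes.most_common()


# Hypothetical call sites, invented for the example.
sites = [
    CallSite("strcmp", ("other", "rodata"), True, True),
    CallSite("strcmp", ("other", "rodata"), True, True),
    CallSite("printf", ("rodata", "other"), False, False),
    CallSite("memcpy", ("other", "other", "other"), True, True),
]
print(rank_comparison_candidates(sites))
```

Counting qualifying call sites per callee also reflects the local call frequency property: a function that is repeatedly called with different static data in branching positions rises to the top of the ranking.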

Assigning Scores to Static Data & Functions

Scoring Goals
- A means to discover the branches within each function that depend upon static data, and to assign them and the associated static data a score of importance relative to other such branches within that function, based upon how much unique functionality they guard.
- A function-level score that signifies which functions contain a relatively high density of decision logic that depends on comparison with static data.

Control Flow Properties
Minimise the score propagated from join-points (blocks reached by many paths).

Control Flow Properties
Maximise the score of blocks that guard unique functionality (blocks that can’t be reached by any other path).

Computation of Scores
Two stage process:
1. Compute static data sequences: sets of sequences of static data that must be matched to reach each block.
2. Distribute scores based upon computed static data sequences.

Computation of Static Data Sequences
Compute sets of sequences of static data that must be matched to reach a given block.
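
A minimal sketch of this first stage, assuming an acyclic control flow graph whose edges carry an optional label naming the static data that must match for the edge to be taken. The edge-list representation, the function name and the toy graph are hypothetical, and loop handling is deliberately left out.

```python
from collections import defaultdict


def static_data_sequences(edges, entry):
    """Map each block to the set of static data sequences that reach it.

    edges: iterable of (src, dst, label); label is the static data matched
    on that edge, or None for an edge not guarded by a comparison.
    """
    succs = defaultdict(list)
    indeg = defaultdict(int)
    nodes = {entry}
    for src, dst, label in edges:
        succs[src].append((dst, label))
        indeg[dst] += 1
        nodes.update((src, dst))

    seqs = {n: set() for n in nodes}
    seqs[entry].add(())  # the entry block needs no matches

    # Visit blocks in topological order so every predecessor is complete
    # before its sequences are pushed to a successor.
    ready = [n for n in nodes if indeg[n] == 0]
    while ready:
        block = ready.pop()
        for dst, label in succs[block]:
            for seq in seqs[block]:
                seqs[dst].add(seq + (label,) if label is not None else seq)
            indeg[dst] -= 1
            if indeg[dst] == 0:
                ready.append(dst)
    return seqs


# Toy CFG (invented): block 6 is reachable after matching A alone,
# or after matching A, then B, then C.
toy_edges = [
    (1, 2, "A"), (1, 5, None),
    (2, 3, "B"), (2, 6, None),
    (3, 4, "C"), (4, 6, None),
]
sequences = static_data_sequences(toy_edges, entry=1)
print(sequences[6])
```

For the toy graph, block 6 ends up with the two sequences [A] and [A, B, C].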

Computation of Static Data Scores
1. For each block’s set of static data sequences, we calculate a fraction of how each element of static data impacts the reachability of that block; e.g. for block 6:
We have {[A], [A, B, C]}, so we calculate A: 2/2, B: 1/2, C: 1/2.
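
The fractions in this example are consistent with counting, for each element, the share of the block's sequences that contain it. A small sketch under that reading (the function name is hypothetical):

```python
from fractions import Fraction


def reachability_fractions(sequences):
    """Fraction of a block's sequences in which each static data element occurs."""
    total = len(sequences)
    counts = {}
    for seq in sequences:
        for item in set(seq):  # count an element once per sequence
            counts[item] = counts.get(item, 0) + 1
    return {item: Fraction(count, total) for item, count in counts.items()}


# The sequences for block 6 from the slide: {[A], [A, B, C]}.
block_6 = {("A",), ("A", "B", "C")}
print(reachability_fractions(block_6))
```

A appears in both sequences (2/2), while B and C each appear in one of the two (1/2).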

Computation of Static Data Scores
2. We calculate two other values for the block (b):
- ω(b): a base score for the block.
- 1/deg_in(b): the penalty incurred for being reachable by multiple blocks.

Computation of Static Data Scores
3. ... and calculate the update to the influence of an element of static data; e.g. for C:
C_score ← C_score + ω(b) × ln(1 + 1/2 × 1/deg_in(b))

Computation of Function Score
The score assigned to a function is the sum of the scores assigned to the static data that influences its branching. From the previous example:
f_score = A_score + B_score + C_score
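
The update and the function-level sum can be sketched together. The sketch assumes the per-block update reads as ω(b) × ln(1 + fraction × 1/deg_in(b)), where the fraction is the element's reachability fraction for that block; the slides do not say how ω(b) is computed, so it is passed in as a plain number. The function names and the sample values (ω(b) = 1.0, deg_in(b) = 2) are invented for the example.

```python
import math


def add_block_contribution(scores, fractions, base_score, in_degree):
    """Add one block's contribution to each static data element's score.

    fractions:  element -> reachability fraction for this block
    base_score: omega(b), supplied by the caller (assumed given)
    in_degree:  deg_in(b), the number of incoming edges of the block
    """
    for item, fraction in fractions.items():
        update = base_score * math.log(1 + fraction / in_degree)
        scores[item] = scores.get(item, 0.0) + update
    return scores


def function_score(scores):
    """A function's score is the sum of its static data scores."""
    return sum(scores.values())


# Block 6 from the running example, with placeholder omega(b) and deg_in(b).
scores = add_block_contribution(
    {}, {"A": 1.0, "B": 0.5, "C": 0.5}, base_score=1.0, in_degree=2
)
total = function_score(scores)
print(scores, total)
```

In a full run the contribution would be accumulated over every block of the function before summing, so elements that guard many blocks collect a larger score.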

Results & Evaluation

Hard-coded Credentials in Ray Sharp DVR Firmware
Identification of hard-coded credential pair in Ray Sharp DVR firmware:

Comparison Function           Score
strcmp                        5170.30
sub_1C7EC (strcmp wrapper)    1351.96
strncmp                       1109.73
strstr                         353.93
memcmp                         222.00

Label   Score   Static Data   Function   Depends
1       30.23   664225        strcmp     {[]}
2        2.77   root          strcmp     {[664225]}

Hard-coded Credentials in Q-See DVR Firmware
Identification of a hard-coded credential backdoor in DVR firmware, with different behaviour for each hard-coded password:

Comparison Function     Score
strcmp                  1464.70
strncmp                  779.33
CRYPTO_malloc (FP)       685.10
_ZNKSs7compareEPKc       376.20
strstr                   306.00
strcasecmp               196.00

Label   Score    Static Data       Function   Depends
1       171.39   admin             strcmp     {[]}
2        58.92   ppttzz51shezhi    strcmp     {[admin]}
3        45.13   6036logo          strcmp     {[admin]}
4        42.14   6036adws          strcmp     {[admin]}
5        37.54   6036huanyuan      strcmp     {[admin]}
6        35.21   6036market        strcmp     {[admin]}
7        31.05   jiamijiami6036    strcmp     {[admin]}

TrendNet HTTP Authentication with Hard-coded Credentials
HTTP authentication check with comparison against hard-coded credential values:

Comparison Function   Score
strcmp                1635.01
strstr                 481.20
nvram_get (FP)         413.10
strncmp                265.45
sub_A2D0 (FP)          131.00

Static Data             Score    Function   Depends
emptyuserrrrrrrrrrrr    132.17   strcmp     {...}
emptypasswordddddddd    128.61   strcmp     {[..., emptyuserrrrrrrrrrrr]}
