Detecting Hardware Trojans: A Tale of Two Techniques Sharad Malik - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Detecting Hardware Trojans: A Tale of Two Techniques

Sharad Malik sharad@princeton.edu FMCAD 2015

slide-2
SLIDE 2

Hardware Security and Hardware Trojans

A Hardware Trojan is a malicious, intentional modification of an electronic circuit or design, resulting in undesired behavior.

Stack, top to bottom: User apps, Kernel, Hypervisor, Firmware, Hardware. Each layer trusts all layers below it.

  • More privilege
  • Widely used platforms
  • Difficult to patch

→ more damage

slide-3
SLIDE 3

Where are the Vulnerabilities?

Design flow stages: Specification, Design, Mask, Fab, Wafer Probe, Package, Test, Deploy

Design inputs: IP, Tools, Std Cells, Models

Legend: Trusted, Untrusted

[Source: Brian Sharkey, TRUST in Integrated Circuits Program: Briefing to Industry, DARPA MTO, 26 March 2007]

slide-4
SLIDE 4

A Real Threat?

Figure: before/after pictures of a suspected nuclear reactor site.

There is suspicion that a hardware backdoor was exploited to disable the radar system.

[Sally Adee, The Hunt for the Kill Switch, IEEE Spectrum, May 2008]
[John Markoff, Old Trick Threatens the Newest Weapons, NY Times, 26 October 2009]

slide-5
SLIDE 5

Malicious circuits in a design

slide-6
SLIDE 6

Acknowledgements

SRI

  • Bruno Dutertre
  • Adria Gascon
  • Dejan Jovanovic
  • Maheen Samad
  • Natarajan Shankar
  • Ashish Tiwari

Princeton

  • Burcin Cakir
  • Kanika Pasricha
  • Dillon Reisman
  • Pramod Subramanyan
  • Adriana Susnea
  • Nestan Tsiskaridze

UC Berkeley

  • Wenchao Li
  • Sanjit Seshia
  • Wei Yang Tan

DARPA IRIS Project

Center for Future Architectures Research (C‐FAR)

  • Burcin Cakir
  • Pramod Subramanyan

slide-7
SLIDE 7

Whitelist Blacklist Logical Analysis Statistical Analysis

slide-8
SLIDE 8

Netlist Analysis Portfolio

Logical Analysis (reverse engineering using static analyses), Netlist → Abstracted Netlist:

  • Common‐support analysis
  • K‐cut matching
  • Aggregation
  • Word propagation
  • Module generation
  • Library Matching
  • Multibit Register Analysis
  • RF analysis
  • Counter analysis
  • Shift register analysis
  • Overlap Resolution

Statistical Analysis, starting from the Netlist:

  • Functional Simulation
  • Statistical Correlation (Weight Computation)
  • Normalization/Clustering
  • Trojan Detection using Reachability Plots

slide-9
SLIDE 9

Logical Analysis for Reverse Engineering

slide-10
SLIDE 10

Reverse Engineering Objective

Extract high‐level components from an unstructured and flat netlist.

Figure: example components: ALU, Register File, MUX, Instr. Decoder

Source: http://miscpartsmanuals2.tpub.com/TM‐9‐1240‐369‐34/TM‐9‐1240‐369‐340115.htm

slide-11
SLIDE 11

Reverse Engineering Portfolio

Netlist → Abstracted Netlist

Combinational component analyses:

  • Common‐support analysis
  • K‐cut matching
  • Aggregation
  • Word propagation
  • Module generation
  • Library Matching

Sequential component analyses:

  • Multibit Register Analysis
  • RF analysis
  • Counter analysis
  • Shift register analysis

Final step: Overlap Resolution

References:

  1. Reverse Engineering Digital Circuits Using Functional Analysis, [DATE’13]
  2. Reverse Engineering Digital Circuits Using Structural and Functional Analysis, [TETC’14]
  3. Wordrev: Finding word‐level structures in a sea of bit‐level gates, [HOST’13]
  4. Template‐based circuit understanding, [FMCAD’14]

slide-12
SLIDE 12

General Strategy

  1. Identify Potential Module Boundaries
  2. BDD/SAT‐Based Analyses to Verify Functionality
  3. Output Inferred Modules

Figure: a candidate labeled "mux?" becomes "mux" once verified.

Main challenge: the netlist is a sea of gates, with no information about the boundaries of the modules inside it.

slide-13
SLIDE 13

Bitslice Identification and Aggregation

Portfolio steps covered here: K‐cut matching and Aggregation (part of the combinational component analyses).

Targets: multiplexers, decoders, demultiplexers, ripple carry adders and subtractors, parity trees, …

slide-14
SLIDE 14

Bitslice Identification using Cut‐based Matching

Cong and Ding, FlowMap, [TCAD’94]
Chatterjee et al., Reducing Structural Bias in Technology Mapping, [ICCAD’05]

  • Cuts are computed recursively
  • Made tractable by enumerating cuts with k ≤ 6 inputs
  • Group cuts into equivalence classes using permutation independent comparison
  • BDDs used to represent Boolean functions during matching
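
The recursive cut computation can be sketched as below. This is a minimal illustration and not the implementation from the talk: the netlist encoding (a dict `fanins` listed in topological order) and the name `enumerate_cuts` are assumptions made for the example, and the BDD-based matching of cut functions is left out.

```python
from itertools import product


def enumerate_cuts(fanins, k=6):
    """Return {node: set of cuts}, each cut a frozenset of at most k leaves.

    `fanins` (hypothetical encoding) maps every node to the list of nodes
    driving it; primary inputs map to an empty list. Nodes must appear in
    topological order so that driver cuts exist before they are needed.
    """
    cuts = {}
    for node, drivers in fanins.items():
        node_cuts = {frozenset([node])}  # the trivial cut: the node itself
        if drivers:
            # Combine one cut from every driver and keep unions with <= k leaves.
            for combo in product(*(cuts[d] for d in drivers)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Tiny example: g1 = f(a, b), g2 = f(g1, c)
example = {"a": [], "b": [], "c": [], "g1": ["a", "b"], "g2": ["g1", "c"]}
example_cuts = enumerate_cuts(example, k=3)
```

Bounding the cut size keeps the number of cuts per node manageable; without the bound, the cross product over driver cuts grows quickly with depth.
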
slide-15
SLIDE 15

Bitslice Aggregation

  • Group bitslices with shared signals
  • Group bitslices with cascading signals
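
Grouping by shared signals can be sketched with a union-find over bitslices. This is only an illustration under assumptions: the input encoding, the function classes, and the name `group_by_shared_signals` are hypothetical, and the rule "same function class plus at least one common input" is a simplification that may merge more than a real tool would.

```python
from collections import defaultdict


def group_by_shared_signals(bitslices):
    """Group bitslices of the same function class that share an input signal.

    `bitslices` (hypothetical encoding): {name: (function_class, set_of_inputs)}.
    Returns a list of sets of bitslice names, one per group of size > 1.
    """
    parent = {name: name for name in bitslices}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    first_user = {}  # (function_class, signal) -> first bitslice seen using it
    for name, (func, inputs) in bitslices.items():
        for signal in inputs:
            key = (func, signal)
            if key in first_user:
                parent[find(name)] = find(first_user[key])
            else:
                first_user[key] = name

    groups = defaultdict(set)
    for name in bitslices:
        groups[find(name)].add(name)
    return [group for group in groups.values() if len(group) > 1]


# Two 2:1 mux bitslices share the select line "sel" and form one candidate word.
slices = {
    "m0": ("mux2", {"sel", "a0", "b0"}),
    "m1": ("mux2", {"sel", "a1", "b1"}),
    "x0": ("xor2", {"p", "q"}),
}
words = group_by_shared_signals(slices)
```
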

slide-16
SLIDE 16

Word Propagation and Module Matching

Portfolio steps covered here: Word propagation, Module generation, and Library Matching, which follow K‐cut matching and Aggregation in the combinational component analyses.

slide-17
SLIDE 17

Word Propagation and Module Generation

Once multibit structures are found, larger bit slices can be identified by forward and backward traversal of the circuit.

Given an “output” word, we can traverse backwards to closely‐related words to find candidate modules.

slide-18
SLIDE 18

Library Matching

Match candidate modules against a library of common modules such as adders, ALUs, …

Challenges

  • Permutation and polarity of inputs
  • Setting of control inputs

QBF formulation: Does there exist some setting of the control inputs, and some ordering of the inputs, such that for all input values the candidate and the library module produce the same outputs?

[FMCAD ‘14]

slide-19
SLIDE 19

Library Matching as QBF

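Figure: candidate module M has k control signals c and n data inputs, which are driven from the data inputs X through a permutation network Π configured by permutation p. Library module L has n data inputs. Both modules have m outputs.

∃ p, c ∀ X: M(Π(p, X), c) ≡ L(X)

Signatures are used to restrict the search space for the permutations. [FMCAD ‘14]

Mohnke and Malik, Permutation and Phase Independent Boolean Comparison, [Integration ‘93]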
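
The exists/forall structure of the matching question can be made concrete with a brute-force search on tiny modules. This is a sketch under assumptions, not the QBF-based method of the talk: it enumerates permutations and control settings explicitly, ignores input polarity, uses no signatures, and the names `match_library`, `candidate` and `library` are hypothetical.

```python
from itertools import permutations, product


def match_library(candidate, library, n_data, n_ctrl):
    """Find (perm, ctrl) with candidate(permuted X, ctrl) == library(X) for all X.

    Outer loops play the role of the existential quantifier (permutation and
    control setting); the inner `all` plays the role of the universal
    quantifier over data input values. Returns None if nothing matches.
    """
    for perm in permutations(range(n_data)):
        for ctrl in product((0, 1), repeat=n_ctrl):
            if all(
                candidate(tuple(x[i] for i in perm), ctrl) == library(x)
                for x in product((0, 1), repeat=n_data)
            ):
                return perm, ctrl
    return None


def candidate(d, c):
    # Hypothetical candidate: XOR when c[0] is 1, otherwise d[0] AND NOT d[1].
    return (d[0] ^ d[1]) if c[0] else (d[0] & (1 - d[1]))


def library(x):
    # Hypothetical library module: NOT x[0] AND x[1].
    return (1 - x[0]) & x[1]


match = match_library(candidate, library, n_data=2, n_ctrl=1)
```

The explicit enumeration is exponential in the number of inputs, which is why a solver-based formulation and signature pruning matter for realistic module sizes.
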

slide-20
SLIDE 20

Identifying Register Files

Portfolio step covered here: RF analysis (part of the sequential component analyses).

slide-21
SLIDE 21

The Structure of a Register File

Figure: register file with ports for write data, write addr + write enable, read address, and read data.

A register file consists of:

  • Flip‐flops that store information
  • Read logic: takes a read address and outputs stored data
  • Write logic: stores data in the register file
slide-22
SLIDE 22

Identifying Read Logic

Figure: eight flip‐flops (FF) feed a tree of logic controlled by addr[2], addr[1], addr[0] that produces dataout.

Insight: look for trees of logic where the leaves of the tree are flip‐flops.

slide-23
SLIDE 23

Verifying Identified Read Logic

Figure: eight flip‐flops (FF) feed a tree of logic controlled by addr[2], addr[1], addr[0] that produces dataout.

  • Verify there exists some address which propagates each flip‐flop output to the data output
  • This is done using a BDD‐based analysis

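
The property being verified can be written down as a small exhaustive check. This sketch replaces the BDD-based analysis with plain truth-table enumeration, which only scales to toy sizes; the name `verify_read_logic` and the callable interface are assumptions for the example.

```python
from itertools import product


def verify_read_logic(read_fn, n_ffs, n_addr):
    """Check that every flip-flop has an address that propagates it to the output.

    `read_fn(addr, state)` (hypothetical interface) returns the data output for
    an address tuple and a tuple of flip-flop values. Returns {ff: address} if
    each flip-flop is selectable, otherwise None.
    """
    selecting_address = {}
    for ff in range(n_ffs):
        for addr in product((0, 1), repeat=n_addr):
            # The address selects `ff` if the output equals that flip-flop's
            # value for every possible content of the register file.
            if all(
                read_fn(addr, state) == state[ff]
                for state in product((0, 1), repeat=n_ffs)
            ):
                selecting_address[ff] = addr
                break
        else:
            return None  # no address propagates this flip-flop
    return selecting_address


def mux4(addr, state):
    # A 4:1 read mux: the address bits index directly into the flip-flops.
    return state[2 * addr[0] + addr[1]]


read_map = verify_read_logic(mux4, n_ffs=4, n_addr=2)
```
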
slide-24
SLIDE 24

Identifying Write Logic

  • Muxes select between current value and write data
  • Decoders select the location that is being written to
  • Easy to find muxes and decoders after we find the flip‐flops
slide-25
SLIDE 25

Overlap Resolution

Portfolio step covered here: Overlap Resolution, the step that follows the combinational and sequential component analyses and produces the Abstracted Netlist.

slide-26
SLIDE 26

Problem: Inferred Modules Overlap

Figure: the register‐file read tree (eight flip‐flops, addr[2], addr[1], addr[0], dataout) annotated with two overlapping inferred modules: the inferred register file and a 4‐bit MUX.

slide-27
SLIDE 27

Resolving Overlaps

Formulate an Integer‐Linear Program:

  1. Constraints specify that modules must not overlap
  2. Objective is one of the following:

  • Maximize the number of covered gates, OR
  • Minimize the number of modules given a coverage target
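
The first objective can be illustrated on a toy instance. The sketch below swaps the ILP solver for an exhaustive search over subsets of modules, so it is only usable for a handful of modules; the name `resolve_overlaps` and the module-to-gate-set encoding are assumptions. In an ILP, each module would typically get a 0/1 variable, with one constraint per gate saying that at most one selected module may contain it.

```python
from itertools import combinations


def resolve_overlaps(modules):
    """Pick non-overlapping modules that cover the most gates.

    `modules` (hypothetical encoding): {module name: set of gate ids}.
    Returns (set of chosen module names, number of covered gates).
    """
    names = list(modules)
    best, best_cover = (), 0
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Constraint: no gate may belong to two selected modules.
            disjoint = all(
                modules[a].isdisjoint(modules[b])
                for a, b in combinations(subset, 2)
            )
            if disjoint:
                # Objective: total covered gates (a plain sum, since disjoint).
                cover = sum(len(modules[m]) for m in subset)
                if cover > best_cover:
                    best, best_cover = subset, cover
    return set(best), best_cover


inferred = {
    "regfile": set(range(1, 11)),  # gates 1..10
    "mux4": {8, 9, 10, 11},        # overlaps regfile and adder
    "adder": {11, 12, 13},
}
chosen, covered = resolve_overlaps(inferred)
```
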

slide-28
SLIDE 28

Experimental Setup

Toolchain

  • Implemented in C++
  • MiniSAT 2.2
  • CUDD 2.4
  • CPLEX 12.5

Designs

  • Many from OpenCores.org
  • Size ranges from a few hundred to several thousand gates
  • ITAG1B: 375k gate test case from DARPA

slide-29
SLIDE 29

Summarizing Inference Results (1/2)

  • 45‐90% of the gates in these designs are covered
  • Runtime is at most several minutes

slide-30
SLIDE 30

Summarizing Inference Results (2/2)

  • Covered ~70% of the large test article (375k gates)
  • Split the big design up into 7 subcomponents using the reset tree; covered 60‐87%
  • Entire analysis terminates in an hour

slide-31
SLIDE 31

Summarizing the Reverse Engineering Efforts

Netlist → combinational and sequential component analyses → Overlap Resolution → Abstracted Netlist

A portfolio of inference algorithms to identify word‐level modules from a flat unstructured netlist!

slide-32
SLIDE 32

Statistical Analysis of Suspicious Logic

slide-33
SLIDE 33

Signal Correlation‐Based Clustering: Overview

Netlist → Functional Simulation → Statistical Correlation (Weight Computation) → Normalization/Clustering → Trojan Detection using Reachability Plots

An information‐theoretic approach for Trojan detection

  • Estimate statistical correlation between signals in a design using simulation data
  • Use this estimate in a clustering algorithm to isolate Trojan logic

Cakir and Malik, “Hardware Trojan Detection for Gate‐level ICs Using Signal Correlation Based Clustering,” DATE 2015 [Best Paper Award]

slide-34
SLIDE 34

Intuition

Trojan has weak statistical correlation with the rest of the circuit

slide-35
SLIDE 35

Functional Simulation‐based Statistical Correlation

Example Trojan Circuit

Figure: inputs i1 to i6, Trojan trigger T, edge weights w1 and w2.

Weight Computation

  • Use existing/new testbenches for functional tests
  • Generate digital stimuli on different regions of the circuit

Target: excite the circuit as much as possible to estimate the statistical correlation between neighboring nodes in the circuit.

slide-36
SLIDE 36

Functional Simulation‐based Statistical Correlation

Figure: a gate with inputs i1, i2 and output o1.

Simulation waveforms generated with functional tests:

f = < 0, 0, 0, 0, 0, 1, 1, 0, … >
g = < 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, … >
h = < 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, … >

Obtaining new signals from simulation waveforms: the weight of an input/output pair is the energy of the cross‐correlation signal. One weight is computed per pair (w1 and w2 in the example).
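
One way to read "energy of the cross‐correlation signal" is sketched below. The exact definition used in the talk is not recoverable from the slide text, so this assumes the standard discrete cross-correlation over all lags and the usual signal-processing notion of energy (sum of squared samples); the waveforms and the names `cross_correlation` and `weight` are made up for the illustration.

```python
def cross_correlation(a, b):
    """Discrete cross-correlation r[lag] = sum over t of a[t] * b[t + lag].

    Lags run from -(len(a) - 1) to len(b) - 1; terms that fall outside
    either waveform are skipped.
    """
    return [
        sum(a[t] * b[t + lag] for t in range(len(a)) if 0 <= t + lag < len(b))
        for lag in range(-(len(a) - 1), len(b))
    ]


def weight(a, b):
    """Assumed weight of a signal pair: energy of their cross-correlation."""
    return sum(v * v for v in cross_correlation(a, b))


# Hypothetical sampled waveforms (0/1 values per simulation cycle).
o1 = [0, 0, 1, 1, 0, 1, 1, 0]
i1 = [0, 0, 1, 1, 0, 1, 1, 0]  # toggles together with o1
i2 = [0, 0, 0, 0, 0, 0, 0, 1]  # rarely active, like a Trojan trigger

w1 = weight(i1, o1)
w2 = weight(i2, o1)
```

In this toy case the rarely active input gets a much smaller weight than the input that tracks the output, which matches the intuition that Trojan logic is weakly correlated with the rest of the circuit.
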

slide-37
SLIDE 37

Weight Normalization and Clustering

Weight normalization

  • Degree of a node is important to identify hubs and outliers
  • Normalize weights based on node degrees
  • Obtain new metric σ
  • Hubs have high degrees
  • Keeps σ across a cluster small

Figure: two structure‐connected clusters, with one hub and two outliers

[Jianbin Huang et al., IEEE Transactions on Knowledge and Data Engineering, Aug. 2013]

slide-38
SLIDE 38

Weight Normalization and Clustering

Example Trojan

Figure: inputs i1 to i6, Trojan trigger T, normalized weights σ1 and σ2.

slide-39
SLIDE 39

Weight Normalization and Clustering

Figure: good circuit and Trojan logic; inputs i1 to i6, Trojan trigger T, normalized weights σ1 and σ2, with σ1 > σ2.

How does clustering help detect Trojans?

  • In practice, use the OPTICS algorithm, a clustering algorithm used in machine learning

slide-40
SLIDE 40

Clustering with Reachability Plots

Figure: 2D data set

Walk on dataset: an augmented ordering of the dataset that reflects the clustering structure

Example data set:

  • Hierarchical clusters of different sizes, densities and shapes

slide-41
SLIDE 41

Clustering with Reachability Plots

Figures: 2D data set; reachability plot

Walk on dataset: an augmented ordering of the dataset that reflects the clustering structure

Reachability distance: measure of proximity to dense regions

  • Starting point arbitrary
  • Order points in increasing distance from current point

Our application: distance based on 1/σ

  • High correlation, smaller distance
  • Across hub, larger distance
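
The ordering behind a reachability plot can be sketched with a simplified OPTICS pass. This is a minimal version under assumptions: it has no radius limit (every point is a neighbor of every other), it uses Euclidean distance on 2D points rather than the 1/σ distance between circuit nodes, it needs at least `min_pts` points, and the name `reachability_plot` is hypothetical.

```python
import math


def reachability_plot(points, min_pts=2):
    """Return (order, reachability) for a simplified OPTICS walk.

    `order` lists point indices in visiting order; `reachability[i]` is the
    reachability distance of the i-th visited point (inf for the start).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Core distance: distance to the min_pts-th nearest point, counting itself.
    core = [sorted(row)[min_pts - 1] for row in dist]
    reach = [math.inf] * n
    done = [False] * n
    order = []
    for start in range(n):  # arbitrary starting point
        if done[start]:
            continue
        current = start
        while current is not None:
            done[current] = True
            order.append(current)
            # Update reachability of unvisited points as seen from `current`.
            for j in range(n):
                if not done[j]:
                    reach[j] = min(reach[j], max(core[current], dist[current][j]))
            pending = [j for j in range(n) if not done[j] and reach[j] < math.inf]
            # Visit next the unvisited point that is closest to the dense region.
            current = min(pending, key=lambda j: reach[j]) if pending else None
    return order, [reach[i] for i in order]


# Two well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
order, reachability = reachability_plot(data)
```

Plotting `reachability` against position in `order` gives valleys for dense clusters separated by peaks; in this example the single large value marks the jump from the first group to the second.
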

slide-42
SLIDE 42

Clustering with Reachability Plots

Figures: 2D data set; reachability plot

How useful is this for Trojan detection?

slide-43
SLIDE 43

Trojan Detection based on Reachability Plots

RS232‐800: UART core
Trojan: comparator in receiver circuit; manipulates output signal.
Reachability plot: Trojan (TJ) logic is distinguished from TX and REC.

slide-44
SLIDE 44

Trojan Detection based on Reachability Plots

AES‐1800: Encryption circuit
Trojan: drains the battery after observing a predefined input plaintext.
Reachability plot: Trojan (TJ) logic appears as a separate cluster.

slide-45
SLIDE 45

Evaluation Methodology

  • Eight TrustHub groups of Verilog circuits
  • Synthesized using Synopsys Design Compiler
  • IBM/ARM cell library
  • Synopsys TetraMAX ATPG tool
  • Used if testbenches not available

Flow: TrustHub Circuits → Design Synthesis (Cell library) → Simulation (Testbenches / TetraMAX) → Trojan Detection

Trusthub benchmarks [http://www.trust-hub.org/resources/benchmarks]

slide-46
SLIDE 46

Sensitivity and Specificity Analysis

Figure: results for s35932‐200 (ISCAS’89 benchmark)

Specificity: 1 ‐ False positive ratio
TPR: True positive ratio (Sensitivity)
Probability Threshold: Confidence‐level parameter

slide-47
SLIDE 47

Sensitivity and Specificity Analysis

Design Information          Trojan Detection
Name           Gate/Latch   SPC (%)   TPR (%)
s15850‐100     3478         99        61
s35932‐200     8107         99        27
s38417‐100     8422         99        100
s38584‐200     9548         99        99
AES‐1800       164800       98        92
wb‐conmax‐200  20224        96        28
PIC16F84‐100   1616         96        75
RS232‐800      205          94        80

SPC: Specificity (1 ‐ False positive ratio). TPR: True positive ratio.

At least a quarter of the nodes of each Trojan are identified.
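
The two metrics follow directly from their definitions. The sketch below computes them from sets of node identifiers; the node sets and the name `sensitivity_specificity` are invented for the illustration and do not come from the benchmarks above.

```python
def sensitivity_specificity(flagged, trojan, all_nodes):
    """Return (TPR, specificity) for a set of flagged nodes.

    TPR = true positives / all Trojan nodes.
    Specificity = true negatives / all non-Trojan nodes = 1 - false positive ratio.
    """
    true_pos = len(flagged & trojan)
    false_neg = len(trojan - flagged)
    false_pos = len(flagged - trojan)
    true_neg = len(all_nodes - flagged - trojan)
    tpr = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return tpr, specificity


# Hypothetical netlist of 100 nodes, 10 of them Trojan nodes; the detector
# flags 8 Trojan nodes and 1 clean node.
nodes = set(range(100))
trojan_nodes = set(range(10))
flagged_nodes = set(range(8)) | {50}
tpr, spc = sensitivity_specificity(flagged_nodes, trojan_nodes, nodes)
```
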

slide-48
SLIDE 48

Summary: Signal Correlation‐Based Clustering

  • Simulation‐based clustering technique to detect hardware Trojans in gate‐level circuits
  • Methodology to find weakly‐correlated nodes or functionally isolated sections in the netlist
  • Identify Trojan‐related nodes with low false positive rates
  • Key observations
  • Do not attempt to find all Trojan logic but flag a small subset of gates
  • Extensive test sets lead to higher coverage and better statistics → better results


slide-49
SLIDE 49

Conclusions

  • Portfolio of matching algorithms for reverse engineering
  • Went much further than we expected
  • Simulation data‐based clustering very powerful
  • Applications beyond Trojan detection?