C&ESAR 2018- Rennes | CEA Leti | KERROUMI Sanaa | 08/11/18
ON THE APPLICABILITY OF BINARY CLASSIFICATION TO DETECT MEMORY ACCESS ATTACKS IN IOT
OUTLINE
IoT node
Related works
Problem statement
Proposed methodology
Results
Take out and lessons learned
“ The interconnection via the internet of computing devices embedded in everyday
WHAT’S AN IOT NODE
and can communicate via the internet
processing time, cost, …
the fish tank attack)
limits their ability to handle encryption or other data security functions
Updates/security patches may be difficult or impossible.
Any unpatched vulnerabilities will stay for a very long time.
Ronen, Eyal, et al. "IoT goes nuclear: Creating a ZigBee chain reaction." Security and Privacy (SP), 2017 IEEE Symposium on. IEEE, 2017.
EDGE-NODE VULNERABILITIES: WHAT COULD POSSIBLY GO WRONG?
access attacks?
malicious tasks memory accesses
inside the device
EDGE-NODE VULNERABILITIES: WHAT COULD POSSIBLY GO WRONG?
EXISTENT COUNTERMEASURES: PREVENT VS PROTECT
Fuses and flash readout protection
that permits access to memory (post deployment upgrades)
Encryption
and confidentiality
security compromise
Detection
defense
frequency of access to a particular memory region (regardless of which component accessed it) during a time interval. The MHM is then combined with an image recognition algorithm to detect any anomalies.
nominal MHM
R&W: MEMORY DETECTION (1/2)
Yoon, Man-Ki, et al. "Memory heat map: anomaly detection in real-time embedded systems using memory behavior." Design Automation Conference (DAC), 2015 52nd ACM/EDAC/IEEE, pp. 1-6. IEEE, 2015.
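The counting step behind a memory heat map can be sketched as follows. This is only an illustration, assuming a trace of (timestamp, address) pairs; the function name `memory_heat_map`, the region size and the interval length are hypothetical and are not taken from Yoon et al. The anomaly detection stage that consumes the maps is not shown.

```python
from collections import Counter

def memory_heat_map(trace, region_size, interval):
    """Count accesses per (time interval, memory region).

    trace: iterable of (timestamp, address) pairs (assumed trace format).
    Returns a dict mapping interval index -> Counter(region index -> count).
    The accessing component is ignored, as in the description above.
    """
    heat_maps = {}
    for timestamp, address in trace:
        window = timestamp // interval    # time interval the access falls in
        region = address // region_size   # memory region that is touched
        heat_maps.setdefault(window, Counter())[region] += 1
    return heat_maps

# Hypothetical trace: (cycle count, byte address)
trace = [(0, 0x1000), (3, 0x1004), (7, 0x2000), (12, 0x1008), (15, 0x2004)]
maps = memory_heat_map(trace, region_size=0x1000, interval=10)
print(maps)
```

Each per-interval counter can then be compared with a nominal map recorded on a clean system.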
collected during legitimate executions of a sanitized system, combined by a clustering algorithm (k-means). If an
flagged as malicious)
R&W: MEMORY DETECTION (2/2)
Yoon, Man-Ki, et al. "Learning execution contexts from system call distribution for anomaly detection in smart embedded system." Proceedings of the Second International Conference on Internet-of-Things Design and Implementation. ACM, 2017.
performance counters, control flow, instruction mix, etc.)
memory access attacks in the context of a low cost IoT node
PROBLEM STATEMENT
METHODOLOGY:
USE CASE PRESENTATION: CONNECTED THERMOSTAT
[Figure: block diagram of the connected thermostat. Labels: 1 minute; 10 seconds / wake up signal; Variables stored into RAM; Heating regulation loop; Temperature measurement; Temperature; 10 seconds; User action buttons; Mode; Screen display; Interrupts; Send data to heating device; Heat power; Wake up signal event; Internal variables]
suspicious
accesses, cycles between consecutive reads, address increment, number of “unknown” (first-encountered) addresses, amount of read/accessed data …
IN MORE DETAIL
[Figure: detection pipeline applied per time window: Processor/memory trace → Feature extraction & selection → Machine learning method → Evaluation and trade-; output: Detected!]
and memory space
The attacker is assumed to be aware of the presence of some security monitor and therefore avoids obvious changes in the memory access patterns of the device
two consecutive reads is incremented by a constant (BD(cts)), linearly (BD(lin)) or randomly (BD(rand))
(NG(lin)) or randomly (NG(rand))
ATTACK SCENARIOS
TRAINING & TESTING DATASETS
Dataset | Training | Testing
Experiment 1 | Nominal + CD | BD and NG
Experiment 2 | (1) Nom+(CD+NG+BD); (2) Nom+(CD+NG); (3) Nom+(CD+BD) | (1) Nom+(CD+NG+BD)*; (2) Nom+BD; (3) Nom+NG
EXTRACTED FEATURES
interval
between two consecutive reads in time interval
time interval
accessed during a time interval
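Per-interval features of this kind could be computed along the following lines. This is a sketch under stated assumptions: the record layout `(cycle, address, size_in_bytes)`, the function name `extract_features` and the use of mean values are all hypothetical, since the exact feature definitions are not recoverable from the slide.

```python
def extract_features(window, known_addresses):
    """Summarize the read accesses of one time window.

    window: list of (cycle, address, size_in_bytes) read records (assumed layout).
    known_addresses: set of addresses seen so far; updated in place.
    """
    pairs = list(zip(window, window[1:]))
    gaps = [b[0] - a[0] for a, b in pairs]         # cycles between consecutive reads
    increments = [b[1] - a[1] for a, b in pairs]   # address increment between reads
    unknown = {addr for _, addr, _ in window if addr not in known_addresses}
    known_addresses.update(unknown)
    return {
        "n_reads": len(window),
        "mean_cycles_between_reads": sum(gaps) / len(gaps) if gaps else 0.0,
        "mean_address_increment": sum(increments) / len(increments) if increments else 0.0,
        "n_unknown_addresses": len(unknown),       # first-encountered addresses
        "bytes_read": sum(size for _, _, size in window),
    }

# Hypothetical window: four 4-byte reads at consecutive word addresses
known = {0x100, 0x104}
window = [(10, 0x100, 4), (14, 0x104, 4), (18, 0x108, 4), (22, 0x10C, 4)]
features = extract_features(window, known)
print(features)
```

One feature vector per time window is then what the classifier receives as input.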
knowledge gathered during the training
Bayes, linear discriminant analysis and quadratic discriminant analysis
CLASSIFICATION
each class, and (2) pick the most probable.
$P(c_j \mid x) = \frac{P(x \mid c_j)\, P(c_j)}{p(x)}$
NAÏVE BAYESIAN MODEL
Posterior probability: $P(c_j \mid x)$; Likelihood: $P(x \mid c_j)$; Class prior probability: $P(c_j)$; Predictor prior probability: $p(x)$
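The two steps (compute the posterior of each class, then pick the most probable) can be illustrated with a small worked example. The class names, feature values and probabilities below are invented for illustration, and `naive_bayes_predict` is a hypothetical helper, not the model used in the study.

```python
def naive_bayes_predict(x, priors, likelihoods):
    """Return (most probable class, posteriors) for a feature tuple x.

    priors: class -> P(c)
    likelihoods: class -> list (one dict per feature) of value -> P(x_i | c)
    """
    scores = {}
    for c, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= likelihoods[c][i][value]   # naive independence assumption
        scores[c] = score                       # P(x | c) * P(c)
    evidence = sum(scores.values())             # p(x), the predictor prior
    posteriors = {c: s / evidence for c, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors

# Invented numbers: two discrete features per time window
priors = {"nominal": 0.9, "attack": 0.1}
likelihoods = {
    "nominal": [{"low": 0.8, "high": 0.2}, {"no": 0.95, "yes": 0.05}],
    "attack": [{"low": 0.1, "high": 0.9}, {"no": 0.3, "yes": 0.7}],
}
label, posteriors = naive_bayes_predict(("high", "yes"), priors, likelihoods)
print(label, posteriors)
```

Even with a small prior for the attack class, the two likelihoods together outweigh it in this example.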
matrices are identical
$\delta_k(x)$ is the estimated discriminant score that the observation will fall in the kth class based on the value of the predictor variable x
matrix that is common to all K classes
$\pi_k$ is the prior probability that an observation belongs to the kth class
discriminant score $\delta_k(x)$ is the largest,
LINEAR DISCRIMINANT ANALYSIS
$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \log(\pi_k)$
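The linear discriminant score, x^T Σ^{-1} μ_k − ½ μ_k^T Σ^{-1} μ_k + log(π_k), can be evaluated directly once the class means, the shared inverse covariance and the priors are known. The sketch below assumes these quantities are already estimated; the helper names and the numeric values are hypothetical.

```python
import math

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lda_score(x, mu_k, sigma_inv, pi_k):
    """Linear discriminant score of class k for observation x."""
    s_mu = matvec(sigma_inv, mu_k)              # Sigma^{-1} mu_k
    return dot(x, s_mu) - 0.5 * dot(mu_k, s_mu) + math.log(pi_k)

# Hypothetical two-class problem with an identity covariance matrix
sigma_inv = [[1.0, 0.0], [0.0, 1.0]]
means = {0: [0.0, 0.0], 1: [2.0, 2.0]}
priors = {0: 0.5, 1: 0.5}
x = [1.5, 1.5]
scores = {k: lda_score(x, means[k], sigma_inv, priors[k]) for k in means}
predicted = max(scores, key=scores.get)        # class with the largest score
print(scores, predicted)
```

The cost per class is one dot product of length d once Σ^{-1} μ_k and the constant term are precomputed.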
QUADRATIC DISCRIMINANT ANALYSIS
matrices
$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \log(\pi_k)$
$\delta_k(x)$ is the estimated discriminant score that the observation will fall in the kth class based on the value of the predictor variable x
matrix that is common to all K classes
$\pi_k$ is the prior probability that an observation belongs to the kth class
discriminant score $\delta_k(x)$ is the largest,
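For comparison, the usual textbook form of the quadratic discriminant score gives each class k its own covariance matrix $\Sigma_k$. It is stated here as a standard result and is not taken from the slide:

```latex
\delta_k(x) = -\tfrac{1}{2} \log\lvert \Sigma_k \rvert
              - \tfrac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)
              + \log(\pi_k)
```

When every $\Sigma_k$ equals a common $\Sigma$, the terms that do not depend on k cancel between classes and the score reduces to the linear form $x^T \Sigma^{-1} \mu_k - \tfrac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \log(\pi_k)$.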
K NEAREST NEIGHBOR (KNN)
majority voting
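Majority voting among the k nearest training points can be sketched as below. This is a minimal illustration assuming Euclidean distance and a tiny invented training set; `knn_predict` is a hypothetical helper.

```python
import math
from collections import Counter

def knn_predict(x, training_set, k):
    """Label x by majority vote among its k nearest training points.

    training_set: list of (feature_tuple, label) pairs.
    """
    neighbors = sorted(training_set, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Invented 2-feature training data
training_set = [
    ((1.0, 1.0), "nominal"), ((1.2, 0.9), "nominal"), ((0.8, 1.1), "nominal"),
    ((5.0, 5.0), "attack"), ((5.2, 4.8), "attack"),
]
near_attack = knn_predict((4.5, 4.5), training_set, k=3)
near_nominal = knn_predict((1.0, 1.0), training_set, k=3)
print(near_attack, near_nominal)
```

Every prediction scans the whole training set, which is why the memory and computation cost of KNN grows with the number of stored observations.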
DECISION TREE
decision
RANDOM FOREST
producing a response when presented with a set of predictor values.
data of both classes as possible
We should maximize the margin $m = \frac{1}{\lVert w \rVert}$
subset of the data points (support vectors).
It will be useful computationally if only a small fraction of the data points are support vectors, because we use the support vectors to decide which side of the separator a test case is on.
SUPPORT VECTOR MACHINE
SUPPORT VECTOR MACHINE : SOFT MARGIN
1 Slack variables ξi can be added to allow misclassification of difficult or noisy examples.
[Figure: two input-space plots showing the slack variables ξi and ξj]
SUPPORT VECTOR MACHINE : KERNEL SVM
2 Projection to higher dimensional space where we can find a linear separator
[Figure: mapping f(.) from the input space to the feature space]
$f(x^{test}) = \mathrm{sign}\left(b + \sum_{s \in SV} \alpha_s y_s K(x^{test}, x_s)\right)$
vectors that maximize the margin and computing the weight to use on each support vector.
set the parameters of the used kernel
$(x_i^T x_j + 1)^d$
$\exp(-\gamma \lVert x_i - x_j \rVert^2)$
KERNEL SVM
SV: the set of support vectors
$\alpha_s$: Lagrange parameter
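At prediction time, a kernel SVM evaluates sign(b + Σ α_s y_s K(x_test, x_s)) over the support vectors only. The sketch below assumes a Gaussian kernel exp(−γ‖a − b‖²) and an already trained model; the support vectors, weights, bias and γ are invented values, and the function names are hypothetical.

```python
import math

def rbf_kernel(a, b, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def svm_decide(x_test, support_vectors, b, gamma):
    """Return +1 or -1 for x_test.

    support_vectors: list of (x_s, y_s, alpha_s) triples from training.
    A total of exactly zero is mapped to +1 here (an arbitrary choice).
    """
    total = b + sum(
        alpha_s * y_s * rbf_kernel(x_test, x_s, gamma)
        for x_s, y_s, alpha_s in support_vectors
    )
    return 1 if total >= 0 else -1

# Invented model: one support vector per class
support_vectors = [((0.0, 0.0), -1, 1.0), ((2.0, 2.0), +1, 1.0)]
label_a = svm_decide((1.8, 1.9), support_vectors, b=0.0, gamma=0.5)
label_b = svm_decide((0.1, 0.2), support_vectors, b=0.0, gamma=0.5)
print(label_a, label_b)
```

The loop runs once per support vector, so the classification cost scales with the number of support vectors rather than with the size of the training set.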
HOW CAN WE BUILD THE DETECTOR? TRAINING ON CLASSIC DUMP ATTACK
$FPR = \frac{FP}{FP + TN}$
$FNR = \frac{FN}{FN + TP}$
$PPV = \frac{TP}{TP + FP}$
instance)
EVALUATION METRICS
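The three metrics, FPR = FP/(FP+TN), FNR = FN/(FN+TP) and PPV = TP/(TP+FP), follow from the confusion counts. The sketch below assumes labels encoded as 1 for attack and 0 for nominal; the function name and the sample labels are hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Compute FPR, FNR and PPV from binary labels (1 = attack, 0 = nominal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0   # avoid division by zero on empty classes

    return {
        "FPR": ratio(fp, fp + tn),   # nominal windows wrongly flagged
        "FNR": ratio(fn, fn + tp),   # attack windows missed
        "PPV": ratio(tp, tp + fp),   # flagged windows that are real attacks
    }

# Invented labels for eight time windows
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```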
Classifier | Add(+)/Sub(-)/comparison | Mul (×) | Sqrt() | Exp() | Div
LSVM: (d + 1) ns − 1, (d + 2) ns
RSVM: (d + 1) ns, (d + 3) ns, ns, ns
KNN: 2n (d + 1) − 2 × k, n × d, n
LDA: d, d
QDA: d² + d, d² + 2d
Naïve Bayes: nc 2d d d 2d
Random Forest: ntree (h + 1)
Decision Tree: h
COMPUTATIONAL COST FOR CLASSIFYING ONE INSTANCE
Principle
predicting the label of one instance for each classifier, we decomposed the learnt decision function of each classifier into basic arithmetic
multiplications, square root, exponential and divisions)
the number of variables needed by each classifier
d: number of features
ns: number of support vectors
n: number of observations in training dataset
k: number of neighbors
nc: number of classes
h: depth of the tree
ntree: number of trees in the random forest
DETECTION PRECISION & LEAKAGE OF CLASSIFIERS TRAINED ON CLASSIC DUMP
DETECTION PRECISION & LEAKAGE OF CLASSIFIERS TRAINED ON DUMP VARIANTS
CLASSIFIERS PERFORMANCE (TRAINED ON CLASSIC DUMP)
CLASSIFIERS PERFORMANCE (TRAINED ON DUMP VARIANTS)
COMPARISON OF CLASSIFIERS PERFORMANCE
Trained on classic dump Trained on dump variants
bytes leakage and detection accuracy around 90%)
TAKE OUT AND NEXT STEP …
Leti, technology research institute
Commissariat à l’énergie atomique et aux énergies alternatives
Minatec Campus | 17 rue des Martyrs | 38054 Grenoble Cedex | France
www.leti.fr
Contact us for more details:
Sanaa.kerroumi@cea.fr
Anca.Molnos@cea.fr
Damien.Couroussé@cea.fr