The Curse of Class Imbalance and Conflicting Metrics with Machine Learning for Side-channel Evaluations
Stjepan Picek, Annelie Heuser, Alan Jovic, Shivam Bhasin, and Francesco Regazzoni
Big Picture: device, side-channel measurements, classifier
Profiling phase: device (training) → plaintext, side-channel measurements, labels → classifier (training) → profiled model
Attacking phase: device (attacking) → plaintext, side-channel measurements → classifier (attacking) → evaluation metric
Template attack: template building, then template evaluation + max likelihood; metrics: success rate, guessing entropy
Machine learning: ML training, then ML testing; metric: accuracy
keys
used
possible 8-bit values:
HW not HW not HW
that are “designed” to maximise accuracy
accuracy of 27%
not give any information for SCA
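The 27% figure can be reproduced by counting Hamming weights over all 256 possible 8-bit values. The following is a minimal sketch, assuming uniformly distributed intermediate values; the names `hw_counts` and `majority_accuracy` are illustrative and do not come from the slides.

```python
from collections import Counter

# Count how many of the 256 possible 8-bit values fall into each Hamming weight class.
hw_counts = Counter(bin(v).count("1") for v in range(256))
distribution = [hw_counts[hw] for hw in range(9)]
print(distribution)  # [1, 8, 28, 56, 70, 56, 28, 8, 1]

# A classifier that always predicts the most populated class (HW = 4)
# is correct for 70 out of 256 uniformly distributed values.
majority_class, majority_count = hw_counts.most_common(1)[0]
majority_accuracy = majority_count / 256
print(majority_class, round(majority_accuracy, 3))  # 4 0.273
```

Such a constant prediction is independent of the key, which is why the accuracy value alone says nothing useful for SCA.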
[Figure: Class 1 with 7 samples, Class 2 with 13 samples]
Undersampling: reduce every class to the size of the least populated class
the remaining samples are unused
[Figure: after undersampling, Class 1 with 7 samples, Class 2 with 7 samples]
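Undersampling to the least populated class can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function name `random_undersample`, the random selection, and the toy data (7 versus 13 samples, as in the figure) are assumptions.

```python
import random
from collections import defaultdict

def random_undersample(samples, labels, seed=0):
    """Keep, per class, as many randomly chosen samples as the least populated class has."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    n_keep = min(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        for x in rng.sample(xs, n_keep):  # sampling without replacement
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels

# Toy data: 7 samples in class 1, 13 samples in class 2.
samples = list(range(20))
labels = [1] * 7 + [2] * 13
under_samples, under_labels = random_undersample(samples, labels)
print(under_labels.count(1), under_labels.count(2))  # 7 7
```

Six of the 13 class 2 samples are discarded here; with binomially distributed HW classes the share of discarded data is far larger.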
Oversampling: randomly select samples from the original dataset until every class is as large as the most populated one
context comparable to other methods
some samples may not be selected at all
[Figure: after oversampling, Class 1 with "13" samples (selection counts 2, 3, 3, 2, 2, 1), Class 2 with 13 samples]
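Oversampling with replacement can be sketched as follows. This is a minimal illustration under assumptions: the function name `random_oversample` is hypothetical, and the smaller class is redrawn entirely with replacement, which is one reading of the figure (some samples appear several times, others not at all).

```python
import random
from collections import defaultdict

def random_oversample(samples, labels, seed=0):
    """Draw with replacement within each class until it matches the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    n_target = max(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        if len(xs) == n_target:
            drawn = list(xs)  # the largest class is kept as is
        else:
            # With replacement: repeated and never-selected samples are both possible.
            drawn = rng.choices(xs, k=n_target)
        out_samples.extend(drawn)
        out_labels.extend([y] * n_target)
    return out_samples, out_labels

# Toy data: 7 samples in class 1, 13 samples in class 2.
samples = list(range(20))
labels = [1] * 7 + [2] * 13
over_samples, over_labels = random_oversample(samples, labels)
print(over_labels.count(1), over_labels.count(2))  # 13 13
```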
SMOTE: generates synthetic minority class instances
from nearest neighbors (corresponding to Euclidean distance)
[Figure: after SMOTE, Class 1 with 13 samples, Class 2 with 13 samples]
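The idea of generating synthetic minority instances from nearest neighbors can be sketched in plain Python. This is a simplified illustration, not the implementation used for the experiments: the function name `smote`, the parameter `k`, and the 2-D toy points are assumptions.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# 7 minority samples (2-D toy points); 6 synthetic ones bring the class to 13.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
            (0.5, 0.5), (2.0, 1.0), (1.0, 2.0)]
new_points = smote(minority, n_new=6)
balanced_minority = minority + new_points
print(len(balanced_minority))  # 13
```

Unlike oversampling with replacement, every added point is new, although it always lies on a segment between two existing minority samples.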
SMOTE+ENN: Synthetic Minority Oversampling Technique with Edited Nearest Neighbor
removes samples whose class differs from that of multiple neighbors
[Figure: after SMOTE+ENN, Class 1 with 10 samples, Class 2 with 10 samples]
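The cleaning step of Edited Nearest Neighbor can be sketched as below. This is an illustrative sketch under assumptions: the function name `edited_nearest_neighbours` and `k=3` are hypothetical, and a sample is removed when its label disagrees with the majority of its neighbors, which is only one possible reading of "different from multiple neighbors".

```python
import math
from collections import Counter

def edited_nearest_neighbours(points, labels, k=3):
    """Drop every sample whose label disagrees with the majority label
    of its k nearest neighbours (Euclidean distance)."""
    kept_points, kept_labels = [], []
    for i, (p, y) in enumerate(zip(points, labels)):
        others = [j for j in range(len(points)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(p, points[j]))[:k]
        majority = Counter(labels[j] for j in nearest).most_common(1)[0][0]
        if majority == y:
            kept_points.append(p)
            kept_labels.append(y)
    return kept_points, kept_labels

# Two 1-D clusters plus one class 1 point sitting inside the class 2 cluster.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,), (5.1,), (5.2,), (5.3,), (5.15,)]
labels = [1, 1, 1, 1, 2, 2, 2, 2, 1]
clean_points, clean_labels = edited_nearest_neighbours(points, labels)
print(clean_labels)  # [1, 1, 1, 1, 2, 2, 2, 2]
```

In SMOTE+ENN this cleaning runs after the SMOTE oversampling step, which is why the resulting classes can end up smaller than the largest original class.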
the implementation / dataset / distribution to balance datasets
W board
(Rotating SBox Masking)
mask assumed known
Virtex-5 FPGA of a SASEBO GII evaluation board.
github: https://github.com/AESHD/AES_HD_Dataset
delay countermeasure => misaligned
microcontroller
github: https://github.com/ikizhvatov/randomdelays-traces
, TA:
stable profiles (lower #measurements for profiling)
dataset leads to better performance
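Success rate (SR): average estimated probability of success
Guessing entropy (GE): average secret key rank
both depend on the number of traces used in the attacking phase
both are averaged over a number of experiments
Accuracy: average estimated probability (percentage) of correct classification
averaged over a number of experiments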
No direct translation between accuracy and SR/GE
indication: if accuracy is high, GE/SR should "converge quickly"
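Success rate and guessing entropy can both be computed from the rank of the correct key in repeated experiments. The sketch below is illustrative: the function names and the example ranks are hypothetical, and rank 0 is assumed to denote the best guess.

```python
def success_rate(key_ranks):
    """Fraction of experiments in which the correct key is ranked first (rank 0)."""
    return sum(1 for r in key_ranks if r == 0) / len(key_ranks)

def guessing_entropy(key_ranks):
    """Average rank of the correct key over all experiments."""
    return sum(key_ranks) / len(key_ranks)

# Hypothetical ranks of the correct key (0 = best guess) in 5 independent
# experiments, each using the same number of attack traces.
key_ranks = [0, 3, 0, 12, 0]
print(success_rate(key_ranks))      # 0.6
print(guessing_entropy(key_ranks))  # 3.0
```

Repeating this for an increasing number of attack traces gives the usual SR and GE curves.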
Global acc vs class acc
function between class and key (e.g. when the class involves the HW)
correctly classifying the more unlikely values in the class may be more significant than classifying others
all class values
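The gap between global and per-class accuracy can be shown on a small example. This is a sketch with made-up labels; the function names `global_accuracy` and `per_class_accuracy` are illustrative.

```python
from collections import defaultdict

def global_accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples, computed separately per true class."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    return {c: hits[c] / totals[c] for c in totals}

# Hypothetical HW labels: class 4 dominates, and the classifier always predicts 4.
y_true = [4] * 7 + [3] * 2 + [0]
y_pred = [4] * 10
print(global_accuracy(y_true, y_pred))     # 0.7
print(per_class_accuracy(y_true, y_pred))  # {4: 1.0, 3: 0.0, 0: 0.0}
```

The global value looks acceptable although the rare classes, which narrow down the key the most under an HW model, are never recognized.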
Label vs fixed key prediction
than 1 trace
considered independently (along #measurements)
fixed key, accumulated over #measurements
low SR/GE
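The difference between label prediction and fixed key prediction can be illustrated with a deliberately tiny example. Everything here is an assumption made for illustration: a 2-bit key, the label HW(plaintext XOR key), and hand-written class probabilities standing in for a classifier's output.

```python
import math

def hw(v):
    """Hamming weight of a small non-negative integer."""
    return bin(v).count("1")

# Hypothetical 2-bit toy setting: label = HW(plaintext XOR key), true key = 2.
true_key = 2
plaintexts = [0, 1, 2, 3]
# Made-up classifier outputs: probabilities for HW classes 0, 1, 2 per trace.
# Class 1 always has the highest probability, as for a model biased
# towards the most populated class.
probs = [
    [0.25, 0.50, 0.25],
    [0.20, 0.45, 0.35],
    [0.35, 0.45, 0.20],
    [0.25, 0.50, 0.25],
]

# Label prediction: each trace is judged on its own.
true_labels = [hw(p ^ true_key) for p in plaintexts]
predicted = [max(range(3), key=lambda c: pr[c]) for pr in probs]
accuracy = sum(t == y for t, y in zip(true_labels, predicted)) / len(plaintexts)

# Fixed key prediction: log-probabilities are accumulated over all traces
# for every key guess, and the guesses are ranked by their total score.
scores = {
    k: sum(math.log(pr[hw(p ^ k)]) for p, pr in zip(plaintexts, probs))
    for k in range(4)
}
key_rank = sorted(scores, key=scores.get, reverse=True).index(true_key)
print(accuracy, key_rank)  # 0.5 0
```

Only half of the labels are predicted correctly, yet the accumulated scores rank the correct key first, because the probabilities of the non-predicted classes still carry information about the key.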
more details, formulas, explanations in the paper…
techniques than collect more imbalanced samples
evaluation! ✴ global vs class accuracy ✴ label vs fixed key prediction