Detecting Anomalous Computation



SLIDE 1

Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines

Pengfei Zou, Rong Ge
Clemson University

Ang Li, Kevin Barker
Pacific Northwest National Laboratory

ICPP2020

SLIDE 2

Overview

The new threat in HPC
  • Illicit workloads exploit powerful GPUs committed to HPC workloads

Our approach
  • Leverage identifiable patterns of HPC workloads
  • Treat illicit workload detection as a classification problem
  • Devise RNN models to infer workloads from high-level profiles

Contribution
  • Online illicit workload detection suitable for practical use
      • > 95% accuracy, with lightweight system-level profiling only
  • Techniques to handle data heterogeneity, irregularity and loss
  • Advanced RNN modeling for inference accuracy

SLIDE 3

Illicit Applications on HPC Systems

Illicit computations begin running on HPC systems
  • Crypto mining
  • Password cracking
  • Denial-of-service (DoS) attacks

Common characteristics
  • For-profit or malicious attacks instead of science
  • Resource intensive
      • Powerful GPU accelerators are ideal
  • Long execution time: days to weeks or longer

Risks and security issues to HPC
  • Mission-critical applications deprived of computing cycles
  • Data leaking, system damage, etc.
  • Empowered hacks and attacks

SLIDE 4

A Unique, New Threat

Penetrating login nodes poses the risks
  • HPC systems only protect login nodes

Authorized users can run illicit computations
  • Authorization and authentication are easily passed

Few barriers and guards exist
  • Due to performance priority in HPC systems
  • Little or no network traffic monitoring and host auditing

Computations are masked and offloaded to accelerators
  • CPU-side monitoring and detection measures would fail

Novel security measures are needed to detect illicit computation in HPC

SLIDE 5

Opportunities and Challenges

HPC workloads have unique patterns identifiable by ML
  • A small set of programs with specific resource usage patterns
  • Certain kernels and functions, e.g., FFT, BLAS

Accurate ML models use many HW counters as input
  • Large overhead for online detection
  • Intrusive to user applications

SLIDE 6

Our Approach

Online illicit workload detection
  • Illicit GPU computation detection as a classification problem
  • Lightweight, common system-level profiling for model input
  • Multiple input sequences for inference accuracy
  • Synergistic multi-RNNs to handle complex, heterogeneous inputs

[Figure labels: Periodic, Aperiodic]
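
One way to picture the multi-RNN idea is a separate recurrent branch for each input sequence, with the final states of the branches concatenated and passed to a single classification output. The following is a minimal pure-Python sketch of that forward pass, not the authors' model: the class names `TinyLSTM` and `FusedClassifier`, the hidden size, the feature counts, and the random (untrained) weights are all assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyLSTM:
    """Single-layer LSTM with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight row per gate unit: [input weights | recurrent weights | bias].
        width = input_size + hidden_size + 1
        self.weights = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(width)]
                   for _ in range(hidden_size)]
            for gate in ("input", "forget", "output", "candidate")
        }

    def _gate(self, name, z, activation):
        return [activation(sum(w * v for w, v in zip(row, z)))
                for row in self.weights[name]]

    def final_state(self, sequence):
        """Run the sequence through the cell and return the last hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            z = list(x) + h + [1.0]  # trailing 1.0 multiplies the bias weight
            i = self._gate("input", z, sigmoid)
            f = self._gate("forget", z, sigmoid)
            o = self._gate("output", z, sigmoid)
            g = self._gate("candidate", z, math.tanh)
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h

class FusedClassifier:
    """Two LSTM branches whose final states are concatenated and scored."""

    def __init__(self, periodic_features, event_features, hidden_size=4, seed=0):
        rng = random.Random(seed)
        self.periodic_branch = TinyLSTM(periodic_features, hidden_size, rng)
        self.event_branch = TinyLSTM(event_features, hidden_size, rng)
        self.head = [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden_size + 1)]

    def illicit_probability(self, periodic_seq, event_seq):
        fused = (self.periodic_branch.final_state(periodic_seq)
                 + self.event_branch.final_state(event_seq)
                 + [1.0])
        return sigmoid(sum(w * v for w, v in zip(self.head, fused)))

# Hypothetical inputs: 4 utilization features per tick, 3 features per event.
periodic = [[0.8, 0.9, 0.3, 0.5], [0.7, 0.9, 0.3, 0.6], [0.8, 1.0, 0.3, 0.5]]
events = [[0.0, 1.2, 1.0], [1.5, 0.4, 0.0]]
model = FusedClassifier(periodic_features=4, event_features=3)
probability = model.illicit_probability(periodic, events)
```

Because each branch consumes its own sequence, the periodic and event-based inputs can have different lengths and feature counts. With untrained weights the output is meaningless as a prediction; the sketch only shows how the branches are combined.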

SLIDE 7

Data Heterogeneity

Heterogeneity in data sequences
  • Varying sample losses in resource utilization sequences
  • Asynchronism between the types

Irregularity of event-based data sequence

SLIDE 8

Sample Losses in Utilization Data

nvidia-smi profiling loses samples
  • E.g., 30% on average

Losses depend on application and sampling interval
  • Different temporal information from different training apps
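
A loss figure like the one on this slide can be estimated by comparing the number of samples actually recorded with the number the requested sampling interval should have produced. The sketch below illustrates that arithmetic only; the function name `sample_loss_rate` and the example trace are hypothetical, and the slides do not say how the authors measured loss.

```python
def sample_loss_rate(timestamps, interval):
    """Fraction of expected samples missing from a periodic trace.

    timestamps: sorted arrival times of the samples that were recorded.
    interval:   the requested sampling period, in the same unit.
    """
    if len(timestamps) < 2:
        return 0.0
    span = timestamps[-1] - timestamps[0]
    expected = round(span / interval) + 1
    return max(0.0, 1.0 - len(timestamps) / expected)

# Hypothetical trace: 100 ms requested period, ticks at 300, 600, 700 ms lost.
recorded_ms = [0, 100, 200, 400, 500, 800, 900]
loss = sample_loss_rate(recorded_ms, interval=100)
```

Here ten samples were expected over the 900 ms span and seven arrived, so the estimated loss is about 0.3.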

SLIDE 9

LSTM Layers for Advanced Training

Split layers for the event-based driver runtime

Interpolation layer for the resource utilization sequences
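
The slides do not specify how the interpolation layer works. As one plausible illustration, lost samples in a utilization sequence can be filled by linear interpolation onto a regular time grid before the sequence is fed to a recurrent model. The function name `interpolate_to_grid` and the example power trace are hypothetical, and the sketch assumes strictly increasing timestamps.

```python
def interpolate_to_grid(samples, interval):
    """Linearly interpolate (time, value) samples onto a regular time grid.

    samples:  list of (time, value) pairs sorted by time, possibly with gaps.
    interval: spacing of the output grid, in the same unit as the times.
    """
    if not samples:
        return []
    start, end = samples[0][0], samples[-1][0]
    steps = int((end - start) // interval)
    filled = []
    right = 1  # index of the first sample at or after the current grid time
    for k in range(steps + 1):
        t = start + k * interval
        while right < len(samples) - 1 and samples[right][0] < t:
            right += 1
        if len(samples) == 1:
            filled.append((t, samples[0][1]))
            continue
        (t0, v0), (t1, v1) = samples[right - 1], samples[right]
        weight = (t - t0) / (t1 - t0)
        filled.append((t, v0 + weight * (v1 - v0)))
    return filled

# Hypothetical GPU power trace (ms, watts) with the 200 ms and 300 ms ticks lost.
power = [(0, 100.0), (100, 110.0), (400, 140.0), (500, 150.0)]
regular = interpolate_to_grid(power, interval=100)
```

After this step every utilization sequence has one value per tick, so traces with different loss patterns share the same time base.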

SLIDE 10

Model Training and Validation

Workloads
  • 83 authorized applications
      • Rodinia, Parboil, SHOC, PolyBench, exascale Proxy Apps, etc.
  • 17 unauthorized applications from GitHub and BitBucket
      • Crypto mining, password cracking, brute force attacking…

Data collection
  • Periodic resource utilization
      • Power, core utilization, memory footprint, memory bandwidth
  • Event-based driver runtime
      • Kernel events: starting time, duration, configuration
      • Data transfer events: starting time, latency, direction, bandwidth
  • HW performance counters for counterpart comparison

Three generations of GPUs: K40, P100, and V100
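
Periodic utilization data of this kind can be gathered with `nvidia-smi` in CSV query mode. The sketch below parses such output into numeric records. The mapping of the slide's metrics to the query fields `power.draw`, `utilization.gpu`, `memory.used` and `utilization.memory` is an assumption (memory-controller utilization is used here only as a rough stand-in for memory bandwidth), the 100 ms period is arbitrary, and the sample lines are made up for illustration.

```python
import csv
from datetime import datetime

# Assumed mapping from the slide's metrics to nvidia-smi query fields.
FIELDS = ["timestamp", "power.draw", "utilization.gpu",
          "memory.used", "utilization.memory"]

# Command that would produce such lines (not executed here).
COMMAND = ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
           "--format=csv,noheader,nounits", "-lms", "100"]

def parse_samples(lines):
    """Turn CSV lines into dicts with a datetime and float metrics."""
    samples = []
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) != len(FIELDS):
            continue  # skip blank or truncated lines
        try:
            record = {"timestamp": datetime.strptime(row[0], "%Y/%m/%d %H:%M:%S.%f")}
            for name, text in zip(FIELDS[1:], row[1:]):
                record[name] = float(text)
        except ValueError:
            continue  # skip rows with unparsable values such as "[N/A]"
        samples.append(record)
    return samples

# Made-up sample lines in the expected layout, for illustration only.
example = [
    "2020/08/17 10:15:30.100, 143.52, 97, 10240, 61",
    "2020/08/17 10:15:30.200, 145.10, 98, 10240, 63",
]
samples = parse_samples(example)
```

In a live setting the lines would come from the standard output of `COMMAND`, for example via `subprocess.Popen`; gaps between consecutive timestamps then reveal lost samples.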

SLIDE 11

Selected Evaluation Results

[Figure labels: Accuracy, False NR; vs. HMC based]
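
The figure labels suggest two reported metrics. Assuming "False NR" stands for false negative rate (the slides do not expand the abbreviation), both can be computed from per-run labels as sketched below. The function name and the label vectors are hypothetical and are not results from the deck.

```python
def accuracy_and_false_negative_rate(truth, predicted):
    """truth/predicted: sequences of booleans, True meaning 'illicit'."""
    pairs = list(zip(truth, predicted))
    correct = sum(t == p for t, p in pairs)
    illicit = sum(t for t, _ in pairs)
    missed = sum(t and not p for t, p in pairs)  # illicit runs labelled benign
    accuracy = correct / len(pairs)
    fnr = missed / illicit if illicit else 0.0
    return accuracy, fnr

# Hypothetical labels for eight runs, for illustration only.
truth = [True, True, False, False, False, True, False, False]
predicted = [True, False, False, False, True, True, False, False]
accuracy, fnr = accuracy_and_false_negative_rate(truth, predicted)
```

For a detector, the false negative rate matters separately from accuracy because a missed illicit workload keeps consuming GPU cycles unnoticed.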
SLIDE 12

Conclusion

A new threat in HPC
  • Illicit computation takes execution cycles and empowers attacks

Our proposed online detection
  • Lightweight profiling
  • Accurate detection with fused LSTMs using multiple data sequences

Our findings
  • Illicit workloads have different patterns from HPC workloads
  • Profiling multiple system-level metrics is sufficient for accurate detection
  • Fused RNNs are suitable for online detection