SLIDE 1

Platform-Agnostic Lightweight Deep Learning for Garbage Collection Scheduling in SSDs

Junhyeok Jang, Donghyun Gouk, Jinwoo Shin, Myoungsoo Jung

Computer Architecture and Memory systems Laboratory (CAMELab)

SLIDE 2

Motivation

[Figure: Host issuing requests to an SSD that is performing Garbage Collection. Labels: NAND Flash density (pages / block), More GC overhead, Delay.]

How to hide GC latency?

  • Let’s perform GCs at user idle times!

How long will the user idle times be?

SLIDE 3

Hiding GC latency: Background GC

Common Assumption: Storage won’t be touched in the near future!

[Figure: SSD timeline of I/O, Wait, and Garbage Collection over Time. Histogram of # Request (1 to 1k) against Idle Time (10us to 10s), with the Threshold marked.]
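The common assumption above amounts to a rule-based scheduler: wait for a fixed threshold after the last I/O, then start GC. The following is a minimal sketch of that rule, not the scheduler evaluated in the talk. The function name, the threshold, and the GC duration are hypothetical, and request service time is ignored for simplicity.

```python
def simulate_threshold_gc(arrivals_us, threshold_us, gc_duration_us):
    """Replay request arrival times against a fixed-threshold background GC rule.

    Returns (gc_started, gc_collisions): how many idle gaps triggered GC, and
    how many of those GCs were still running when the next request arrived.
    """
    gc_started = 0
    gc_collisions = 0
    for prev, nxt in zip(arrivals_us, arrivals_us[1:]):
        idle = nxt - prev
        if idle <= threshold_us:
            continue  # the gap ended before the threshold expired; no GC
        gc_started += 1
        # GC begins at prev + threshold and runs for gc_duration_us.
        if prev + threshold_us + gc_duration_us > nxt:
            gc_collisions += 1  # the next request is delayed by GC
    return gc_started, gc_collisions


# Hypothetical arrival times in microseconds.
arrivals = [0, 50, 5_000, 5_100, 30_000]
started, collided = simulate_threshold_gc(
    arrivals, threshold_us=1_000, gc_duration_us=10_000
)
print(started, collided)
```

A collision here corresponds to the case where the assumption fails: the device looked idle, but a request arrived before GC finished.
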

SLIDE 4

Hiding GC latency: Background GC

[Figure: SSD timeline of I/O, Wait, and Garbage Collection over Time, where an arriving Request sees an I/O Delay! Histograms of # Request against Idle Time (10us to 10s) for the Assumption and for the Real workload*, with the Threshold marked.]

*Real workload from MS Production Server (https://trace.camelab.org/)

SLIDE 5

GC-Tutor

DNN-based GC scheduler with a lightweight online learning mechanism

  • Precisely predicts future request arrivals
  • Schedules GC in user-invisible time
  • Consistently accurate regardless of workload

SLIDE 6

DNN-based GC Scheduling

I/O Pattern → DNN-based Idle Time Prediction → Background GC

Problem: A fixed DNN model fails to predict unseen workloads

DNN Model

  • Input: Timestamp, R/W, seq/rand, size
  • Output: Idle time
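To make the model's inputs and output concrete, here is a minimal sketch of turning a block I/O trace into (features, idle time) training samples. The `Request` record layout, the use of inter-arrival time as the timestamp feature, and the sequentiality test are assumptions for illustration; the slides do not specify how the features are encoded.

```python
from typing import List, NamedTuple, Tuple


class Request(NamedTuple):
    """Hypothetical trace record; field names are illustrative."""
    timestamp_us: int
    is_write: bool
    lba: int
    size: int  # in sectors


def build_samples(trace: List[Request]) -> List[Tuple[List[float], float]]:
    """Pair each request's features with the idle time that follows it."""
    samples = []
    for i in range(1, len(trace) - 1):
        prev, cur, nxt = trace[i - 1], trace[i], trace[i + 1]
        # Assumed definition: sequential if it starts where the previous ended.
        sequential = cur.lba == prev.lba + prev.size
        features = [
            float(cur.timestamp_us - prev.timestamp_us),  # time since last request
            1.0 if cur.is_write else 0.0,                 # R/W
            1.0 if sequential else 0.0,                   # seq/rand
            float(cur.size),                              # size
        ]
        idle_time = float(nxt.timestamp_us - cur.timestamp_us)
        samples.append((features, idle_time))
    return samples


trace = [
    Request(0, False, 100, 8),
    Request(10, False, 108, 8),
    Request(500, True, 4000, 16),
    Request(520, True, 4016, 16),
]
for features, idle in build_samples(trace):
    print(features, idle)
```
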

SLIDE 7

DNN-based GC Scheduling

Problem: A fixed DNN model fails to predict unseen workloads

Online Learning!

I/O Pattern → DNN-based Idle Time Prediction → Background GC

1D-CNN Model

  • Input: Timestamp, R/W, seq/rand, size
  • Output: Idle time
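As a rough illustration of what a 1D-CNN over a window of recent requests computes, the sketch below runs one convolution layer, ReLU, global average pooling, and a linear head in plain Python. The layer sizes, weights, and function names are hypothetical; the slides do not describe the actual GC-Tutor architecture.

```python
def conv1d(seq, kernels, bias):
    """Valid 1D convolution over time followed by ReLU.

    seq:     list of T feature vectors, each of length C.
    kernels: list of K filters, each a list of W weight vectors of length C.
    bias:    list of K biases.
    Returns T - W + 1 output vectors of length K.
    """
    width = len(kernels[0])
    out = []
    for t in range(len(seq) - width + 1):
        row = []
        for k, kernel in enumerate(kernels):
            acc = bias[k]
            for w in range(width):
                for c in range(len(seq[t + w])):
                    acc += kernel[w][c] * seq[t + w][c]
            row.append(max(0.0, acc))  # ReLU
        out.append(row)
    return out


def predict_idle(seq, kernels, bias, head_w, head_b):
    """Conv layer, global average pool over time, then a linear output."""
    feats = conv1d(seq, kernels, bias)
    pooled = [sum(col) / len(feats) for col in zip(*feats)]
    return sum(w * p for w, p in zip(head_w, pooled)) + head_b


# Toy input: 3 time steps, 2 features each; one filter of width 2.
seq = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
kernels = [[[1.0, 0.0], [1.0, 0.0]]]
print(predict_idle(seq, kernels, bias=[0.0], head_w=[2.0], head_b=1.0))
```

The convolution slides along the time axis only, so each filter sees a short run of consecutive requests, which is what distinguishes it from a fully connected model over a single request.
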

SLIDE 8

Lightweight Online Learning

Offline: Meta Learning* on I/O traces
Online: Online Learning

Naïve: Takes more than a few hours. Infeasible!

[Figure: Accuracy (%) of Deeplearn and GCTutor on wdev, prxy, and stg.]

*Chelsea Finn, et al., Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017
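The offline/online split can be illustrated with a toy meta-learning loop. This sketch uses a first-order approximation of MAML (it drops the second-order term of the original algorithm) on a scalar model y = w * x, standing in for the real network. The tasks, learning rates, and function names are all hypothetical and only show why a meta-learned starting point adapts to a new workload in a single cheap step.

```python
def grad(w, data):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def adapt(w, data, inner_lr, steps=1):
    """Online phase: a few gradient steps on data from the current workload."""
    for _ in range(steps):
        w -= inner_lr * grad(w, data)
    return w


def meta_train(tasks, w=0.0, inner_lr=0.1, outer_lr=0.1, epochs=200):
    """Offline phase: find an initial w that adapts well to every task."""
    for _ in range(epochs):
        meta_grad = 0.0
        for support, query in tasks:
            w_task = adapt(w, support, inner_lr)
            # First-order approximation: use the query gradient at w_task.
            meta_grad += grad(w_task, query)
        w -= outer_lr * meta_grad / len(tasks)
    return w


def make_task(slope):
    data = [(1.0, slope), (2.0, 2.0 * slope)]
    return data, data  # (support set, query set)


w_meta = meta_train([make_task(1.0), make_task(3.0)])
unseen, _ = make_task(2.5)  # a workload not used in meta-training
print(adapt(w_meta, unseen, inner_lr=0.1), adapt(0.0, unseen, inner_lr=0.1))
```

In this toy setting, one adaptation step from the meta-learned weight lands closer to the unseen task's slope than one step from an untrained weight.
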

SLIDE 9

Evaluation

Traces from CAMELab Trace (https://trace.camelab.org/)

[Figure: Accuracy (%) on the train set (CFS, 24HR, BS, DDR, DAP, online, webmail, rsrch) and on the unseen set (prn, proj, prxy, stg, wdev). Legend: Accuracy, False short, False long. Left: GCTutor. Right: Deep.]

GC-Tutor can accurately predict idle time

  • Consistently higher accuracy on trained workloads
  • Significantly higher accuracy on unseen workloads
  • prxy, stg: Very different idle time distribution compared to trained workloads

GC-Tutor can reduce GC-induced delays by 82.4% on average, compared to a rule-based GC scheduler

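The evaluation reports accuracy alongside false short and false long rates. The slides do not define these terms, so the sketch below assumes one plausible reading: an idle period counts as "long" if it can fit a GC, a false short is a long period predicted as short (a missed GC opportunity), and a false long is a short period predicted as long (GC would delay a request). The function name and the sample values are hypothetical.

```python
def score(predicted_us, actual_us, gc_duration_us):
    """Classify each prediction as correct, false short, or false long."""
    n = len(actual_us)
    correct = false_short = false_long = 0
    for p, a in zip(predicted_us, actual_us):
        pred_long = p >= gc_duration_us
        real_long = a >= gc_duration_us
        if pred_long == real_long:
            correct += 1
        elif real_long:
            false_short += 1  # GC could have run, but was not scheduled
        else:
            false_long += 1   # GC was scheduled into a gap that was too short
    return {
        "accuracy": correct / n,
        "false_short": false_short / n,
        "false_long": false_long / n,
    }


# Hypothetical predicted and measured idle times in microseconds.
predicted = [500, 20_000, 15_000, 100]
actual = [800, 30_000, 2_000, 12_000]
print(score(predicted, actual, gc_duration_us=10_000))
```

Under this reading, false long is the costlier error for latency, since it is the case that produces a user-visible GC delay.
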
SLIDE 10

Conclusion: GC-Tutor

DNN-based GC scheduler

  • Accurate request arrival prediction using DNN model
  • Meta learning-based lightweight online learning mechanism

Making GC overhead invisible to users!

[Figure: Offline Meta Learning on I/O traces, followed by Online Learning. Accuracy (%) of GCTutor and Deeplearn on wdev, prxy, and stg.]

SLIDE 11

Thank you!

Junhyeok Jang, Electrical Engineering, KAIST