Fourier-Assisted Machine Learning of Hard Disk Drive Access Time Models
Adam Crume (1), Carlos Maltzahn (1), Lee Ward (2), Thomas Kroeger (2), Matthew Curry (2), Ron Oldfield (2), Patrick Widener (2)
(1) University of California, Santa Cruz, {adamcrume, carlosm}@cs.ucsc.edu
(2) Sandia National Laboratories, Livermore, CA, {lee, tmkroeg, mlcurry, raoldfi, pwidene}@sandia.gov
November 18, 2013
Crume, Maltzahn, Ward, Kroeger, Curry, Oldfield, Widener: Hard Drive Access Times

Predicting hard drive performance
Use cases: system simulations; file system design; quality of service / real-time guarantees (Anna Povzner's work with Fahrrad).

Complexity, part 1
Rotational latency = 8.33 ms max at 7200 RPM
[Figure: disk platter with spindle, arm, and read/write head; Track 0 (2500 sectors) and Track 1 (2400 sectors); numbered sectors; skew between tracks.]
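The 8.33 ms figure is simply the time for one full revolution at 7200 RPM. A minimal sketch of that arithmetic follows; the function names are illustrative, and the half-revolution average assumes the target sector is equally likely to be anywhere on the track, which the slide does not state.

```python
def max_rotational_latency_ms(rpm: float) -> float:
    """Worst case: the head just missed the sector and waits one full revolution."""
    return 60_000.0 / rpm  # 60,000 ms per minute divided by revolutions per minute


def mean_rotational_latency_ms(rpm: float) -> float:
    """Assumption: target position is uniform on the track, so the mean is half a turn."""
    return max_rotational_latency_ms(rpm) / 2.0


print(f"{max_rotational_latency_ms(7200):.2f} ms max at 7200 RPM")
```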

Complexity, part 2
[Figure: two serpentine track layouts on a platter around the spindle. Layout 1: tracks 8, 7, 6, 2, 1, 0 in one group and tracks 9, 10, 11, 3, 4, 5 in the other. Layout 2: tracks 8, 7, 6, 2, 1, 0 in one group and tracks 11, 10, 9, 5, 4, 3 in the other.]

Complexity, part 3
Additionally: queueing; scheduling; caching; readahead; write-back.

Machine learning of hard drive performance
Short-term goals (this presentation): automated; fast.
Long-term goals (future work): flexible; future-proof; device-independent.

Offline vs. online machine learning
Offline: separate train and test phases. Training: requests → device → training data → model. After the model is frozen: requests → model → predictions.
Online: feedback from the real device. Requests go to both the model and the device; the model's predictions are compared with the device, and the result is fed back to the model.
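The difference between the two modes can be sketched with a deliberately trivial model. Everything here is hypothetical: `MeanLatencyModel` just predicts the running mean of the latencies it has seen, and `device` is any callable that returns a measured latency. It is not the model from the talk; it only shows where the freeze and the feedback loop sit.

```python
class MeanLatencyModel:
    """Toy stand-in for a learned model: predicts the mean latency seen so far."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def predict(self, request):
        return self.mean

    def update(self, request, latency_ms):
        # Incremental mean update.
        self.count += 1
        self.mean += (latency_ms - self.mean) / self.count


def run_offline(model, training_data, test_requests):
    """Train on recorded (request, latency) pairs, then freeze and predict."""
    for request, latency_ms in training_data:
        model.update(request, latency_ms)
    # From here on the model is frozen: no further updates.
    return [model.predict(request) for request in test_requests]


def run_online(model, requests, device):
    """Predict each request, then learn from the real device's answer."""
    predictions = []
    for request in requests:
        predictions.append(model.predict(request))
        model.update(request, device(request))  # feedback from the device
    return predictions
```

In the offline case the device is only needed while collecting training data; in the online case it must stay available for as long as the model runs.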

Offline vs. online use cases
System simulations (offline); file system design (offline); quality of service / real-time guarantees (offline/online).

Existing machine learning approaches: demerit
[Figure: actual CDF vs. predicted CDF; x-axis: latency (ms), 0 to 10; y-axis: percentile, 0 to 1.]
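The slide gives no formula for the demerit figure. The sketch below assumes the definition commonly used in disk modeling, the root-mean-square horizontal distance between the actual and predicted latency CDFs, and further assumes both samples have the same size so that order statistics line up at the same percentiles.

```python
import math


def demerit(actual_ms, predicted_ms):
    """RMS horizontal distance between two empirical CDFs (assumed definition).

    With equal-sized samples, the k-th smallest actual latency and the k-th
    smallest predicted latency sit at the same percentile, so the horizontal
    distance at that percentile is just their difference.
    """
    if len(actual_ms) != len(predicted_ms) or not actual_ms:
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(actual_ms), sorted(predicted_ms))
    return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(actual_ms))
```

Because both samples are sorted independently, the metric compares distributions only: a model that assigns the right latencies to the wrong requests still scores a demerit of zero.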

Existing machine learning approaches: average in time slice
[Figure: latency (ms), 0 to 20, vs. wall clock time (s), 0 to 10, divided into time slices 0 to 4.]
For each time slice: requests r_0, r_1, ..., r_n → prediction_0, prediction_1, ..., prediction_n → average → compare with real average.
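A minimal sketch of this evaluation scheme, assuming each request carries a wall-clock timestamp and that slices have a fixed length; the helper names are made up for illustration.

```python
from collections import defaultdict


def slice_averages(timestamps_s, values_ms, slice_len_s):
    """Mean value per time slice; slice index is floor(timestamp / slice length)."""
    buckets = defaultdict(list)
    for t, v in zip(timestamps_s, values_ms):
        buckets[int(t // slice_len_s)].append(v)
    return {k: sum(vs) / len(vs) for k, vs in sorted(buckets.items())}


def per_slice_error(timestamps_s, actual_ms, predicted_ms, slice_len_s):
    """Predicted average minus real average, for each slice."""
    real = slice_averages(timestamps_s, actual_ms, slice_len_s)
    pred = slice_averages(timestamps_s, predicted_ms, slice_len_s)
    return {k: pred[k] - real[k] for k in real}


# Two slices of two requests each. In slice 0 both predictions are off by
# 6 ms, yet the slice averages agree, so the reported error is zero.
errors = per_slice_error(
    timestamps_s=[0.5, 1.5, 2.5, 3.0],
    actual_ms=[2.0, 8.0, 5.0, 5.0],
    predicted_ms=[8.0, 2.0, 5.0, 7.0],
    slice_len_s=2.0,
)
print(errors)
```

The first slice shows why an averaged metric says little about individual requests: opposite per-request errors cancel inside the slice.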

Existing machine learning approaches: predict average
[Figure: latency (ms), 0 to 20, vs. wall clock time (s), 0 to 10, divided into time slices 0 to 4.]
For each time slice: requests r_0, r_1, ..., r_n → aggregate → predict average → compare with real average.

Existing machine learning approaches: limitations
All of them aggregate. None predicts individual latencies with low error. The hard part? Access times.

Workload
Characteristics: random; read-only; single-sector; full utilization; first serpentine.
Minimizes: caching; readahead; write-back; transfer time; request arrival time sensitivity; track length variation.
The workload emphasizes access time (which is a hard problem by itself) and de-emphasizes everything else. Other workloads are future work.

Access time breakdown
[Figure: access time (ms), 0 to 35, vs. distance from sector 0, 0 to 1e+09 (synthetic data); labeled components: rotational latency, short seek latency, long seek latency.]

Access times: unpredictable?
Why are access times hard to predict? Rotational layout; serpentines; sector sparing; skew.
What do these have in common? Periodicity! Most machine learning algorithms cannot directly predict periodic functions well.

Access time function
[Figure: heat map of access time (ms), 0 to 9, over start sector (a) and end sector (b), each from 0 to 10000.]
The full table is 1 billion by 1 billion entries. It would take approximately 500 million years to capture the data and 3.5 exabytes to store it. Extremely sparse sampling is required, and access times must be computed on the fly.
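The scale of the full table can be checked with back-of-envelope arithmetic. The per-entry figures below are not stated on the slide; they are what the slide's totals imply, assuming a decimal exabyte (10^18 bytes) and a 365.25-day year.

```python
SECTORS = 1_000_000_000                      # 1 billion start and end sectors
entries = SECTORS * SECTORS                  # one entry per (start, end) pair

SECONDS_PER_YEAR = 365.25 * 24 * 3600        # assumption: Julian year
capture_seconds = 500e6 * SECONDS_PER_YEAR   # slide: about 500 million years
storage_bytes = 3.5e18                       # slide: 3.5 exabytes (decimal, assumed)

ms_per_entry = 1000.0 * capture_seconds / entries
bytes_per_entry = storage_bytes / entries

print(f"{entries:.0e} entries")
print(f"about {ms_per_entry:.1f} ms per measurement implied")
print(f"{bytes_per_entry:.1f} bytes per entry implied")
```

The implied time per measurement, on the order of one disk access, is consistent with capturing one entry per request.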

Input augmentation
Key idea 1 of 2: add sines and cosines to inputs

$$
\begin{pmatrix} a \\ b \end{pmatrix}
\rightarrow
\begin{pmatrix}
a \\
\sin(2\pi a / p_1) \\
\cos(2\pi a / p_1) \\
\sin(2\pi a / p_2) \\
\cos(2\pi a / p_2) \\
\vdots \\
b \\
\sin(2\pi b / p_1) \\
\cos(2\pi b / p_1) \\
\sin(2\pi b / p_2) \\
\cos(2\pi b / p_2) \\
\vdots
\end{pmatrix}
$$

($a$ is the start sector, $b$ is the end sector)
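A minimal sketch of this augmentation in plain Python. The function name `augment` and the example period are illustrative; in the talk the periods p_1, p_2, ... come from the Fourier analysis, which is not reproduced here.

```python
import math


def augment(a, b, periods):
    """Expand (start sector a, end sector b) with one sine/cosine pair per period.

    Output order follows the slide: a, then sin/cos of a for each period,
    then b, then sin/cos of b for each period.
    """
    features = []
    for sector in (a, b):
        features.append(float(sector))
        for p in periods:
            angle = 2.0 * math.pi * sector / p
            features.append(math.sin(angle))
            features.append(math.cos(angle))
    return features


# With a hypothetical period of 100 sectors, sector 25 sits a quarter of the
# way through the period, so its sine feature is 1 and its cosine is about 0.
print(augment(0, 25, periods=[100]))
```

Two sectors exactly one period apart get the same sine and cosine features, which gives a learner that struggles with periodic functions a direct handle on the periodic structure.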

Fourier transform
Key idea 2 of 2: search on diagonal to limit computation
[Figure: magnitude of the 2D Fourier transform $|\hat{f}(u, v)|$, normalized to 0 to 1, for $u$ and $v$ from -200 to 200.]
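One way to read this idea: rather than evaluating the transform at every (u, v) pair, evaluate it only at pairs on a diagonal and look for peaks there. The sketch below is an assumption-laden illustration, not the authors' procedure. It estimates coefficients by direct summation over sparse samples, takes the diagonal to be v = -u (the `sign` parameter switches to v = u, since the slide does not say which), and uses a synthetic access-time function that depends only on b - a.

```python
import cmath
import math
import random


def diagonal_spectrum(samples, n_sectors, max_freq, sign=-1):
    """Estimate |f^(k, sign*k)| for k = 1..max_freq from (a, b, value) samples."""
    magnitudes = {}
    for k in range(1, max_freq + 1):
        total = 0j
        for a, b, value in samples:
            phase = -2.0 * math.pi * k * (a + sign * b) / n_sectors
            total += value * cmath.exp(1j * phase)
        magnitudes[k] = abs(total) / len(samples)
    return magnitudes


def strongest_period(samples, n_sectors, max_freq, sign=-1):
    """Period (in sectors) of the largest peak found on the diagonal."""
    magnitudes = diagonal_spectrum(samples, n_sectors, max_freq, sign)
    best_k = max(magnitudes, key=magnitudes.get)
    return n_sectors / best_k


# Synthetic demo: a periodic "access time" with a hidden period of 50 sectors,
# sampled at 500 random (start, end) pairs out of 1000 x 1000 possible pairs.
N_SECTORS = 1000
HIDDEN_PERIOD = 50
rng = random.Random(0)
samples = []
for _ in range(500):
    a = rng.randrange(N_SECTORS)
    b = rng.randrange(N_SECTORS)
    samples.append((a, b, math.cos(2.0 * math.pi * (b - a) / HIDDEN_PERIOD)))

print(strongest_period(samples, N_SECTORS, max_freq=40))
```

Searching one diagonal costs `max_freq` coefficient estimates instead of roughly `(2 * max_freq + 1) ** 2` for the full grid of (u, v) pairs.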

Decision trees
[Figure: plot with both axes from -1.5 to 1.5.]

Interdependence
[Figure: plot with both axes from -1 to 1.]

Neural net basics
$$y = f\left(\sum_i w_i x_i + b\right)$$
[Figure: a neuron with inputs $x_1$, $x_2$, $x_3$ and output $y$.]
Usually, $f(x) = \tanh(x)$ (or similar). The final output may use $f(x) = x$.
Training: given input $x_i$ and desired output $y^*$, adjust $w_i$ and $b$ such that $y = y^*$.
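The neuron equation translates directly into code. The slide does not say how the weights are adjusted; the `train_step` function below assumes plain gradient descent on squared error for a tanh neuron, which is one common choice and is labeled here as an assumption.

```python
import math


def neuron(inputs, weights, bias, activation=math.tanh):
    """y = f(sum_i w_i * x_i + b)."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_step(inputs, weights, bias, target, learning_rate=0.1):
    """One gradient-descent step on 0.5 * (y - target)^2 (assumed training rule).

    Only valid for the default tanh activation, whose derivative is 1 - y^2.
    """
    y = neuron(inputs, weights, bias)
    grad = (y - target) * (1.0 - y * y)
    new_weights = [w - learning_rate * grad * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * grad
    return new_weights, new_bias


# Nudge a two-input neuron until its output is close to the desired 0.5.
weights, bias = [0.0, 0.0], 0.0
for _ in range(200):
    weights, bias = train_step([1.0, -1.0], weights, bias, target=0.5)
print(neuron([1.0, -1.0], weights, bias))
```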

Flat neural net
[Figure: a single network taking inputs $a$, $\sin(2\pi a / p_1)$, $\cos(2\pi a / p_1)$, ..., $b$, $\sin(2\pi b / p_1)$, $\cos(2\pi b / p_1)$, ... and producing the access time.]
($a$ is the start sector, $b$ is the end sector)

Neural net with shared weights
[Figure: Subnet 1 takes $a$, $\sin(2\pi a / p_1)$, $\cos(2\pi a / p_1)$, ...; Subnet 2 takes $b$, $\sin(2\pi b / p_1)$, $\cos(2\pi b / p_1)$, ...; the outputs of both subnets feed the access time output.]
($a$ is the start sector, $b$ is the end sector)
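A minimal forward-pass sketch of such a network, under two assumptions the slide does not spell out: the two subnets use the same weight matrix (as the title suggests), and their outputs are combined by a single linear output layer. All names and layer sizes are hypothetical.

```python
import math


def sector_features(sector, periods):
    """Raw sector number plus one sine/cosine pair per period."""
    feats = [float(sector)]  # raw value as on the slide; a real model would likely rescale it
    for p in periods:
        angle = 2.0 * math.pi * sector / p
        feats += [math.sin(angle), math.cos(angle)]
    return feats


def dense(inputs, weights, biases, activation):
    """Fully connected layer: one output per weight row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


def shared_weight_net(a, b, periods, subnet_w, subnet_b, out_w, out_b):
    """Apply the same subnet to a's and b's features, then a linear output layer."""
    hidden_a = dense(sector_features(a, periods), subnet_w, subnet_b, math.tanh)
    hidden_b = dense(sector_features(b, periods), subnet_w, subnet_b, math.tanh)  # same weights
    return dense(hidden_a + hidden_b, out_w, out_b, lambda z: z)[0]
```

Sharing the subnet means the periodic structure only has to be learned once, instead of separately for the start sector and the end sector, at the cost of fewer free parameters than the flat network.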
