Manylogs: Improving CMR/SMR Disk Bandwidth & Latency



slide-1
SLIDE 1

Manylogs

Improving CMR/SMR Disk Bandwidth & Latency

Tiratat Patana-anake†, Vincentius Martin†, Nora Sandler, Cheng Wu, and Haryadi S. Gunawi

slide-2
SLIDE 2 Manylogs @ MSST ‘16 2

User 1 (big read): gets 100% of the read bandwidth

slide-3
SLIDE 3

Fair Share

User 1 (big read) and User 2 (big read): each gets 50% of the read bandwidth

slide-4
SLIDE 4

Users 1, 2, 3, ..., N: each gets 1/N of the bandwidth. Still fair!

slide-5
SLIDE 5

Our

User 1 (big read): Oh no! 5% bandwidth?!
User 2: small durable writes

slide-6
SLIDE 6

Our m

User 1 (big read): More bandwidth, please!
User 2 (small durable writes): Faster latency, please!

slide-7
SLIDE 7

Data Journaling vs. Ordered Journaling

[Figure: disk layout with the journal (J), a big read, small writes, and the seek to the journal]

slide-9
SLIDE 9

Data Journaling

Ordered Journaling

Problems with Current Journaling

slide-10
SLIDE 10

Problems with Current Journaling

[Figure: Ordered Journaling vs. Data Journaling, showing the first write and the second write relative to the journal (J)]

slide-11
SLIDE 11

Introducing Manylogs

[Figure: a single log vs. Manylogs, with a 10 MB log (J) reserved every 100 MB]

slide-12
SLIDE 12

Manylogs

[Figure: a big read and small writes on a disk with many scattered logs (J)]

Small writes are made durable in the nearest log without seeking

slide-13
SLIDE 13

Manylogs

  • Reserved log spaces uniformly across the disk
      • 10 MB every 100 MB
  • Follow the disk head (last big I/O)
  • Redirect small writes (e.g. ≤ 256 KB)
      • Nearest log: the log closest to the last big I/O
  • Sequential writes are left untouched
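The redirect rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `nearest_log` and `Redirector` are hypothetical, each log is assumed to sit at the start of its 100 MB region, and any write above the 256 KB threshold is treated as a write that is left untouched.

```python
MB = 1024 * 1024
KB = 1024

REGION_SIZE = 100 * MB      # one reserved log every 100 MB (from the slide)
SMALL_WRITE_MAX = 256 * KB  # example small-write threshold (from the slide)


def nearest_log(head: int, disk_size: int) -> int:
    """Start offset of the reserved log closest to `head`.

    Assumption: each log sits at the start of its region.
    """
    regions = disk_size // REGION_SIZE
    index = (head + REGION_SIZE // 2) // REGION_SIZE  # round to nearest region start
    return min(index, regions - 1) * REGION_SIZE


class Redirector:
    """Tracks the last big I/O and routes writes (hypothetical helper)."""

    def __init__(self, disk_size: int) -> None:
        self.disk_size = disk_size
        self.head = 0  # byte offset where the last big I/O ended

    def big_io(self, offset: int, length: int) -> None:
        """Record a big read or write so later small writes can follow the head."""
        self.head = offset + length

    def route_write(self, offset: int, length: int) -> tuple:
        if length <= SMALL_WRITE_MAX:
            # Small write: make it durable in the log nearest the disk head.
            return ("log", nearest_log(self.head, self.disk_size))
        # Larger write: left untouched; it also moves the head.
        self.big_io(offset, length)
        return ("in-place", offset)
```

For example, after `Redirector(1000 * MB).big_io(130 * MB, 100 * MB)` the head is at 230 MB, so a 4 KB write aimed at offset 700 MB is routed to the log at 200 MB instead of forcing a long seek.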

slide-14
SLIDE 14

Manylogs

  • Increased read throughput
  • Reduced write latency

slide-15
SLIDE 15

Where are logs on the disk?

The log space = whole platter

slide-17
SLIDE 17

Same cylinder = no seek!

Log for others in the same cylinder

slide-18
SLIDE 18

Disk

  • User 1: 128 MB sequential reads; metric: ratio of max read bandwidth
  • User 2: 4 KB random writes; metric: latency (ms)
      • At different intensities: 40, 80, 160, and 320 writes/s
  • Ordered vs. Data vs. Adaptive vs. Manylogs
slide-19
SLIDE 19

Adaptive Journaling

  • Middle ground between ordered journaling and data journaling
  • Single-log design
  • Prabhakaran et al., ATC ‘05

slide-20
SLIDE 20

Results

[Figure: ratio of max bandwidth (0% to 100%) vs. random writes per second (40, 80, 160, 320 IOPS) for Ordered, Adaptive, Data, and Manylogs]

slide-21
SLIDE 21

Results

[Figure: ratio of max bandwidth vs. random writes per second (IOPS), annotated: 57% at 40 IOPS, 5% at 320 IOPS]

slide-22
SLIDE 22

Results

[Figure: ratio of max bandwidth vs. random writes per second (IOPS), annotated: 73% at 40 IOPS, 9% at 320 IOPS]

slide-24
SLIDE 24

[Figure: ratio of max bandwidth vs. random writes per second (IOPS) for Ordered, Adaptive, Data, and Manylogs]

  • Manylogs gives the most bandwidth
  • 50% of max bandwidth at extreme IOPS

Hey! What about my latency?

slide-25
SLIDE 25

Results

[Figure: average sync latency (100 to 600 ms) vs. random writes per second (40, 80, 160, 320 IOPS) for Ordered, Adaptive, Data, and Manylogs]

slide-29
SLIDE 29

Results

[Figure: average sync latency (ms) vs. random writes per second (IOPS) for Ordered, Adaptive, Data, and Manylogs]

WOW! Fast latency at extreme IOPS!

slide-30
SLIDE 30

Results

[Figures: ratio of max bandwidth and average sync latency (ms), each vs. random writes per second (IOPS), for Ordered, Adaptive, Data, and Manylogs]

slide-31
SLIDE 31

Results

slide-32
SLIDE 32

  • User 1: 128 MB sequential reads; metric: ratio of max read bandwidth
  • User 2: “fileserver”; metric: latency (ms)
      • Using Filebench
      • Multi-threaded
      • 2, 4, 8 instances
slide-33
SLIDE 33

[Figures: ratio of max bandwidth and operation latency (ms) vs. number of fileserver instances (2, 4, 8) for Ordered, Data, and Manylogs]

Manylogs provides the best outcomes!

slide-34
SLIDE 34

Checkpointing

Manylogs

  • “Lazy” or “off-hours”
  • Rarely full because only small writes are redirected
  • Log swapping

Data Journaling

  • Periodically
      • Usually every 5 secs
  • Journal can fill up fast because all writes are in the journal!

slide-35
SLIDE 35

Log Swapping

[Figure: logs (J) in a hot area and a cold area of the disk, with the big read, the small writes, and the seek to the journal]

slide-36
SLIDE 36

Integrations

  • File System (MLFS)
      • Durability-Only Mode (O_DUR)
  • SMR Disk (MLSMR)
  • RAID

slide-37
SLIDE 37

Cassandra Write Path

[Figure: writes flow to the memtable (memory) and the commit log (disk); SSTables are on disk]

slide-39
SLIDE 39

Cassandra Write Path

[Figure: writes flow to the memtable (memory) and the commit log (disk); a triggered flush writes the memtable to an SSTable (disk)]

  • Commit log: requires fast durability; it is a temp file, so no location is needed
  • Ideally a SYNC write, but in practice a background write, e.g. every 10 seconds

Random writes are the problem
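To make the trade-off on this slide concrete, here is a toy Python model of the write path. It is a sketch under assumptions, not Cassandra's actual code: the class `WritePath`, its fields, and the `flush_threshold` parameter are hypothetical, and discarding commit-log records after a flush is a simplification. The point it illustrates is that with a background sync, records written since the last sync would be lost on a crash, while a sync on every write avoids that at the cost of one durable write per request.

```python
class WritePath:
    """Toy model of the write path on the slide (not Cassandra's actual code)."""

    def __init__(self, flush_threshold: int, sync_every_write: bool) -> None:
        self.flush_threshold = flush_threshold
        self.sync_every_write = sync_every_write
        self.unsynced = []    # commit-log records still only in memory
        self.commit_log = []  # commit-log records that are durable on disk
        self.memtable = {}    # in-memory table
        self.sstables = []    # on-disk tables produced by flushes

    def sync(self) -> None:
        """Make pending commit-log records durable (the background write)."""
        self.commit_log.extend(self.unsynced)
        self.unsynced.clear()

    def write(self, key, value) -> None:
        self.unsynced.append((key, value))
        if self.sync_every_write:
            self.sync()  # the "ideal" SYNC write
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Write the memtable out as an SSTable; its log records become redundant."""
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable.clear()
        self.unsynced.clear()
        self.commit_log.clear()

    def lost_on_crash(self) -> list:
        """Records that exist only in memory right now."""
        return list(self.unsynced)
```

With `sync_every_write=False`, a call to `write("a", 1)` leaves `("a", 1)` in `lost_on_crash()` until `sync()` runs; with `sync_every_write=True` that list stays empty after every write.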


slide-40
SLIDE 40
open(file, O_DUR);

  • For files that need fast durability but have no location constraints
  • Content of files will be put in Manylogs regardless of the write size
  • Never checkpoint their content
  • Random writes are not a problem anymore!
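As a rough usage sketch, an application could open its log file with the flag shown above and fall back to ordinary synchronous writes where the flag is unavailable. Everything here beyond the name `O_DUR` is an assumption: Python's `os` module does not define `O_DUR`, so the lookup below is expected to return `None` on a stock system, and `open_durable`, `append_record`, and `demo` are hypothetical helpers. The fallback uses `O_DSYNC` plus `fsync`, which gives durability but not the seek-free placement that Manylogs aims for.

```python
import os
import tempfile

# O_DUR is the flag named on the slide. Python's os module does not define it,
# so this lookup is expected to yield None on a stock system (assumption).
O_DUR = getattr(os, "O_DUR", None)


def open_durable(path: str) -> int:
    """Open `path` for appending with the strongest durability flag available."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    if O_DUR is not None:
        flags |= O_DUR                      # durability-only mode, if provided
    else:
        flags |= getattr(os, "O_DSYNC", 0)  # stand-in: synchronous data writes
    return os.open(path, flags, 0o600)


def append_record(fd: int, payload: bytes) -> None:
    """Append one record and wait until it is durable."""
    os.write(fd, payload)
    os.fsync(fd)


def demo() -> bytes:
    """Write two records to a temporary log file and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "commit.log")
        fd = open_durable(path)
        try:
            append_record(fd, b"record-1\n")
            append_record(fd, b"record-2\n")
        finally:
            os.close(fd)
        with open(path, "rb") as f:
            return f.read()
```

The application code stays the same in both cases; only the flag passed at open time changes.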

slide-41
SLIDE 41

  • User 1: HDFS; metric: ratio of max read bandwidth
  • User 2: MongoDB; metric: latency (ms)
      • 1 instance
      • 2 instances
      • 4 instances
slide-42
SLIDE 42

[Figures: ratio of max bandwidth and MongoDB write latency (ms) vs. number of MongoDB instances (1, 2, 4; default flush period) for Data and Manylogs]

Most bandwidth and lowest latency with Manylogs

slide-43
SLIDE 43

Manylogs & SMR

One non-shingled surface = log space

slide-44
SLIDE 44

Manylogs & SMR

[Figure: shingled bands and non-shingled tracks]

slide-45
SLIDE 45

  • User 1: Build Server (WBS) trace; metric: latency (ms)
  • User 2: Back-end Server (LM-TBE) trace; metric: latency (ms)

slide-46
SLIDE 46

SMR

[Figure: latency (ms) by percentile (20 to 100) for Manylogs SMR (MLSMR) and single-log SMR (SLSMR)]

slide-47
SLIDE 47

Manylogs & RAID

  • User 1: big read
  • User 2: small durable writes
  • Disk #1, Disk #2, Disk #3, Disk #4

Max bandwidth! Bandwidth drops up to 50%

Mingzhe Hao, Gokul Soundararajan, Deepak Kenchammana-Hosekote, Andrew A. Chien, and Haryadi S. Gunawi. "The Tail at Store: A Revelation from Millions of Hours of Disk and SSD Deployments." FAST ’16.

slide-48
SLIDE 48

  • User 1: 128 MB sequential reads; metric: ratio of max read bandwidth
  • User 2: 4 KB random writes; metric: latency (ms)
      • At different intensities: 40, 80, 160, and 320 writes/s
slide-49
SLIDE 49

[Figures: ratio of max bandwidth and average sync latency (ms) vs. random writes per second (IOPS) per disk (40, 80, 160, 320) for Ordered, Data, and Manylogs]

10x bandwidth speed-up, 14x latency speed-up at 320 IOPS!

slide-50
SLIDE 50

More in the paper

  • Block-level Manylogs
  • Other workloads
      • Sequential writes
      • “varmail”
      • More traces
  • Log size
  • Logged write size
  • Mapping table

slide-51
SLIDE 51

Manylogs

  • Reserved log spaces uniformly across the disk
  • Redirect small writes to the nearest log
  • Can help with NoSQL, SMR, RAID, and more!
  • Provides up to 5x speed-up on average

slide-52
SLIDE 52

Manylogs

Speed-ups:

  • vs. Ordered: 3.7x bandwidth, 5.7x latency
  • vs. Adaptive: 2.7x bandwidth, 2.0x latency
  • vs. Single-log SMR: 1.3x

slide-53
SLIDE 53

Thank you! Questions?

http://ucare.cs.uchicago.edu