SLIDE 1

UPGRAID: Usage-based striPe replicatinG RAID

Joseph Naps, Ellen Wagner
August 10, 2007

SLIDE 2

Project Overview

[Diagram with two labeled regions: UPGRAID Partition, RAID Partition]

SLIDE 16

What We Learned

  • Kernel Compilation
  • Virtual Machines
  • Modules
  • RAID
  • Reading poorly documented code
  • Properly documenting code
  • Working with low-level C code
  • Block I/Os in Linux

SLIDE 21

Approach

  • Read Replication
  • Write Replication
  • Read Indirection
  • Write Indirection

SLIDE 25

Approach - Read Replication

1. UPGRAID determines if the stripe is eligible for replication.
2. If the stripe is eligible, a read request to the entire stripe is generated.
3. Once that read request completes, a write is generated and put into a queue to await being sent to an UPGRAID partition.
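
The read replication steps can be sketched in user-space Python. This is only an illustration, not the kernel module's code: the RAID5 partition is modeled as a bytearray, the eligibility test is a caller-supplied function (the slides do not say how eligibility is decided), and the names replicate_on_read and write_queue are hypothetical.

```python
from collections import deque

STRIPE_SIZE = 64 * 1024  # 64 KB stripe, the size used elsewhere in the slides


def replicate_on_read(raid5, stripe_no, is_eligible, write_queue):
    """Queue a full-stripe copy for the UPGRAID partition if eligible."""
    # Step 1: decide whether this stripe should be replicated.
    if not is_eligible(stripe_no):
        return False
    # Step 2: read the entire stripe, not just the blocks that were requested.
    start = stripe_no * STRIPE_SIZE
    stripe = bytes(raid5[start:start + STRIPE_SIZE])
    # Step 3: queue a write that will later be sent to an UPGRAID partition.
    write_queue.append((stripe_no, stripe))
    return True


# Toy RAID5 partition holding four stripes; stripe 2 is treated as eligible.
raid5 = bytearray(range(256)) * (4 * STRIPE_SIZE // 256)
write_queue = deque()
hot_stripes = {2}  # assumed eligibility rule, for illustration only
replicate_on_read(raid5, 2, lambda s: s in hot_stripes, write_queue)
```

Queuing the write instead of issuing it immediately keeps the original read off the replication path, which appears to be the point of step 3.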

SLIDE 33

Approach - Write Replication

1. UPGRAID determines if the stripe is eligible for replication.
2. If the stripe is eligible, a read request to the entire stripe is generated.
3. At this point there are sixteen pages (in the case of a sixty-four KB stripe) with the data from the original stripe.
4. The data from the original write must now be overlaid on top of the data read from the stripe to preserve the modifications from the write.
5. The modified write is sent to a queue to await submission to the proper UPGRAID partition.
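
The overlay in the write replication steps can be illustrated with a small Python sketch. It assumes 4 KB pages (which is what gives sixteen pages for a sixty-four KB stripe) and models the stripe as a byte string; build_replica_write is a hypothetical name, not a function from the UPGRAID code.

```python
PAGE_SIZE = 4 * 1024          # assumed page size: 64 KB / 16 pages
STRIPE_SIZE = 64 * 1024
PAGES_PER_STRIPE = STRIPE_SIZE // PAGE_SIZE  # 16


def build_replica_write(stripe_data, write_offset, write_data):
    """Overlay write_data on the stripe contents and return the pages."""
    if len(stripe_data) != STRIPE_SIZE:
        raise ValueError("expected exactly one full stripe")
    if write_offset < 0 or write_offset + len(write_data) > STRIPE_SIZE:
        raise ValueError("write must fall inside the stripe")
    # Start from the data read from the original stripe ...
    merged = bytearray(stripe_data)
    # ... and lay the new write on top so its modifications are preserved.
    merged[write_offset:write_offset + len(write_data)] = write_data
    # Split the merged stripe back into page-sized pieces for the queued write.
    return [bytes(merged[i:i + PAGE_SIZE])
            for i in range(0, STRIPE_SIZE, PAGE_SIZE)]


old_stripe = bytes(STRIPE_SIZE)                      # stripe as read from RAID5
pages = build_replica_write(old_stripe, 5000, b"\xff" * 512)
```

Without the overlay, the replica would hold the stripe as it was before the write and would go stale as soon as it was created.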

SLIDE 40

Approach - Read Indirection

1. UPGRAID determines whether the request should be sent to the RAID5 partition or the UPGRAID partition by looking at the head position of each drive. The drive whose head has the smallest distance to move is chosen to fulfill the request.
2. The request is then sent to the appropriate disk, and the application proceeds upon completion of that read request.
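
A minimal sketch of the head-distance choice follows. It assumes head positions and targets are expressed in the same linear unit (for example sector numbers) and that ties go to the RAID5 partition; neither detail comes from the slides, and choose_partition is a hypothetical name.

```python
def choose_partition(target_raid5, target_upgraid, head_raid5, head_upgraid):
    """Return the partition whose drive head has the shorter distance to move."""
    distance_raid5 = abs(target_raid5 - head_raid5)
    distance_upgraid = abs(target_upgraid - head_upgraid)
    # Assumed tie-break: prefer the original RAID5 copy.
    return "UPGRAID" if distance_upgraid < distance_raid5 else "RAID5"


# The RAID5 head is 100 units away, the UPGRAID head 4900 units away.
near_original = choose_partition(1000, 5000, head_raid5=900, head_upgraid=100)
# The RAID5 head is 8000 units away, the UPGRAID head only 100 units away.
near_replica = choose_partition(1000, 5000, head_raid5=9000, head_upgraid=5100)
```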

SLIDE 43

Approach - Write Indirection

1. The write request to the RAID5 partition is cloned.
2. This cloned request gets sent to the appropriate location on the UPGRAID partition at the same offset into the stripe as the original write, thereby preserving the mirroring property between the two stripes.
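
The offset arithmetic behind the cloned write can be sketched as below. The WriteRequest type, the clone_for_replica helper, and the idea of addressing the replica by a stripe number on the UPGRAID partition are all assumptions made for the example; the real module works on kernel block I/O requests.

```python
from dataclasses import dataclass, replace

STRIPE_SIZE = 64 * 1024


@dataclass(frozen=True)
class WriteRequest:
    partition: str   # "RAID5" or "UPGRAID"
    offset: int      # byte offset within the partition
    data: bytes


def clone_for_replica(request, replica_stripe_no):
    """Clone a RAID5 write so it lands at the same offset inside the replica stripe."""
    offset_in_stripe = request.offset % STRIPE_SIZE
    return replace(
        request,
        partition="UPGRAID",
        offset=replica_stripe_no * STRIPE_SIZE + offset_in_stripe,
    )


# A 512-byte write 1536 bytes into RAID5 stripe 7, whose replica is stripe 3.
original = WriteRequest("RAID5", 7 * STRIPE_SIZE + 1536, b"x" * 512)
clone = clone_for_replica(original, replica_stripe_no=3)
```

Keeping the in-stripe offset identical is what lets the replica stay a byte-for-byte mirror of the original stripe.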

SLIDE 46

Testing Tools

  • Integrity Checker
  • Workload Profiler

SLIDE 51

Integrity Checker

Automated user-level application to test reads and writes to specific blocks.

Uses:

  • Check whether data was written to the correct block.
  • Make sure that modules are performing correctly.

SLIDE 59

Integrity Checker

Proceeds through three testing phases:

  • Write Phase: generates a write workload across the entire disk space.
  • Read Phase: generates a random read workload across the disk space.
  • Read and Compare Phase: reads back the original write workload and compares the data to ensure there was no data corruption.
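
The three phases can be mimicked against an in-memory "disk" as a rough sketch. This is not the authors' tool: the device is a bytearray, blocks are 512 bytes, and the per-block pattern derived from a seed is an assumption chosen so the compare phase can regenerate the expected data without storing it.

```python
import random

BLOCK_SIZE = 512


def pattern(block_no, seed):
    """Deterministic contents for one block, reproducible from (seed, block_no)."""
    rng = random.Random(seed * 1_000_003 + block_no)
    return bytes(rng.getrandbits(8) for _ in range(BLOCK_SIZE))


def write_phase(device, num_blocks, seed):
    """Write a known pattern to every block of the device."""
    for b in range(num_blocks):
        device[b * BLOCK_SIZE:(b + 1) * BLOCK_SIZE] = pattern(b, seed)


def read_phase(device, num_blocks, seed, reads):
    """Issue random reads; the data is discarded, only the traffic matters."""
    rng = random.Random(seed)
    for _ in range(reads):
        b = rng.randrange(num_blocks)
        bytes(device[b * BLOCK_SIZE:(b + 1) * BLOCK_SIZE])


def compare_phase(device, num_blocks, seed):
    """Return the block numbers whose contents no longer match the pattern."""
    return [b for b in range(num_blocks)
            if bytes(device[b * BLOCK_SIZE:(b + 1) * BLOCK_SIZE]) != pattern(b, seed)]


device = bytearray(64 * BLOCK_SIZE)
write_phase(device, 64, seed=7)
read_phase(device, 64, seed=7, reads=200)
corrupted = compare_phase(device, 64, seed=7)   # empty list when nothing is corrupted
```

On a real UPGRAID device the read phase would presumably be what triggers replication and indirection, so the final compare checks that data served after those code paths ran is still intact.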

SLIDE 74

Workload Profiler

Generates a workload according to user specifications to test ABLE modules.

Input Variables

  • percent sequential
  • fraction writes
  • I/O request rate
  • average I/O size
  • maximum I/O size
  • duration of experiment
  • seed for the random number generator

Output Variables

  • actual duration of experiment
  • average I/O time
  • standard deviation
  • throughput
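
As a rough illustration of how the input and output variables might fit together, here is a small generator and summary function. Everything beyond the variable names is an assumption: the exponential size distribution, the 512-byte minimum request, the evenly spaced issue times, and the names WorkloadSpec, generate, and summarize are all hypothetical.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class WorkloadSpec:
    percent_sequential: float   # 0..100
    fraction_writes: float      # 0..1
    request_rate: float         # requests per second
    avg_io_size: int            # bytes
    max_io_size: int            # bytes
    duration: float             # seconds
    seed: int


def generate(spec, device_size):
    """Return a list of (issue_time, op, offset, size) tuples."""
    rng = random.Random(spec.seed)
    requests = []
    next_offset = 0
    for i in range(int(spec.request_rate * spec.duration)):
        # Assumed size distribution: exponential around the average, clamped.
        size = int(rng.expovariate(1 / spec.avg_io_size))
        size = min(spec.max_io_size, max(512, size))
        if rng.random() * 100 < spec.percent_sequential:
            offset = next_offset                      # continue the previous request
        else:
            offset = rng.randrange(0, device_size - size)
        if offset + size > device_size:               # wrap at the end of the device
            offset = 0
        op = "write" if rng.random() < spec.fraction_writes else "read"
        requests.append((i / spec.request_rate, op, offset, size))
        next_offset = offset + size
    return requests


def summarize(io_times, total_bytes, elapsed):
    """Compute the four output variables from measured per-request times."""
    return {
        "actual_duration": elapsed,
        "average_io_time": statistics.mean(io_times),
        "standard_deviation": statistics.pstdev(io_times),
        "throughput": total_bytes / elapsed,          # bytes per second
    }


spec = WorkloadSpec(50, 0.25, 100, 4096, 65536, 2, seed=42)
requests = generate(spec, device_size=1 << 20)
```

Seeding the generator makes a run repeatable, which is presumably why the seed is listed among the inputs. Note that clamping the sizes shifts the realized mean away from the requested average.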

SLIDE 78

Future Work

  • Read Heuristic
  • Testing and debugging of replication, indirection, and popularity code
  • Reconstruction

SLIDE 81

Future Work - Read Heuristic

  • A similar task is done in the RAID1 code.
  • We have looked into the code and think that it can be ported to UPGRAID with a few modifications.

SLIDE 84

Future Work - Testing and Debugging

  • Currently using autorwbench for the purpose of testing UPGRAID.
  • Once the system is more stable with autorwbench, UPGRAID can be deployed on a file system.

SLIDE 90

Future Work - Reconstruction

  • Not considered in detail yet
  • Two main approaches exist:
      • Disk-Oriented Reconstruction (DOR)
      • Popularity-based Reconstruction (PRO)
  • An entirely new approach could be developed for UPGRAID

SLIDE 94

Future Work - Reconstruction via DOR

  • DOR works by generating a thread for each disk that is responsible for fulfilling requests to that disk for the purpose of rebuilding the data of the failed disk.
  • There is also a master thread that is responsible for coordinating the actions of the disk threads.
  • It is possible that UPGRAID could work directly below the master thread and indirect rebuild requests for replicated blocks to the replicas stored on UPGRAID partitions.
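
A toy model of the thread-per-disk arrangement, with replicas short-circuiting part of the rebuild, might look like the following. It is a sketch under several assumptions: each disk is a Python list holding one block per stripe, parity sits on a dedicated disk for simplicity (RAID5 rotates it, but the XOR is the same), and the replicas argument maps a stripe number to the failed disk's block as recovered from an UPGRAID replica. None of these names come from the slides.

```python
import queue
import threading


def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def disk_thread(disk, requests, replies):
    """Serve block reads for one surviving disk until told to stop."""
    while True:
        stripe_no = requests.get()
        if stripe_no is None:          # sentinel: the master has no more work
            return
        replies.put((stripe_no, disk[stripe_no]))


def rebuild_failed_disk(surviving_disks, num_stripes, replicas=None):
    """Master: fan out reads to the disk threads and XOR the replies per stripe."""
    replicas = replicas or {}
    block_size = len(surviving_disks[0][0])
    # Stripes with a replica are taken from it directly and need no disk reads.
    pending = [s for s in range(num_stripes) if s not in replicas]

    replies = queue.Queue()
    request_queues = [queue.Queue() for _ in surviving_disks]
    threads = [threading.Thread(target=disk_thread, args=(disk, q, replies))
               for disk, q in zip(surviving_disks, request_queues)]
    for t in threads:
        t.start()
    for q in request_queues:
        for s in pending:
            q.put(s)
        q.put(None)

    rebuilt = dict(replicas)
    for s in pending:
        rebuilt[s] = bytes(block_size)
    for _ in range(len(pending) * len(surviving_disks)):
        s, block = replies.get()
        rebuilt[s] = xor_blocks(rebuilt[s], block)
    for t in threads:
        t.join()
    return [rebuilt[s] for s in range(num_stripes)]


# Three data disks plus parity, four stripes; disk 1 fails, stripe 2 has a replica.
data = [[bytes([d * 16 + s]) * 8 for s in range(4)] for d in range(3)]
parity = [xor_blocks(xor_blocks(data[0][s], data[1][s]), data[2][s]) for s in range(4)]
rebuilt = rebuild_failed_disk([data[0], data[2], parity], 4, replicas={2: data[1][2]})
```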

SLIDE 99

Future Work - Reconstruction via PRO

  • PRO works by dividing the failed disk into “hot zones” and then rebuilding the zones based on the current access rate to each zone.
  • UPGRAID could sit above this process and use replicated stripes to improve it.
  • This approach would likely be more complex, but its popularity-based operation seems like a good fit with UPGRAID.
  • It may be good if we defined these “hot zones” to align with the stripes of the RAID5 disk. This would make reconstruction using the replicated stripes easier.
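
The ordering idea can be sketched as follows, assuming one zone per RAID5 stripe as suggested above. The sketch uses a single static snapshot of access counts, whereas PRO is described as following the current access rate, and the "replica" versus "parity" labels and the function names are hypothetical.

```python
def rebuild_order(access_counts):
    """Return zone indices, hottest first; ties go to the lower zone number."""
    return sorted(range(len(access_counts)),
                  key=lambda zone: (-access_counts[zone], zone))


def plan_rebuild(access_counts, replicated_zones):
    """Pair each zone, in rebuild order, with the source used to rebuild it."""
    return [(zone, "replica" if zone in replicated_zones else "parity")
            for zone in rebuild_order(access_counts)]


# Four stripe-aligned zones; zones 1 and 3 already have UPGRAID replicas.
plan = plan_rebuild([5, 40, 0, 12], replicated_zones={1, 3})
```

If replication is usage-based, as the project name implies, the hottest zones are also the ones most likely to have replicas, so the zones PRO rebuilds first would tend to be the cheapest to rebuild.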

SLIDE 104

Extremely Preliminary Results

Tested with autorwbench using one-block (512-byte) I/O operations.

  • 10 MB write workload
  • 100 MB read workload
  • Run in a virtual machine

SLIDE 107

Extremely Preliminary Results

  • RAID5 average: 12.187 seconds
  • UPGRAID average: 13.1984 seconds
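
For scale, the gap between the two reported averages can be expressed as a relative slowdown. This is only arithmetic on the two numbers given; it says nothing about variance or statistical significance.

```python
raid5_avg = 12.187      # seconds, as reported
upgraid_avg = 13.1984   # seconds, as reported

# Relative slowdown of UPGRAID with respect to plain RAID5.
slowdown = (upgraid_avg - raid5_avg) / raid5_avg
print(f"UPGRAID was {slowdown:.1%} slower in this run")  # about 8.3%
```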

SLIDE 108

Extremely Preliminary Results

Due to the current instability of the system, this data should be taken with a grain of salt.

SLIDE 109

Questions or Comments?