

SLIDE 1

Introduction Hinting CLIC Performance Conclusion

CLIC CLient-Informed Caching for Storage Servers

Xin Liu Ashraf Aboulnaga Ken Salem Xuhui Li

David R. Cheriton School of Computer Science University of Waterloo

February 2009

SLIDE 9

Two-Tier Caching

[Figure: a DBMS with its own cache, a storage server with its own cache, and storage holding page p.]

  • 1. read(p)
  • 2. read(p)
  • 3. fetch p
  • 4. fetch p

Problems:

  • cache inclusion
  • poor temporal locality

One Solution:

  • hinting
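
The read path above can be sketched in a few lines. This is only an illustration, assuming both tiers run plain LRU (the slides do not say which policy each tier uses); `LRUCache` and `TwoTierStore` are hypothetical names, and the mapping of steps 1 to 4 onto the code is one plausible reading of the figure. It shows cache inclusion: after a single read, p occupies space in both caches.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache keyed by page id (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def get(self, page_id):
        if page_id not in self.pages:
            return None
        self.pages.move_to_end(page_id)  # mark as most recently used
        return self.pages[page_id]

    def put(self, page_id, data):
        self.pages[page_id] = data
        self.pages.move_to_end(page_id)
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used


class TwoTierStore:
    """DBMS cache in front of a storage server cache in front of storage."""

    def __init__(self, storage, dbms_capacity, server_capacity):
        self.storage = storage  # dict: page id -> data
        self.dbms_cache = LRUCache(dbms_capacity)
        self.server_cache = LRUCache(server_capacity)

    def read(self, page_id):
        data = self.dbms_cache.get(page_id)      # 1. read(p) at the DBMS cache
        if data is not None:
            return data
        data = self.server_cache.get(page_id)    # 2. read(p) at the server cache
        if data is None:
            data = self.storage[page_id]         # 3. fetch p from storage
            self.server_cache.put(page_id, data)
        self.dbms_cache.put(page_id, data)       # 4. fetch p into the DBMS cache
        return data


store = TwoTierStore({"p": b"page-p"}, dbms_capacity=2, server_capacity=2)
store.read("p")
in_both = "p" in store.dbms_cache.pages and "p" in store.server_cache.pages
```

The server cache also sees only the requests that miss in the DBMS cache, which is why the request stream reaching it has poor temporal locality.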

SLIDE 13

Example: Write Hints

[Figure: the DBMS sends write(p) to the storage server; p appears in the DBMS cache, the server cache, and storage.]

  • DBMS: "this is a replacement write"
  • Storage server: "this is a good candidate for caching"

The storage server can use TQ, an ad hoc hint-aware replacement policy, to exploit write hints.

SLIDE 16

Problems with Ad Hoc Hint-Aware Policies

  • narrowness: new hints? multiple hints?
  • brittleness: correct response to hints?
  • single source: multiple hint generators?

[Figure: two DBMS clients, each with its own cache, share one storage server. One sends write(p), the other write(q), and each says "this is a replacement write". The server asks: "should I cache p or q?"]

SLIDE 21

The CLIC Approach

  • a hint-aware caching policy for 2nd-tier caches
  • no hard-coded response to specific hints
  • instead, learn which hints signal good caching opportunities
  • benefits:
    • handles multiple hint types
    • handles new hint types
    • handles hints from multiple clients by treating each client’s hints as distinct

CLIC Hints

CLIC separates the generation of hints (done by the storage clients) from the interpretation of those hints for caching purposes (done by the storage server).

SLIDE 22

CLIC Illustrated

[Figure: the DBMS sends read(p) to the storage server.]

  • DBMS: "this is a blargh gorp read"
  • Storage server: "I don’t know blargh or gorp, but previous blargh gorp reads have been good candidates, so I will cache p"

SLIDE 24

Generating Hints

  • Storage clients must be modified to generate one or more types of hints.
  • Storage clients attach a hint set to each read or write request. A hint set includes one hint of each type generated by the client.
  • A storage client may choose to generate any types of hints that might be of use to the storage server.

Example: Hints from DB2

  • buffer pool ID
  • object ID: identifies a group of related DB objects
  • object type ID: distinguishes table from index
  • request type: read, replacement/recovery write
  • DB2 buffer priority
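
A hint set can be modeled as an opaque, hashable label that travels with each request. The sketch below is one possible representation, not the format DB2 or CLIC actually uses: `HintSet`, `Request`, `db2_hint_set`, the `client_id` field, and the string encodings are all hypothetical. The only properties it relies on from the slides are that a hint set holds one hint per type and that hints from different clients are kept distinct.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HintSet:
    """Opaque, hashable label attached to every I/O request."""
    client_id: str           # keeps each client's hints distinct
    hints: Tuple[str, ...]   # one value per hint type the client generates


@dataclass(frozen=True)
class Request:
    page_id: int
    is_read: bool
    hint_set: HintSet


def db2_hint_set(client_id, pool_id, object_id, object_type, request_type, priority):
    """Build a hint set with the five DB2 hint types listed above."""
    return HintSet(client_id, (str(pool_id), str(object_id), object_type,
                               request_type, str(priority)))


h1 = db2_hint_set("db2-a", 1, 17, "table", "read", 2)
h2 = db2_hint_set("db2-a", 1, 17, "table", "read", 2)
h3 = db2_hint_set("db2-b", 1, 17, "table", "read", 2)  # same hints, other client
req = Request(page_id=42, is_read=True, hint_set=h1)
```

Because the dataclass is frozen, equal hint sets hash alike, so a server could use them directly as dictionary keys without understanding what the individual hints mean.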

SLIDE 28

A CLIC-Managed Cache

[Figure: cached pages p1 to p9, each attached to one of the hint sets Ha to He, arranged along an axis from lowest priority to highest priority.]

  • each page is associated with the hint set with which it was most recently read or written
  • each hint set has a priority
  • CLIC evicts pages associated with the lowest-priority hint sets
  • CLIC chooses hint set priorities using a simple cost/benefit analysis
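
The eviction rule above can be illustrated with a small sketch. The function name `pick_victim`, the priority values, and the assignment of pages to hint sets are invented for the example; the slides do not say how ties within one hint set are broken, so this sketch simply takes the first page it finds.

```python
def pick_victim(page_to_hint_set, priority):
    """Return a cached page whose hint set has the lowest priority.

    page_to_hint_set: dict page id -> hint set of its most recent request
    priority:         dict hint set -> numeric priority (higher = keep longer)
    Unknown hint sets default to priority 0.0 (an assumption of this sketch).
    """
    return min(page_to_hint_set,
               key=lambda page: priority.get(page_to_hint_set[page], 0.0))


cache = {"p1": "He", "p5": "He", "p2": "Hd", "p9": "Hc", "p3": "Ha"}
priority = {"Ha": 0.9, "Hb": 0.7, "Hc": 0.5, "Hd": 0.3, "He": 0.1}

victim = pick_victim(cache, priority)
del cache[victim]
# A new request re-labels a page with the hint set of that request.
cache["p9"] = "Ha"
second_victim = pick_victim(cache, priority)
```

A linear scan keeps the example short; a practical cache would more likely keep pages grouped per hint set so the lowest-priority group can be found without scanning.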

SLIDE 31

Cost/Benefit Analysis

[Figure: timeline. A read or write request (p, H) arrives; the next request for p comes later and is labeled "is this a read request?". The interval between the two is labeled "cache p here??".]

  • there is a benefit to caching if the next request for p is a read request
  • the cost of obtaining this benefit is that p must remain cached until the read request

SLIDE 34

Assigning Priorities to Hint Sets

[Figure: timeline. A read or write request (p, H) arrives; the next request for p comes later and is labeled "is this a read request?". The interval between the two is labeled "cache p here??".]

  • when request (p, H) occurs, CLIC cannot know the cost and benefit of caching p
  • instead, CLIC estimates the cost and benefit of caching p at (p, H) based on previous requests with hint set H
  • CLIC assigns a priority to each hint set based on the cost and benefit of previous requests with hint set H

$$\mathrm{Priority}(H) = \frac{\text{Read Hit Rate}(H)}{\text{Mean Time Until Read Hit}(H)}$$
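
The priority ratio can be computed from simple running totals. The sketch below is an illustration under stated assumptions, not CLIC's implementation: time is measured as a request's position in the trace, a hint set with no read hits gets priority 0.0, and the names `HintStats` and `analyze` are hypothetical. A request with hint set H counts as a read hit if the next request for the same page is a read.

```python
from collections import defaultdict


class HintStats:
    """Running totals for one hint set H."""

    def __init__(self):
        self.requests = 0        # requests seen with hint set H
        self.read_hits = 0       # of those, how many were followed by a read of the page
        self.total_distance = 0  # summed time from each such request to that read

    def priority(self):
        if self.read_hits == 0:
            return 0.0
        read_hit_rate = self.read_hits / self.requests
        mean_time_until_read_hit = self.total_distance / self.read_hits
        return read_hit_rate / mean_time_until_read_hit


def analyze(trace):
    """trace: list of (page, hint_set, is_read); time = position in the trace."""
    stats = defaultdict(HintStats)
    last = {}  # page -> (time, hint set) of its most recent request
    for now, (page, hint_set, is_read) in enumerate(trace):
        if is_read and page in last:
            then, earlier_hint_set = last[page]
            stats[earlier_hint_set].read_hits += 1
            stats[earlier_hint_set].total_distance += now - then
        stats[hint_set].requests += 1
        last[page] = (now, hint_set)
    return stats


trace = [
    ("p", "A", False),  # t=0
    ("q", "B", False),  # t=1
    ("p", "C", True),   # t=2: read hit credited to A, distance 2
    ("r", "A", False),  # t=3
    ("q", "C", False),  # t=4: next request for q is a write, no benefit for B
]
stats = analyze(trace)
```

Here hint set A has a read hit rate of 1/2 and a mean time until read hit of 2, giving a priority of 0.25, while B and C have no read hits and get 0.0.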

SLIDE 35

DB2 Hint Analysis Example

[Figure: hint analysis for two DB2 hint sets, STOCK table replacement writes and ORDERLINE table reads.]

SLIDE 39

Efficient Hint Analysis

  • To analyze the cost and benefit of hint sets, CLIC must
    • track the most recent request and hint set for each page
    • track the mean read hit rate and read hit distance for each hint set
  • To reduce space requirements, CLIC
    • tracks the most recent request only for cached pages and a fixed number of additional, uncached pages
    • tracks read hit statistics only for frequently occurring hint sets
  • We have also investigated the use of generalization to reduce the number of distinct hint sets.
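
The first space-saving idea can be sketched as a tracker with a bounded side table. This is a rough illustration only: the class name `BoundedTracker`, the `extra` parameter, and the oldest-first forgetting rule are assumptions, since the slides say only that a fixed number of uncached pages are tracked.

```python
from collections import OrderedDict


class BoundedTracker:
    """Remembers the most recent request for cached pages, plus at most
    `extra` uncached pages (oldest entries are forgotten first)."""

    def __init__(self, extra):
        self.extra = extra
        self.cached = {}               # page -> (time, hint set)
        self.uncached = OrderedDict()  # page -> (time, hint set), oldest first

    def record(self, page, time, hint_set, is_cached):
        # Drop any stale entry so a page lives in exactly one table.
        self.cached.pop(page, None)
        self.uncached.pop(page, None)
        if is_cached:
            self.cached[page] = (time, hint_set)
        else:
            self.uncached[page] = (time, hint_set)
            if len(self.uncached) > self.extra:
                self.uncached.popitem(last=False)  # forget the oldest entry

    def lookup(self, page):
        return self.cached.get(page) or self.uncached.get(page)


tracker = BoundedTracker(extra=2)
tracker.record("p1", 0, "Ha", is_cached=True)
tracker.record("p2", 1, "Hb", is_cached=False)
tracker.record("p3", 2, "Hb", is_cached=False)
tracker.record("p4", 3, "Hc", is_cached=False)  # pushes p2 out
```

A fuller version would also move a page into the uncached table when the cache evicts it; that step is left out here to keep the sketch short.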

SLIDE 40

Performance

  • we have used trace-driven simulation of the storage server buffer cache to compare CLIC to other replacement policies
  • methodology
    • 1. modify DB2 and MySQL to generate hints and produce I/O traces
    • 2. run TPC-C (on-line transaction processing) and TPC-H (decision support) workloads on the database systems and collect I/O traces
    • 3. feed the traces to a simulation of a second-tier cache, which implements CLIC, LRU, ARC, TQ and OPT
    • 4. measure the hit ratio achieved by the different policies
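
Steps 3 and 4 amount to replaying a trace against a simulated cache and counting read hits. The sketch below does this for LRU only, on a made-up six-request trace. It assumes that writes are installed in the cache but do not count toward the ratio; the function name and trace format are hypothetical and not taken from the authors' simulator.

```python
from collections import OrderedDict


def lru_read_hit_ratio(trace, capacity):
    """Replay (page, is_read) requests against an LRU cache and return
    read hits / read requests. Writes are cached but not counted."""
    cache = OrderedDict()
    reads = hits = 0
    for page, is_read in trace:
        if is_read:
            reads += 1
            if page in cache:
                hits += 1
        cache[page] = True
        cache.move_to_end(page)        # most recently used goes last
        if len(cache) > capacity:
            cache.popitem(last=False)  # evict least recently used
    return hits / reads if reads else 0.0


trace = [("a", True), ("b", True), ("a", True),
         ("c", False), ("b", True), ("a", True)]
small = lru_read_hit_ratio(trace, capacity=2)
large = lru_read_hit_ratio(trace, capacity=3)
```

On this toy trace the read hit ratio is 1/5 with two cache slots and 3/5 with three, the same kind of size-versus-hit-ratio curve that the result slides plot for each policy.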

SLIDE 41

DB2 TPC-C - Medium DB2 Buffer Cache

[Figure: server cache read hit ratio (0% to 100%) versus server cache size (60k to 300k pages) for trace DB2_C300, comparing OPT, TQ, LRU, ARC and CLIC.]

SLIDE 42

DB2 TPC-H - Medium DB2 Buffer Cache

[Figure: server cache read hit ratio (0% to 100%) versus server cache size (60k to 300k pages) for trace DB2_H400, comparing OPT, TQ, LRU, ARC and CLIC.]

SLIDE 43

DB2 TPC-C - Small DB2 Buffer Cache

[Figure: server cache read hit ratio (0% to 100%) versus server cache size (60k to 300k pages) for trace DB2_C60, comparing OPT, TQ, LRU, ARC and CLIC.]

SLIDE 44

DB2 TPC-C - Large DB2 Buffer Cache

[Figure: server cache read hit ratio (0% to 100%) versus server cache size (60k to 300k pages) for trace DB2_C540, comparing OPT, TQ, LRU, ARC and CLIC.]

SLIDE 45

Summary and Conclusions

  • CLIC learns to identify I/O hints that signal good caching opportunities by tracking the request stream observed by the second-tier cache
  • Because CLIC’s responses to specific hints are not predefined, it naturally accommodates new hint types and hints from multiple storage clients.
  • for our traces:
    • CLIC’s performance usually dominates ARC’s and LRU’s, sometimes by a factor of 2 or more.
    • CLIC dominates the ad hoc, hint-aware TQ algorithm
  • CLIC’s space overhead can be kept low ( 1% of storage server cache size in our experiments)