Andrew Deason, Sine Nomine Associates. European AFS and Kerberos Conference 2012.



SLIDE 1

OpenAFS Out-of-Band TCP

Andrew Deason Sine Nomine Associates European AFS and Kerberos Conference 2012

SLIDE 2

Agenda

  • Why is AFS so slow?
  • Project Background
  • OOB Design
  • Current Status (numbers!)
  • Future Directions
SLIDE 3

Why is AFS so slow?

  • Define “performance” / “slow”
  • AFS-specific factors (cache, CBs, etc)
  • Inherent UDP restrictions

– Firewalls, checksum offloading, etc

SLIDE 4

Why is AFS so slow?

  • Rx implementation and protocol

– See Simon’s talk(s)

  • Rx window size

– (32 * 1400) / RTT
– 1ms RTT: ~43 MiB/s
– 10ms RTT: ~4 MiB/s
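The window math above can be checked directly. A small sketch, using only the figures from this slide (32-packet window, 1400-byte packets, at most one full window in flight per round trip):

```python
def rx_throughput_mib_s(rtt_s, window_packets=32, packet_bytes=1400):
    """Upper bound on one Rx call's throughput: one full window per RTT."""
    return (window_packets * packet_bytes) / rtt_s / (1 << 20)

print(round(rx_throughput_mib_s(0.001)))  # 1 ms RTT  -> 43 (MiB/s)
print(round(rx_throughput_mib_s(0.010)))  # 10 ms RTT -> 4 (MiB/s)
```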

SLIDE 5

Project Background

  • AFS too slow for customer

– Need fix quickly

  • Declined approaches:

– RxOSD vicep-access
– RxOSD non-vicep-access
– RxTCP
– RxUDP improvements

SLIDE 6

Project Background

  • Compromise on TCP OOB

– Rx handles args, aborts, auth, etc
– No long-lived TCP conns
– Tie TCP conn to Rx call

  • Rapid development
  • First pass not public

SLIDE 7

Project Background

  • Started in August/September 2011
  • 1.4 client/server delivered in October
  • 1.6 client in February
  • Production deployment in March/April
  • 1.6 server in May

SLIDE 8

OOB Design

  • Designed for rapid dev
  • FTP-like control/data channels
  • Very similar to existing FetchData64
  • Not just for TCP

SLIDE 9

OOB Design (protocol)

Say a client wants to fetch a file…
(diagram: Client and Server, before any messages)

SLIDE 10

OOB Design (protocol)

Client starts split FetchDataTCP call
(diagram: Client → Server, FetchDataTCP with IN arguments)

SLIDE 11

OOB Design (protocol)

Server sends TCP information
(diagram: Server → Client, union AFSOOB_Challenge on the FetchDataTCP call)

SLIDE 12

OOB Design (protocol)

Client creates TCP connection
(diagram: Client → Server, new TCP connection alongside the FetchDataTCP call)

SLIDE 13

OOB Design (protocol)

Client sends conn metadata (identifies the Rx call)
(diagram: Client → Server, union AFSTCP_Response over the TCP connection)

Note: data over TCP still in XDR
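Since the data channel still speaks XDR, payloads stay in the usual XDR framing. A hand-rolled sketch of XDR variable-length opaque encoding (the union discriminants and field names of the real AFSTCP types are omitted here):

```python
import struct

def xdr_opaque(data: bytes) -> bytes:
    """XDR variable-length opaque: 4-byte big-endian length, then the
    payload, zero-padded so the payload occupies a multiple of 4 bytes."""
    pad = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad

frame = xdr_opaque(b"hello")  # 4-byte length + 5 payload bytes + 3 pad = 12 bytes
```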

SLIDE 14

OOB Design (protocol)

Server associates connection
(diagram: FetchDataTCP call now tied to the TCP connection)

SLIDE 15

OOB Design (protocol)

Server sends file data
(diagram: Server → Client, union AFSTCP_FileData carrying raw file data over TCP)

Note: Rx call is idle

SLIDE 16

OOB Design (protocol)

Server ends FetchDataTCP call
(diagram: Server → Client, FetchDataTCP OUT arguments)

SLIDE 17

OOB Design (protocol)

Transfer is complete
– TCP conn reused for 10 mins
(diagram: idle TCP connection kept open between Client and Server)
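The whole exchange can be sketched in miniature on a loopback socket: a "control" step hands the client a port (standing in for the AFSOOB_Challenge information), the client ties its data connection to the call with a call ID (standing in for AFSTCP_Response), and the server streams a length-prefixed payload (standing in for AFSTCP_FileData). The names `oob_fetch`/`serve_one` and the simplified framing are illustrative, not the real wire format:

```python
import socket
import struct
import threading

def recv_exact(sock, n):
    """Read exactly n bytes or raise."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf

def serve_one(listener, payload):
    # Data-channel server: wait for the client's metadata, then stream.
    conn, _ = listener.accept()
    with conn:
        (call_id,) = struct.unpack(">I", recv_exact(conn, 4))  # ties conn to call
        conn.sendall(struct.pack(">I", len(payload)) + payload)

def oob_fetch(call_id, payload):
    # "Control channel": learn where to connect (the AFSOOB_Challenge step).
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    t = threading.Thread(target=serve_one, args=(listener, payload))
    t.start()
    # "Data channel": connect, identify the call, receive the file data.
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(struct.pack(">I", call_id))
        (n,) = struct.unpack(">I", recv_exact(c, 4))
        data = recv_exact(c, n)
    t.join()
    listener.close()
    return data
```

In the real protocol the Rx call, not the client, owns the file; the thread here only stands in for the server side of one data connection.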

SLIDE 18

OOB Design (client)

  • Cache-bypass-like threshold

– sysctl afs.oob_tcp_thresh

  • RXGEN_OPCODE server detection
  • Parameters tweakable via sysctl
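The client-side decision can be sketched as follows. `use_oob` and the default value are illustrative assumptions; the real knob is the afs.oob_tcp_thresh sysctl, and servers answering FetchDataTCP with RXGEN_OPCODE are remembered as not supporting OOB:

```python
# Illustrative default; the real threshold is tunable via sysctl.
OOB_TCP_THRESH = 1 << 20  # bytes

def use_oob(fetch_len, server_supports_oob):
    """Like cache bypass: go out-of-band only for large enough fetches,
    and only against servers that did not reject FetchDataTCP."""
    return server_supports_oob and fetch_len >= OOB_TCP_THRESH
```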

SLIDE 19

OOB Design (server)

  • libevent
  • Async connection receipt
  • TCP conns handed to Rx thread
  • TCP conn always after Rx call

SLIDE 20

OOB Design (limitations)

  • Extra round-trip
  • Not easily extensible
  • Client code organization

SLIDE 21

Implementation Details

  • rx_FlushWriteNotLast
  • rxkad_Encrypt
  • rx_SetErrorProc
  • Linux configurable readahead
  • osi_BlockSignals

SLIDE 22

Current Status

  • 1.4, 1.6, client, server
  • Linux now, but portable
  • Cache bypass
  • Zero-copy fetch
  • Non-standard protocol published
  • Source available on request

SLIDE 23

Performance

SLIDE 24

Performance

  • 10GigE

– Write: 300s MiB/s (~3gbps) memcache
– Read: 500s MiB/s (~4-5gbps) memcache
– Read: 700s MiB/s (~6gbps) bypass

  • afscp even higher with high chunksize
  • Same results when capped at 7gbps
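The MiB/s and gbps figures above line up; a quick unit-conversion check (MiB/s, using 2**20 bytes, to decimal gigabits per second):

```python
def mib_s_to_gbps(mib_s):
    """Convert MiB/s (2**20 bytes per second) to decimal gigabits per second."""
    return mib_s * (1 << 20) * 8 / 1e9

# Rates in the 300s of MiB/s span roughly 2.5-3.3 gbps,
# the 500s roughly 4.2-5.0 gbps, and 700 MiB/s is about 5.9 gbps.
```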

SLIDE 25

Performance

  • Major limiting factors:

– Chunksize (or bypass readahead size)
– Cache overhead
– Extra RTT

  • Benchmarks are “remote”

SLIDE 26

Future Directions

  • Standardize new OOB protocol

– Extra RTT, UUIDs, cap bit, ext-union

  • Platform support
  • Volserver OOB

SLIDE 27

Questions?

SLIDE 28

Thanks!


Andrew Deason Sine Nomine Associates adeason@sinenomine.net