CEPH WIRE PROTOCOL REVISITED: MESSENGER V2


Ricardo Dias | rdias@suse.com | FOSDEM'19 - Software Defined Storage devroom


SLIDE 1

CEPH WIRE PROTOCOL REVISITED: MESSENGER V2

Ricardo Dias | rdias@suse.com
FOSDEM'19 - Software Defined Storage devroom

SLIDE 2

OUTLINE

  • What is the Ceph messenger
  • Messenger API
  • Messenger V1 Limitations
  • Messenger V2 Protocol

SLIDE 8

WHAT IS THE CEPH MESSENGER?

  • It's a wire-protocol specification and also the corresponding software implementation
  • Invisible to end-users
      • Unless it's not working properly
  • The messenger knows nothing about the Ceph distributed algorithms or the specific daemons' protocols

SLIDE 9

WHERE CAN WE FIND IT?

SLIDE 13

CEPH MESSENGER (1/2)

  • Messenger is used as a "small" communication library by the other Ceph libraries/daemons
  • It can be used as both server and client
      • Ceph daemons (osd, mon, mgr, mds) act as both servers and clients
      • Ceph clients (rbd, rgw) act as clients

SLIDE 17

CEPH MESSENGER (2/2)

  • Abstracts the transport protocol of the physical connection used between machines
      • POSIX sockets
      • RDMA
      • DPDK
  • Reliable delivery of messages with "exactly-once" semantics
  • Automatic handling of temporary connection failures

SLIDE 18

CEPH MESSENGER API

    class Messenger {
      int start();
      int bind(const entity_addr_t& bind_addr);
      Connection *get_connection(const entity_inst_t& dest);

      // Dispatcher
      void add_dispatcher_head(Dispatcher *d);

      // Server address
      entity_addr_t get_myaddr();
      int get_mytype();

      // Policy
      void set_default_policy(Policy p);
      void set_policy(int type, Policy p);
    };

    class Connection {
      bool is_connected();
      int send_message(Message *m);
      void send_keepalive();
      void mark_down();
      entity_addr_t get_peer_addr() const;
      int get_peer_type() const;
    };

SLIDE 19

CEPH MESSENGER API

    class Messenger {
      Connection *get_connection(const entity_inst_t& dest);

      // Dispatcher
      void add_dispatcher_head(Dispatcher *d);
    };

    class Connection {
      int send_message(Message *m);
      void mark_down();
    };

SLIDE 20

CEPH MESSENGER API

    class Dispatcher {
      // Message handling
      bool ms_can_fast_dispatch(const Message *m) const;
      void ms_fast_dispatch(Message *m);
      bool ms_dispatch(Message *m);

      // Connection handling
      void ms_handle_connect(Connection *con);
      void ms_handle_fast_connect(Connection *con);
      void ms_handle_accept(Connection *con);
      void ms_handle_fast_accept(Connection *con);
      bool ms_handle_reset(Connection *con);
      void ms_handle_remote_reset(Connection *con);
      bool ms_handle_refused(Connection *con);

      // Authorization handling
      bool ms_get_authorizer(int peer_type, AuthAuthorizer **a);
      bool ms_handle_authentication(Connection *con);
    };

SLIDE 21

CEPH MESSENGER API

    class Dispatcher {
      // Message handling
      bool ms_dispatch(Message *m);

      // Connection handling
      void ms_handle_accept(Connection *con);

      // Authorization handling
      bool ms_get_authorizer(int peer_type, AuthAuthorizer **a);
      bool ms_handle_authentication(Connection *con);
    };
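To make the callback flow of the Dispatcher interface concrete, here is a minimal, self-contained sketch. `Message`, `Connection`, and the `Dispatcher` base class below are simplified stand-ins written for this example, not the real Ceph classes, and `TypeCounter` and `deliver` are hypothetical names. The sketch assumes that an incoming message is offered to each registered dispatcher in turn until one returns true from `ms_dispatch`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only; the real Ceph types are richer.
struct Message { int type; std::string body; };
struct Connection { int peer_type; };

// Mirrors part of the reduced Dispatcher interface shown on the slide.
class Dispatcher {
public:
  virtual ~Dispatcher() = default;
  virtual bool ms_dispatch(Message *m) = 0;       // true if the message was handled
  virtual void ms_handle_accept(Connection *) {}  // hook for accepted connections
};

// Hypothetical dispatcher that handles exactly one message type.
class TypeCounter : public Dispatcher {
public:
  explicit TypeCounter(int type) : type_(type) {}
  bool ms_dispatch(Message *m) override {
    if (m->type != type_) return false;  // not ours: let the next dispatcher try
    ++handled_;
    return true;
  }
  void ms_handle_accept(Connection *) override { ++accepted_; }
  int handled() const { return handled_; }
  int accepted() const { return accepted_; }
private:
  int type_;
  int handled_ = 0;
  int accepted_ = 0;
};

// Assumed delivery loop: offer the message to each dispatcher in order
// and stop at the first one that reports it as handled.
bool deliver(const std::vector<Dispatcher*>& dispatchers, Message *m) {
  for (Dispatcher *d : dispatchers)
    if (d->ms_dispatch(m)) return true;
  return false;  // no dispatcher handled the message
}
```

In this sketch a daemon would register one dispatcher per subsystem, and a message type that no dispatcher claims makes `deliver` return false.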

SLIDE 28

MESSENGER V1 WIRE PROTOCOL

  • The first wire protocol of Ceph
  • No extensibility at an early stage of the protocol
  • No data authenticity supported
  • No data encryption supported
  • Limited support for different authentication protocols
  • No strict structure for protocol internal messages

SLIDE 32

MESSENGER V2 WIRE PROTOCOL (1/2)

  • Available by default on the IANA-assigned port 3300 on Ceph Monitors
      • Messenger V1 will still be available through port 6789
  • Only Ceph Nautilus userspace libraries support V2
      • Ceph kernel modules still talk V1
  • Still in development, as Nautilus has not been released yet

SLIDE 37

MESSENGER V2 WIRE PROTOCOL (2/2)

  • Complete redesign and implementation
  • Extensible protocol
      • A different path can be taken at a very early stage of the protocol
  • No limitations on the authentication protocols used
  • Encryption-on-the-wire support

SLIDE 40

MESSENGER V2 SPECIFICATION

Actors:
  • Connector
  • Accepter

Phases:
  1. Banner Exchange
  2. Authentication
  3. Session Handshake
  4. Message Exchange

SLIDE 41

MESSAGE FRAME

    struct frame {
      uint32_t frame_len;  // 4 bytes
      uint32_t tag;        // 4 bytes
      char payload[frame_len - 4];
    };

    struct encrypted_frame {
      uint32_t frame_len;
      uint32_t tag;
      char encrypted_payload[frame_len - 4];
    };
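The frame layout above can be exercised with a small encoder and decoder. This is a sketch under stated assumptions: integers are written little-endian (the slide does not state the byte order), `frame_len` is taken to cover the tag plus the payload (as implied by `payload[frame_len - 4]`), and `Frame`, `encode_frame`, and `decode_frame` are hypothetical names, not Ceph functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// In-memory view of one frame; frame_len is derived from the payload size.
struct Frame {
  uint32_t tag;
  std::string payload;
};

// Append a 32-bit value in little-endian order (assumed byte order).
void put_u32(std::vector<uint8_t>& out, uint32_t v) {
  for (int i = 0; i < 4; ++i) out.push_back(static_cast<uint8_t>(v >> (8 * i)));
}

// Read a 32-bit little-endian value starting at offset `off`.
uint32_t get_u32(const std::vector<uint8_t>& in, size_t off) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) v |= static_cast<uint32_t>(in[off + i]) << (8 * i);
  return v;
}

// Wire form: frame_len (4 bytes), tag (4 bytes), payload (frame_len - 4 bytes).
std::vector<uint8_t> encode_frame(const Frame& f) {
  std::vector<uint8_t> out;
  put_u32(out, static_cast<uint32_t>(f.payload.size() + 4));
  put_u32(out, f.tag);
  out.insert(out.end(), f.payload.begin(), f.payload.end());
  return out;
}

// Returns false when the buffer is truncated or frame_len cannot hold a tag.
bool decode_frame(const std::vector<uint8_t>& in, Frame& out) {
  if (in.size() < 8) return false;
  const uint32_t frame_len = get_u32(in, 0);
  if (frame_len < 4 || in.size() - 4 < frame_len) return false;
  out.tag = get_u32(in, 4);
  out.payload.assign(in.begin() + 8, in.begin() + 4 + frame_len);
  return true;
}
```

Validating `frame_len` before touching the payload is what lets a reader reject a short or corrupted buffer without reading out of bounds.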

SLIDE 42

1. BANNER EXCHANGE

Exchange between connector and accepter:
  • connection established
  • banner (sent in both directions)
  • hello (sent in both directions)

We can change the behavior of the protocol after the banners, based on the supported/required features.

    struct banner {
      char banner[8];  // "ceph v2\n"
      uint16_t payload_len;
      struct banner_payload payload;
    };

    struct banner_payload {
      uint64_t supported_features;
      uint64_t required_features;
    };

    struct hello {
      uint8_t entity_type;
      entity_addr_t peer_address;
    };
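One way to read the supported/required feature masks is as a mutual compatibility check: each side must support every feature the other side requires. The sketch below illustrates that reading and a possible banner serialization. The little-endian encoding, the fixed 16-byte payload, and the names `encode_banner`, `has_magic`, `missing_features`, and `compatible` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

struct BannerPayload {
  uint64_t supported_features;
  uint64_t required_features;
};

// Append a 64-bit value in little-endian order (assumed byte order).
void put_u64(std::vector<uint8_t>& out, uint64_t v) {
  for (int i = 0; i < 8; ++i) out.push_back(static_cast<uint8_t>(v >> (8 * i)));
}

// Wire form: 8-byte magic, 16-bit payload length, then the two feature masks.
std::vector<uint8_t> encode_banner(const BannerPayload& p) {
  static const char magic[] = "ceph v2\n";  // 8 characters plus the implicit NUL
  std::vector<uint8_t> out(magic, magic + 8);
  const uint16_t payload_len = 16;          // two uint64_t masks
  out.push_back(static_cast<uint8_t>(payload_len & 0xff));
  out.push_back(static_cast<uint8_t>(payload_len >> 8));
  put_u64(out, p.supported_features);
  put_u64(out, p.required_features);
  return out;
}

// True when the buffer starts with the V2 banner string.
bool has_magic(const std::vector<uint8_t>& wire) {
  return wire.size() >= 8 && std::memcmp(wire.data(), "ceph v2\n", 8) == 0;
}

// Feature bits that `mine` requires but `peer` does not advertise.
uint64_t missing_features(const BannerPayload& mine, const BannerPayload& peer) {
  return mine.required_features & ~peer.supported_features;
}

// Both sides must support everything the other one requires.
bool compatible(const BannerPayload& a, const BannerPayload& b) {
  return missing_features(a, b) == 0 && missing_features(b, a) == 0;
}
```

Because the masks travel in the very first bytes of the connection, a peer can decide to continue, fall back, or abort before any authentication traffic is sent.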

SLIDE 43

2. AUTHENTICATION

Exchange:
  • connector → accepter: auth_request
  • accepter → connector: auth_bad_method
  • connector → accepter: auth_request
  • accepter → connector: auth_reply_more
  • connector → accepter: auth_request_more
  • (several rounds of auth_reply_more / auth_request_more)
  • accepter → connector: auth_done

From this point message frames can be encrypted.

    struct auth_request {
      uint32_t method;
      uint32_t preferred_modes[num_modes];
      char auth_payload[payload_len];
    };

    struct auth_bad_method {
      uint32_t method;
      int result;
      uint32_t allowed_methods[num_methods];
      uint32_t allowed_modes[num_modes];
    };

    struct auth_reply_more {
      char auth_payload[payload_len];
    };

    struct auth_request_more {
      char auth_payload[payload_len];
    };

    struct auth_done {
      uint64_t global_id;
      uint32_t mode;
      char auth_payload[payload_len];
    };
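The auth_bad_method reply carries the methods the accepter is willing to use, so the connector can retry with a different one. The following sketch shows one plausible selection rule: take the first locally preferred method that the accepter also allows. The method ids, the reduced `AuthBadMethod` struct (it omits `result` and `allowed_modes`), and the functions `accepter_allows` and `choose_method` are illustrative assumptions, not part of the slides or of Ceph.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative method ids chosen for this example only.
const uint32_t METHOD_NONE = 1;
const uint32_t METHOD_SHARED_KEY = 2;
const uint32_t METHOD_KERBEROS = 4;

// Reduced form of auth_bad_method: only the fields this sketch needs.
struct AuthBadMethod {
  uint32_t method;                        // the method that was rejected
  std::vector<uint32_t> allowed_methods;  // what the accepter would accept
};

// Accepter side: true when the requested method is acceptable.
bool accepter_allows(const std::vector<uint32_t>& allowed, uint32_t method) {
  return std::find(allowed.begin(), allowed.end(), method) != allowed.end();
}

// Connector side: after auth_bad_method, pick the first locally preferred
// method that the accepter also allows. Returns 0 when there is no overlap.
uint32_t choose_method(const std::vector<uint32_t>& preferred,
                       const AuthBadMethod& reply) {
  for (uint32_t m : preferred)
    if (m != reply.method && accepter_allows(reply.allowed_methods, m)) return m;
  return 0;
}
```

A return value of 0 would mean the connector has to give up, since no second auth_request could succeed.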

SLIDE 44

3. SESSION HANDSHAKE (NEW SESSION)

Exchange:
  • connector → accepter: client_ident
  • accepter → connector: server_ident

    struct client_ident {
      entity_addrvec_t addrs;
      int64_t global_id;
      uint64_t global_seq;
      uint64_t supported_features;
      uint64_t required_features;
      uint64_t flags;
    };

    struct server_ident {
      entity_addrvec_t addrs;
      int64_t global_id;
      uint64_t global_seq;
      uint64_t supported_features;
      uint64_t required_features;
      uint64_t flags;
      uint64_t cookie;
    };

SLIDE 45

3. SESSION HANDSHAKE (RECONNECT)

Exchange:
  • connector → accepter: reconnect
  • accepter → connector: reconnect_ok

    struct reconnect {
      entity_addrvec_t addrs;
      uint64_t cookie;
      uint64_t global_seq;
      uint64_t connect_seq;
      uint64_t msg_seq;
    };

    struct reconnect_ok {
      uint64_t msg_seq;
    };
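The `msg_seq` fields are what allow a session to resume without losing or repeating messages. As a sketch, assume that `reconnect_ok.msg_seq` is the sequence number of the last message the accepter received; the connector can then drop everything up to that number and resend the rest. `SendQueue`, `Outgoing`, and their methods are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// One sent message that has not been acknowledged yet.
struct Outgoing {
  uint64_t seq;
  std::string body;
};

class SendQueue {
public:
  // Assign the next sequence number and keep the message until it is acked.
  uint64_t send(const std::string& body) {
    unacked_.push_back(Outgoing{++last_seq_, body});
    return last_seq_;
  }

  // Drop everything the peer has confirmed up to and including `seq`.
  void ack(uint64_t seq) {
    while (!unacked_.empty() && unacked_.front().seq <= seq) unacked_.pop_front();
  }

  // After reconnect_ok: trim what the peer already has and return the
  // sequence numbers that still need to be resent, in order.
  std::vector<uint64_t> resume(uint64_t peer_msg_seq) {
    ack(peer_msg_seq);
    std::vector<uint64_t> resend;
    for (const Outgoing& m : unacked_) resend.push_back(m.seq);
    return resend;
  }

private:
  uint64_t last_seq_ = 0;
  std::deque<Outgoing> unacked_;
};
```

For example, if four messages were sent and the accepter reports `msg_seq` 2 on reconnect, only messages 3 and 4 are resent.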

SLIDE 46

4. MESSAGE EXCHANGE

Exchange between connector and accepter:
  • session establishment
  • message (repeated)
  • message + ack(2) (a message that also carries an acknowledgment)

    struct message {
      __u8 tag;
      // includes last seen msg seq
      ceph_msg_header2 header;
      char payload[front_len + middle_len];
    };

    // TAGS
    CLOSE           6   // closing pipe
    MSG             7   // message
    ACK             8   // message ack
    KEEPALIVE2      14  // keepalive 2
    KEEPALIVE2_ACK  15  // keepalive 2 reply
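On the receiving side, "exactly-once" delivery can be approximated by remembering the last sequence number seen, dropping anything at or below it, and piggybacking that number as the ack on the next outgoing message. The sketch below uses the tag values listed on the slide; the `Receiver` class and its method names are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Tag values as listed on the slide.
enum class Tag : uint8_t {
  Close = 6,
  Msg = 7,
  Ack = 8,
  KeepAlive2 = 14,
  KeepAlive2Ack = 15
};

class Receiver {
public:
  // Returns true when a Msg frame carries a new message to dispatch.
  bool on_frame(Tag tag, uint64_t seq) {
    if (tag != Tag::Msg) return false;    // other tags carry no message
    if (seq <= last_seen_) return false;  // replayed duplicate: drop it
    last_seen_ = seq;
    return true;
  }

  // Sequence number to piggyback as the ack on the next outgoing message.
  uint64_t ack_seq() const { return last_seen_; }

private:
  uint64_t last_seen_ = 0;
};
```

After receiving messages 1 and 2, `ack_seq()` returns 2, which corresponds to the "message + ack(2)" step in the exchange.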

SLIDE 54

FRAME INTEGRITY, AUTHENTICITY, AND CONFIDENTIALITY

  • Integrity:
      • CRC in frame header (length + tag)
      • CRC in message payload (same as in V1)
  • Authenticity and Confidentiality:
      • Frame payload only
      • Authenticity with SHA256 HMAC
      • Confidentiality with AES encryption
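As an illustration of the header integrity check, the sketch below computes a checksum over the eight header bytes (length + tag). The slide does not name the CRC variant, the seed, or the byte order; this example assumes standard CRC-32C (Castagnoli) with the usual initial and final inversion and a little-endian header. `crc32c` here is a plain bitwise implementation and `header_crc` is a hypothetical helper, neither taken from the Ceph sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78,
// initial value and final XOR of 0xFFFFFFFF.
uint32_t crc32c(const uint8_t* data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int b = 0; b < 8; ++b)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical helper: checksum of the 8 header bytes (frame_len, then tag),
// each serialized little-endian (assumed byte order).
uint32_t header_crc(uint32_t frame_len, uint32_t tag) {
  uint8_t hdr[8];
  for (int i = 0; i < 4; ++i) {
    hdr[i] = static_cast<uint8_t>(frame_len >> (8 * i));
    hdr[4 + i] = static_cast<uint8_t>(tag >> (8 * i));
  }
  return crc32c(hdr, sizeof hdr);
}
```

A receiver would recompute the checksum from the received length and tag and reject the frame on a mismatch, before trusting `frame_len` to size a read. A CRC only guards against accidental corruption; protection against tampering comes from the HMAC.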

SLIDE 57

WHERE CAN I FIND THE CODE?

  • Source code location: src/msg/async/ProtocolV2.cc
  • Specification draft: http://docs.ceph.com/docs/master/dev/msg

SLIDE 61

FUTURE FEATURES

  • More authentication protocols: Kerberos, ...
  • Connection multiplexing
  • New ideas and contributions are welcome

SLIDE 62

Q&A
