IPsec encapsulation over TCP, Sabrina Dubroca (sd@queasysnail.net), Red Hat

SLIDE 1

IPsec encapsulation over TCP

Sabrina Dubroca sd@queasysnail.net

Red Hat

netdev 0x13, 2019-03-22

1 / 22

SLIDE 2

Introduction

SLIDE 3

ESP and IKE

◮ Components of the IPsec protocol suite
    ESP: Encapsulating Security Payload
    AH: Authentication Header
    IKE: Internet Key Exchange
◮ Packet formats

IP | ESP | Data

Figure: ESP packet format

IP | UDP | IKE header | IKE payload

Figure: IKE over UDP port 500 packet format

SLIDE 4

UDP encapsulation

◮ workaround for NAT: NAT needs ports to demultiplex inside/outside

UDP (bytes 0-7) | SPI (bytes 8-11) | SN (from byte 12) | Data

Figure: ESP over UDP packet format

UDP (bytes 0-7) | non-ESP marker (bytes 8-11) | IKE header (from byte 12) | IKE payload

Figure: IKE over UDP port 4500 packet format

◮ the receiving node uses the first 32 bits of the UDP payload to tell ESP from IKE: all zeros is the non-ESP marker (IKE), anything else is an ESP SPI
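The receive-side check can be sketched in a few lines of C. This is an illustration rather than code from the talk: the names `encap_kind` and `classify_udp_payload` are made up for the example, and it assumes the non-ESP marker is four zero bytes, as in UDP encapsulation of ESP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum encap_kind { ENCAP_TOO_SHORT, ENCAP_IKE, ENCAP_ESP };

/*
 * Classify the payload of a UDP datagram received on port 4500.
 * Assumption: an all-zero first 32-bit word is the non-ESP marker (IKE);
 * any other value is read as an ESP SPI.
 */
static enum encap_kind classify_udp_payload(const uint8_t *payload, size_t len)
{
    uint32_t first_word;

    assert(payload != NULL);
    if (len < sizeof(first_word))
        return ENCAP_TOO_SHORT;

    /* Zero is zero in any byte order, so no conversion is needed. */
    memcpy(&first_word, payload, sizeof(first_word));
    return first_word == 0 ? ENCAP_IKE : ENCAP_ESP;
}
```

The check relies on an ESP SPI never being zero, so the two cases cannot collide.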

SLIDE 5

More middlebox trouble

Some middleboxes don’t let anything other than TCP pass.

SLIDE 6

RFC 8229: TCP Encapsulation of IKE and ESP Packets

SLIDE 7

RFC 8229: packet formats

TCP stream, composed of concatenated messages

len (bytes 0-1) | SPI (bytes 2-5) | SN (from byte 6) | Data

Figure: ESP over TCP message format

len (bytes 0-1) | non-ESP marker (bytes 2-5) | IKE header (from byte 6) | IKE payload

Figure: IKE over TCP message format

SLIDE 8

RFC 8229: sender and receiver operations

◮ sender
  ◮ adds a prefix to each message
    ◮ for all messages: length (2 bytes)
    ◮ for IKE messages: non-ESP marker (like in UDP encapsulation)
  ◮ concatenates all messages within a TCP stream
◮ receiver
  ◮ parses TCP stream to extract messages
  ◮ differentiates ESP from IKE (SPI/non-ESP marker)
  ◮ handles each message
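The sender and receiver operations can be illustrated with a small userspace sketch. The function names `frame_msg` and `next_msg` are hypothetical. The sketch assumes the length value is big-endian and counts the 2-byte length field itself, which is how I read RFC 8229; check the RFC before relying on this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LEN_PREFIX 2 /* 16-bit length field */
#define MARKER_LEN 4 /* non-ESP marker: four zero bytes */

/*
 * Sender: write one framed message into out. For IKE, the non-ESP marker
 * is inserted between the length and the body. Returns the number of bytes
 * written, or 0 if the message does not fit.
 * Assumption: the length value includes the length field itself.
 */
static size_t frame_msg(uint8_t *out, size_t cap, const uint8_t *body,
                        size_t body_len, int is_ike)
{
    size_t total = LEN_PREFIX + (is_ike ? MARKER_LEN : 0) + body_len;
    size_t off = 0;

    if (total > 0xffff || total > cap)
        return 0;
    out[off++] = (uint8_t)(total >> 8);
    out[off++] = (uint8_t)(total & 0xff);
    if (is_ike) {
        memset(out + off, 0, MARKER_LEN);
        off += MARKER_LEN;
    }
    memcpy(out + off, body, body_len);
    return total;
}

/*
 * Receiver: look at the message at the head of the stream. Returns its
 * total length, 0 if more data is needed, or -1 if the length is too small
 * to hold an SPI or marker (a choice made for this sketch).
 */
static int next_msg(const uint8_t *stream, size_t avail, int *is_ike,
                    const uint8_t **body, size_t *body_len)
{
    static const uint8_t marker[MARKER_LEN]; /* all zeros */
    size_t total, hdr;

    if (avail < LEN_PREFIX)
        return 0;
    total = ((size_t)stream[0] << 8) | stream[1];
    if (total < LEN_PREFIX + MARKER_LEN)
        return -1;
    if (avail < total)
        return 0;

    *is_ike = memcmp(stream + LEN_PREFIX, marker, MARKER_LEN) == 0;
    hdr = LEN_PREFIX + (*is_ike ? MARKER_LEN : 0);
    *body = stream + hdr;
    *body_len = total - hdr;
    return (int)total;
}
```

For an ESP message the returned body starts at the SPI; for an IKE message it starts after the marker, at the IKE header.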

SLIDE 9

Linux implementation

SLIDE 10

XFRM

◮ infrastructure underneath the Linux kernel’s IPsec implementation
◮ applies transformations to packets

SLIDE 11

Upper Layer Protocol (ULP)

◮ infrastructure to implement in-kernel protocols that live on top of TCP
  ◮ for example, TLS
◮ init callback triggered by the TCP ULP setsockopt

    #define ULPNAME <...>
    setsockopt(sock, SOL_TCP, TCP_ULP, ULPNAME, sizeof(ULPNAME));

◮ init callback can change some of the socket’s operations
  ◮ replace operations with protocol-specific actions
  ◮ for example, sendmsg/recvmsg, close

SLIDE 12

Stream Parser (strp)

◮ framework to parse messages out of a TCP stream
◮ main callbacks
    parse_msg: returns the length of the next message in the stream
    rcv_msg: processes the next message
◮ strp’s receive function
  ◮ triggered by arrival of new data on the TCP socket
  ◮ calls those callbacks
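The division of labour between the two callbacks can be modelled in userspace. The sketch below is not the kernel strparser API: the types `parser_cbs` and `stream_parser`, the simplified callback signatures, and the fixed 4096-byte buffer are all assumptions made for illustration. It borrows only the convention that the parse callback returns a positive message length, 0 when more data is needed, or a negative value on error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified callbacks (the kernel ones take skbs). */
struct parser_cbs {
    /* > 0: length of the next message; 0: need more data; < 0: error */
    int (*parse_msg)(const uint8_t *data, size_t avail);
    void (*rcv_msg)(const uint8_t *msg, size_t len, void *ctx);
};

struct stream_parser {
    struct parser_cbs cbs;
    void *ctx;
    uint8_t buf[4096]; /* bytes received but not yet consumed */
    size_t used;
};

/* Called when new stream data arrives. Returns 0, or -1 on error. */
static int parser_feed(struct stream_parser *sp, const uint8_t *data, size_t len)
{
    size_t off = 0;

    if (len > sizeof(sp->buf) - sp->used)
        return -1; /* this sketch does not grow its buffer */
    memcpy(sp->buf + sp->used, data, len);
    sp->used += len;

    while (off < sp->used) {
        int msg_len = sp->cbs.parse_msg(sp->buf + off, sp->used - off);

        if (msg_len < 0)
            return -1;
        if (msg_len == 0 || (size_t)msg_len > sp->used - off)
            break; /* message incomplete: wait for more data */
        sp->cbs.rcv_msg(sp->buf + off, (size_t)msg_len, sp->ctx);
        off += (size_t)msg_len;
    }

    /* Keep the unconsumed tail for the next call. */
    memmove(sp->buf, sp->buf + off, sp->used - off);
    sp->used -= off;
    return 0;
}

/* Example parse callback: 2-byte big-endian length that counts itself. */
static int len_prefix_parse(const uint8_t *data, size_t avail)
{
    int len;

    if (avail < 2)
        return 0;
    len = (data[0] << 8) | data[1];
    return len < 2 ? -1 : len;
}

/* Example receive callback: count the messages delivered. */
static void count_msg(const uint8_t *msg, size_t len, void *ctx)
{
    (void)msg;
    (void)len;
    ++*(int *)ctx;
}
```

Feeding the parser arbitrary chunks of the stream delivers whole messages only, which is the property the length prefix exists to provide.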

SLIDE 13

RFC 8229 implementation

◮ uses ULP
  ◮ initializes a stream parser on the TCP socket
  ◮ redefines socket operations
◮ stream parser
  ◮ extracts messages, either IKE or ESP
  ◮ passes IKE messages to userspace and ESP messages to XFRM

SLIDE 14

RFC 8229 implementation: RX handling

◮ data arrives on TCP encap socket → strp_rcv() → parse_msg + rcv_msg
◮ parse_msg
  ◮ reads length field
◮ rcv_msg
  ◮ reads SPI/non-ESP marker
  ◮ continues processing as ESP or IKE
◮ ESP messages → XFRM → decrypt/verify
◮ IKE messages → userspace queue → recv() (via custom recvmsg op)

SLIDE 15

RFC 8229 implementation: TX handling

◮ IKE daemon → send() → custom sendmsg op → add length prefix → enqueue to TCP
◮ data packets → XFRM → add length prefix → enqueue to TCP

SLIDE 16

RFC 8229 implementation: TX handling: interleaving of messages

msg1: partially sent + msg2: fully sent + ...

Bytes on the stream: len | msg1, N bytes | len | msg2, M bytes | rest of msg1 | ...
Receiver's reading:  len | expected msg1 (runs past the N bytes actually sent) | next len

Figure: Interleaving problem between IKE and ESP

◮ avoid interleaving messages over TCP
  ◮ temporary slot in front of the TCP socket
  ◮ keep the partially sent message in the slot until the rest has been sent
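One way to picture the slot is a single pending-message buffer that must drain before anything else may be written. The sketch below is a userspace model under assumptions, not the kernel code: `tx_slot`, `encap_tx`, `send_framed`, and the mock socket that accepts a limited number of bytes are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical write hook: returns how many bytes the socket accepted. */
typedef size_t (*write_fn)(void *sock, const uint8_t *data, size_t len);

struct tx_slot {
    uint8_t data[256];
    size_t len;  /* size of the message parked in the slot */
    size_t sent; /* how much of it has reached the socket */
};

struct encap_tx {
    struct tx_slot partial;
    write_fn write;
    void *sock;
};

/* Push out whatever is in the slot. Returns 1 once the slot is empty. */
static int flush_slot(struct encap_tx *tx)
{
    while (tx->partial.sent < tx->partial.len) {
        size_t n = tx->write(tx->sock, tx->partial.data + tx->partial.sent,
                             tx->partial.len - tx->partial.sent);
        if (n == 0)
            return 0; /* socket full: the message stays parked */
        tx->partial.sent += n;
    }
    tx->partial.len = tx->partial.sent = 0;
    return 1;
}

/*
 * Send one already-framed message (IKE or ESP). A new message is accepted
 * only after the previous one has gone out in full, so two messages never
 * interleave on the stream. Returns 0 if accepted, -1 if the caller must
 * retry later or the message is too large for the slot.
 */
static int send_framed(struct encap_tx *tx, const uint8_t *msg, size_t len)
{
    if (!flush_slot(tx) || len > sizeof(tx->partial.data))
        return -1;
    memcpy(tx->partial.data, msg, len);
    tx->partial.len = len;
    tx->partial.sent = 0;
    flush_slot(tx); /* may leave a remainder in the slot */
    return 0;
}

/* Mock socket for experimenting: accepts at most `room` more bytes. */
struct mock_sock {
    uint8_t wire[512];
    size_t used;
    size_t room;
};

static size_t mock_write(void *sock, const uint8_t *data, size_t len)
{
    struct mock_sock *s = sock;
    size_t n = len < s->room ? len : s->room;

    if (n > sizeof(s->wire) - s->used)
        n = sizeof(s->wire) - s->used;
    memcpy(s->wire + s->used, data, n);
    s->used += n;
    s->room -= n;
    return n;
}
```

Refusing the second message while the slot is occupied is only one possible policy; a real implementation could instead queue or block.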

SLIDE 17

Userspace view

SLIDE 18

Userspace API: creating XFRM states

◮ almost identical to UDP encap: s/espinudp/espintcp/

    CPORT=<local port of the client socket>
    SPORT=4500
    CADDR=<IP address of the client>
    SADDR=<IP address of the server>

    ip xfrm state add src $CADDR dst $SADDR proto esp spi $SPI1 \
        aead 'rfc4106(gcm(aes))' $KEY $ICVLEN \
        mode transport sel src $CADDR dst $SADDR \
        encap espintcp $CPORT $SPORT 0.0.0.0

    ip xfrm state add src $SADDR dst $CADDR proto esp spi $SPI2 \
        aead 'rfc4106(gcm(aes))' $KEY $ICVLEN \
        mode transport sel src $SADDR dst $CADDR \
        encap espintcp $SPORT $CPORT 0.0.0.0

SLIDE 19

Userspace API: client program

    sock = socket(AF_INET, SOCK_STREAM, 0);

    struct xfrm_userpolicy_info policy = {
        .action = XFRM_POLICY_ALLOW,
        .sel.family = AF_INET,
    };

    /* install the ALLOW policy on the socket, once per direction */
    policy.dir = XFRM_POLICY_OUT;
    setsockopt(sock, IPPROTO_IP, IP_XFRM_POLICY, &policy, sizeof(policy));
    policy.dir = XFRM_POLICY_IN;
    setsockopt(sock, IPPROTO_IP, IP_XFRM_POLICY, &policy, sizeof(policy));

    connect(sock, ...);

    /* the stream prefix goes out before encapsulation is enabled */
    send(sock, "IKETCP", 6, 0);
    setsockopt(sock, SOL_TCP, TCP_ULP, "espintcp", sizeof("espintcp"));

SLIDE 20

Socket behavior

◮ TCP socket, but behaves like a UDP socket once encapsulation is enabled
◮ length prefix added/removed by kernel
◮ non-ESP marker is userspace’s responsibility (identical to ESPINUDP encap)
◮ "IKETCP" prefix
  ◮ sent by userspace client before enabling encapsulation
  ◮ read by userspace server (after accept()) before enabling encapsulation (otherwise, close connection)
◮ one send() = 1 IKE message
  ◮ MSG_MORE not implemented
◮ one recv() = 1 IKE message
  ◮ recv buffer smaller than actual message ⇒ partial read, rest of the message dropped
  ◮ MSG_PEEK returns the first N bytes of the message at the head of the receive queue
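The server-side prefix check described above could look roughly like the following. This is a sketch under assumptions: the helper name `check_stream_prefix` is made up, error handling is minimal (an interrupted recv() is treated as failure), and the caller is expected to close the connection on -1 and to enable the "espintcp" ULP only on success.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define STREAM_PREFIX "IKETCP"
#define STREAM_PREFIX_LEN 6

/*
 * Read exactly six bytes from an accepted connection and compare them with
 * the stream prefix. Returns 0 on a match, -1 otherwise.
 */
static int check_stream_prefix(int fd)
{
    char buf[STREAM_PREFIX_LEN];
    size_t got = 0;

    /* A stream socket may deliver the prefix in several pieces. */
    while (got < sizeof(buf)) {
        ssize_t n = recv(fd, buf + got, sizeof(buf) - got, 0);

        if (n <= 0)
            return -1; /* error, or peer closed before sending the prefix */
        got += (size_t)n;
    }
    return memcmp(buf, STREAM_PREFIX, STREAM_PREFIX_LEN) == 0 ? 0 : -1;
}
```

Because the check runs before encapsulation is enabled, the socket still behaves as a plain byte stream here, which is why the loop has to handle short reads.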

SLIDE 21

Conclusion

SLIDE 22

Remaining work

◮ Possible starvation issue between ESP and IKE
◮ Testing with an IKE daemon
◮ Upstreaming
