SLIDE 1

Building socket-aware BPF programs

Joe Stringer

Cilium.io

Linux Plumbers 2018, Vancouver, BC

Joe Stringer BPF Socket Lookup Nov 13, 2018 1 / 32

SLIDE 2

SLIDE 3

Background

Network Policy

“Endpoint A can talk to endpoint B” ⇒ “Endpoint B can reply to endpoint A”

SLIDE 4

Background

How have we built these before?

SLIDE 5

Background

Let’s do this with BPF

Attach BPF to packet hook ✓
“Connection Tracking” BPF map ✓
  Key by 5-tuple
  Associate counters, NAT state, etc.
  Handle tuple flipping
“Policy” map ✓
Deploy! ✓

SLIDE 6

Background

Let’s do this with BPF

Attach BPF to packet hook ✓
“Connection Tracking” BPF map ✓
  Key by 5-tuple
  Associate counters, NAT state, etc.
  Handle tuple flipping
“Policy” map ✓
Deploy! ✗

nf_conntrack: table full, dropping packet
Hmm, how big should this map be again?
How do we clean this up...
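
The connection-tracking approach above can be sketched in plain user-space C. This is a toy model under stated assumptions, not the code from the talk: the names `ct_tuple`, `ct_table`, `ct_flip`, `ct_lookup` and `ct_create` are hypothetical, and a fixed array stands in for a BPF map. It shows keying by 5-tuple, flipping the tuple so replies match, and what "table full" looks like when the size guess is too small.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-tuple key; names are illustrative, not from the talk. */
struct ct_tuple {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t proto;
};

struct ct_entry {
    struct ct_tuple key;
    uint64_t packets;   /* example of per-connection state */
    bool in_use;
};

#define CT_MAX 4        /* deliberately tiny so "table full" is easy to reach */

struct ct_table {
    struct ct_entry entries[CT_MAX];
};

/* Swap source and destination so that a reply matches the original entry. */
static struct ct_tuple ct_flip(struct ct_tuple t)
{
    struct ct_tuple r = t;

    r.saddr = t.daddr;
    r.daddr = t.saddr;
    r.sport = t.dport;
    r.dport = t.sport;
    return r;
}

static bool ct_equal(const struct ct_tuple *a, const struct ct_tuple *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

static struct ct_entry *ct_find(struct ct_table *tbl, struct ct_tuple t)
{
    for (size_t i = 0; i < CT_MAX; i++) {
        struct ct_entry *e = &tbl->entries[i];

        if (e->in_use && ct_equal(&e->key, &t))
            return e;
    }
    return NULL;
}

/* Match either direction of a known connection and count the packet. */
static struct ct_entry *ct_lookup(struct ct_table *tbl, struct ct_tuple t)
{
    struct ct_entry *e = ct_find(tbl, t);

    if (!e)
        e = ct_find(tbl, ct_flip(t));
    if (e)
        e->packets++;
    return e;
}

/* Track a new connection; returns NULL once every slot is taken. */
static struct ct_entry *ct_create(struct ct_table *tbl, struct ct_tuple t)
{
    for (size_t i = 0; i < CT_MAX; i++) {
        struct ct_entry *e = &tbl->entries[i];

        if (!e->in_use) {
            e->key = t;
            e->packets = 0;
            e->in_use = true;
            return e;
        }
    }
    return NULL;
}
```

Nothing in this sketch ever removes an entry, which is exactly the clean-up question the slide raises.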

SLIDE 7

Background

Why model it like this?

Firewalls might not be co-located with the workload
Firewalls should drop packets as quickly as possible
Network stacks may be delicate flowers
Solution? Build up state on-demand while processing packets

SLIDE 8

Background

Recent trends

SLIDE 9

Socket-based firewalling

If we’re co-located with the sockets...

...why build our own connection table?

SLIDE 10

Socket-based firewalling

Socket table as a connection tracker

SLIDE 11

Socket-based firewalling

Socket safety

Sockets are reference-counted internally

Some memory-management under RCU rules

BPF_PROG_TYPE_CGROUP_SOCK

  Access safety via reference held across BPF execution
  Bounds safety provided via bounds access checker

Packet hooks may execute before associated socket is known

Need to handle reference counting

SLIDE 12

Extending the BPF verifier

SLIDE 13

Extending the BPF verifier

SLIDE 14

Extending the BPF verifier

BPF verifier: Recap

At load time, loop over all instructions

  Validate pointer access
  Ensure no loops
  ...

Access memory out of bounds? ✗
Loops forever? ✗
Everything safe? ✓

SLIDE 15

Extending the BPF verifier

Socket reference counting

Implicit

    struct bpf_sock *sk;

    sk = bpf_sk_lookup(...);
    if (sk) {
        ...
    }
    /* Kernel will free 'sk' */

Explicit (mainline)

    struct bpf_sock *sk;

    sk = bpf_sk_lookup(...);
    if (sk) {
        ...
        bpf_sk_release(sk);
    }
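
To make the explicit style concrete, here is a minimal user-space model of what the lookup and release calls do to a reference count. It is a sketch only: `sock_model`, `sk_lookup_model`, `sk_release_model` and `port_has_socket` are hypothetical names, and a plain integer stands in for the kernel's socket refcount.

```c
#include <stddef.h>

/* Toy stand-in for a kernel socket; only the reference count is modelled. */
struct sock_model {
    int port;
    int refcnt;
};

/* Hypothetical lookup: returns the matching socket with an extra reference held. */
static struct sock_model *sk_lookup_model(struct sock_model *socks, size_t n, int port)
{
    for (size_t i = 0; i < n; i++) {
        if (socks[i].port == port) {
            socks[i].refcnt++;
            return &socks[i];
        }
    }
    return NULL;
}

/* Drop the reference taken by sk_lookup_model(). */
static void sk_release_model(struct sock_model *sk)
{
    sk->refcnt--;
}

/* Explicit style: the caller drops the reference on every path that got one. */
static int port_has_socket(struct sock_model *socks, size_t n, int port)
{
    struct sock_model *sk = sk_lookup_model(socks, n, port);
    int found = 0;

    if (sk) {
        found = 1;
        sk_release_model(sk);
    }
    return found;
}
```

After `port_has_socket()` returns, every count is back at its starting value. A caller that skips `sk_release_model()` leaves the count raised, which is the leak the verifier has to rule out at load time.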

SLIDE 16

Extending the BPF verifier

Reference counting in the BPF verifier

1. Resource acquisition
2. Execution paths while resource is held
3. Resource release

SLIDE 17

Extending the BPF verifier

Reference acquisition

Resource values are not known!

This is the verifier, not the runtime

Generate an identifier
Store the identifier in the verifier state
Associate the register with the identifier

SLIDE 18

Extending the BPF verifier

Reference misuse

Mangle and release
bpf_tail_call()
BPF_LD_ABS, BPF_LD_IND

SLIDE 19

Extending the BPF verifier

Reference release

Validation of pointers
Remove identifier reference from state
Remove register associations with the identifier
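
The acquire, misuse and release steps from the last three slides can be sketched as a toy state machine. This is not the kernel verifier's code: `vstate` and the `v_*` functions are hypothetical names, registers are reduced to a single tag each, and the caller is assumed to pass register numbers in the range 0..10. A separate execution path is modelled by copying the state.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_REGS 11     /* BPF registers R0..R10 */
#define MAX_REFS 8      /* arbitrary cap for this sketch */

/* Toy verifier state: outstanding reference ids plus a per-register tag. */
struct vstate {
    int next_id;
    int refs[MAX_REFS];
    int nrefs;
    int reg_ref[NUM_REGS];  /* 0 means "no tracked reference in this register" */
};

static void v_init(struct vstate *s)
{
    memset(s, 0, sizeof(*s));
}

/* Acquisition: generate an id, store it in the state, tag the result register. */
static int v_acquire(struct vstate *s, int dst_reg)
{
    if (s->nrefs == MAX_REFS)
        return -1;

    int id = ++s->next_id;

    s->refs[s->nrefs++] = id;
    s->reg_ref[dst_reg] = id;
    return id;
}

/* Register-to-register move: the copy refers to the same resource. */
static void v_mov(struct vstate *s, int dst_reg, int src_reg)
{
    s->reg_ref[dst_reg] = s->reg_ref[src_reg];
}

/* Release: validate the argument, drop the id, clear every register carrying it. */
static bool v_release(struct vstate *s, int arg_reg)
{
    int id = s->reg_ref[arg_reg];

    if (id == 0)
        return false;   /* not a live reference: reject the program */

    for (int i = 0; i < s->nrefs; i++) {
        if (s->refs[i] == id) {
            s->refs[i] = s->refs[--s->nrefs];
            break;
        }
    }
    for (int r = 0; r < NUM_REGS; r++) {
        if (s->reg_ref[r] == id)
            s->reg_ref[r] = 0;
    }
    return true;
}

/* Exit and tail calls are only accepted with no references outstanding. */
static bool v_exit_ok(const struct vstate *s)
{
    return s->nrefs == 0;
}

static bool v_tail_call_ok(const struct vstate *s)
{
    return s->nrefs == 0;
}
```

Because releasing clears every register tagged with the identifier, a second release through an aliased register is rejected, and a path that reaches exit with an identifier still stored is treated as a leak.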

SLIDE 20

Extending the BPF API

SLIDE 21

Extending the BPF API

Simplest form

    struct bpf_sock *bpf_sk_lookup(struct sk_buff *);
    void bpf_sk_release(struct bpf_sock *);

SLIDE 22

Extending the BPF API

Namespaces

SLIDE 23

Extending the BPF API

Arbitrary socket lookup

Use any tuple for lookup
Ease API across clsact, XDP
Simplify packet mangle and lookup

SLIDE 24

Extending the BPF API

Extensibility

Allow influencing lookup behaviour

SO_REUSEPORT

Determine socket type support at load time

  Socket type supported? Load the program
  Not supported? Reject the program

SLIDE 25

Extending the BPF API

Optimizations

Avoid reference counting
Allow lookup using direct packet pointers

SLIDE 26

Extending the BPF API

Socket lookup API

    struct bpf_sock *bpf_sk_lookup_tcp(void *ctx,
                                       struct bpf_sock_tuple *tuple,
                                       u32 tuple_size, u32 netns, u64 flags);

    struct bpf_sock *bpf_sk_lookup_udp(void *ctx,
                                       struct bpf_sock_tuple *tuple,
                                       u32 tuple_size, u32 netns, u64 flags);

    void bpf_sk_release(struct bpf_sock *sk);

SLIDE 27

Extending the BPF API

Socket lookup structures

    struct bpf_sock_tuple {
        union {
            struct {
                __be32 saddr;
                __be32 daddr;
                __be16 sport;
                __be16 dport;
            } ipv4;
            struct {
                __be32 saddr[4];
                __be32 daddr[4];
                __be16 sport;
                __be16 dport;
            } ipv6;
        };
    };
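
A short user-space sketch of how such a tuple might be filled. Assumptions are labelled: `sock_tuple_model` mirrors the slide's layout with fixed-width types in place of `__be32`/`__be16`, `fill_ipv4_tuple` is a hypothetical helper, and the idea that the size of the chosen union member is what gets passed as `tuple_size` reflects my understanding of the mainline helpers rather than anything stated on the slide.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* User-space mirror of the slide's struct bpf_sock_tuple. */
struct sock_tuple_model {
    union {
        struct {
            uint32_t saddr;
            uint32_t daddr;
            uint16_t sport;
            uint16_t dport;
        } ipv4;
        struct {
            uint32_t saddr[4];
            uint32_t daddr[4];
            uint16_t sport;
            uint16_t dport;
        } ipv6;
    };
};

/*
 * Fill the IPv4 member from host-order values, converting each field to
 * network byte order, and return the size a caller would pass as tuple_size.
 */
static size_t fill_ipv4_tuple(struct sock_tuple_model *t,
                              uint32_t saddr, uint16_t sport,
                              uint32_t daddr, uint16_t dport)
{
    memset(t, 0, sizeof(*t));
    t->ipv4.saddr = htonl(saddr);
    t->ipv4.daddr = htonl(daddr);
    t->ipv4.sport = htons(sport);
    t->ipv4.dport = htons(dport);
    return sizeof(t->ipv4);
}
```

In a BPF program the filled tuple and the returned size would be handed to `bpf_sk_lookup_tcp()` or `bpf_sk_lookup_udp()` along with the context, namespace and flags arguments from the previous slide.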

SLIDE 28

Extending the BPF API

Socket structure

    struct bpf_sock {
        __u32 bound_dev_if;
        __u32 family;
        __u32 type;
        __u32 protocol;
        __u32 mark;
        __u32 priority;
        __u32 src_ip4;      /* NBO */
        __u32 src_ip6[4];   /* NBO */
        __u32 src_port;     /* NBO */
    };

SLIDE 29

Epilogue

SLIDE 30

Epilogue

Use case: Network devices

Socket lookup from XDP
Management traffic? Send up the stack
Other traffic? Forward, route, load-balance
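
The decision on this slide reduces to a single branch on the lookup result. The sketch below models it in user-space C with hypothetical names (`verdict_model`, `lookup_fn`, `classify`, `ssh_only`); a function pointer stands in for the socket lookup, and a real XDP program would return XDP actions instead of this enum.

```c
#include <stdbool.h>
#include <stdint.h>

enum verdict_model {
    VERDICT_LOCAL,      /* hand the packet to the local stack */
    VERDICT_FORWARD,    /* forward, route or load-balance */
};

/* Hypothetical stand-in for a socket lookup: true if a local socket owns the port. */
typedef bool (*lookup_fn)(uint16_t dport);

/* Traffic with a matching local socket goes up the stack; the rest is forwarded. */
static enum verdict_model classify(lookup_fn has_local_socket, uint16_t dport)
{
    return has_local_socket(dport) ? VERDICT_LOCAL : VERDICT_FORWARD;
}

/* Example lookup stub: pretend only an SSH daemon listens on this device. */
static bool ssh_only(uint16_t dport)
{
    return dport == 22;
}
```

With the stub above, `classify(ssh_only, 22)` selects the local stack and any other port is forwarded.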

SLIDE 31

Epilogue

Future work

More socket attribute access
Associate metadata with sockets
More uses for reference tracking

SLIDE 32

Thank you

Joe Stringer joe@cilium.io
