P4, FPGA, HLS



SLIDE 2

P4, FPGA, HLS

  • Domain-specific language - network data forwarding

■ Standardized and maintained by the P4 Language Consortium @ p4.org
■ Independent of network protocols (VLAN, VxLAN, INT, …) and target architectures (ASIC, FPGA, NPU, CPU)
■ Match + Action user-defined packet processing
■ Custom hardware accelerators as externs/instructions

  • FPGA as a P4 target

■ ASIC-like performance, deterministic, low-latency processing
■ Higher performance per watt than NPU or CPU
■ Flexibility beyond ASICs (new instructions or externs)

  • HLS for P4 to FPGA

■ Unlocks NPU-like programming with ASIC-like performance
■ >10x higher productivity of HLS vs HDL

SLIDE 3

P4 code example

Headers

    header_type ethernet_t {
        fields {
            dstAddr   : 48;
            srcAddr   : 48;
            etherType : 16;
        }
    }
    header ethernet_t ethernet;

    header_type ipv4_t {
        fields {
            version        : 4;
            ihl            : 4;
            diffserv       : 8;
            totalLen       : 16;
            identification : 16;
            flags          : 3;
            fragOffset     : 13;
            ttl            : 8;
            protocol       : 8;
            hdrChecksum    : 16;
            srcAddr        : 32;
            dstAddr        : 32;
        }
    }
    header ipv4_t ipv4;
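The widths in these declarations fix where each field sits in the packet bytes. A minimal Python sketch of that mapping follows. It assumes fields are packed most-significant-bit first in declaration order, and `extract_fields` is a hypothetical helper written for this illustration, not part of any P4 toolchain.

```python
# Field layouts copied from the header_type declarations: (name, width in bits).
ETHERNET_T = [("dstAddr", 48), ("srcAddr", 48), ("etherType", 16)]
IPV4_T = [
    ("version", 4), ("ihl", 4), ("diffserv", 8), ("totalLen", 16),
    ("identification", 16), ("flags", 3), ("fragOffset", 13),
    ("ttl", 8), ("protocol", 8), ("hdrChecksum", 16),
    ("srcAddr", 32), ("dstAddr", 32),
]


def header_bits(layout):
    """Total header width in bits."""
    return sum(width for _, width in layout)


def extract_fields(layout, data, bit_offset=0):
    """Hypothetical helper: slice bit fields out of packet bytes.

    Assumes big-endian, MSB-first packing in declaration order.
    Returns the field dictionary and the bit offset just past the header.
    """
    total_bits = len(data) * 8
    value = int.from_bytes(data, "big")
    fields = {}
    for name, width in layout:
        shift = total_bits - bit_offset - width
        if shift < 0:
            raise ValueError("packet too short for field " + name)
        fields[name] = (value >> shift) & ((1 << width) - 1)
        bit_offset += width
    return fields, bit_offset
```

Calling `extract_fields(ETHERNET_T, packet)` and then `extract_fields(IPV4_T, packet, offset)` with the returned offset mirrors two consecutive `extract()` calls in a parser.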

Parse graph

    #define ETHT_IPV4 0x0800
    #define ETHT_IPV6 0x86dd

    parser start {
        extract(ethernet);
        return select(latest.etherType) {
            ETHT_IPV4 : parse_ipv4;
            ETHT_IPV6 : parse_ipv6;
        }
    }

    parser parse_ipv4 {
        extract(ipv4);
        return ingress;
    }

    parser parse_ipv6 {
        extract(ipv6);
        return ingress;
    }
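The parse graph is a small state machine: each state extracts one header and then picks the next state. A rough Python model of it follows. Two things are assumptions of this sketch: the `ipv6` header is not declared on the slide, so the 40-byte fixed IPv6 header is used, and the `"unhandled"` and `"out_of_packet"` results are placeholder names, since the slide does not say how a target treats those cases.

```python
ETHT_IPV4 = 0x0800
ETHT_IPV6 = 0x86DD

# Bytes consumed by each extract(). ipv4 follows the 160-bit header_type;
# ipv6 is not declared on the slide, so 40 bytes is an assumption.
HEADER_BYTES = {"ethernet": 14, "ipv4": 20, "ipv6": 40}


def select_ether_type(raw_ethernet):
    """Model of select(latest.etherType): choose the next parser state."""
    ether_type = int.from_bytes(raw_ethernet[12:14], "big")
    cases = {ETHT_IPV4: "parse_ipv4", ETHT_IPV6: "parse_ipv6"}
    # No default case exists on the slide; "unhandled" is a placeholder.
    return cases.get(ether_type, "unhandled")


# state name -> (header to extract, function returning the next state)
PARSE_GRAPH = {
    "start": ("ethernet", select_ether_type),
    "parse_ipv4": ("ipv4", lambda raw: "ingress"),
    "parse_ipv6": ("ipv6", lambda raw: "ingress"),
}


def parse(packet):
    """Walk the graph; return (final state, {header name: raw bytes})."""
    state, offset, headers = "start", 0, {}
    while state in PARSE_GRAPH:
        name, next_state = PARSE_GRAPH[state]
        size = HEADER_BYTES[name]
        raw = packet[offset:offset + size]
        if len(raw) < size:
            return "out_of_packet", headers
        headers[name] = raw
        offset += size
        state = next_state(raw)
    return state, headers
```

The loop ends as soon as the state is no longer a parser state, which covers both the normal `ingress` exit and the placeholder error states.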

Tables

    table tab1 {
        reads {
            ipv4.identification : exact;
            ipv4.srcAddr        : lpm;
            ipv4.totalLen       : range;
            ipv4.dstAddr        : lpm;
        }
        actions {
            route_ipv4;
            no_op;
        }
        max_size : 128;
    }

    table tab2 {
        reads {
            ipv4.dstAddr        : lpm;
            ethernet.srcAddr    : range;
            ipv4.identification : ternary;
        }
        actions {
            drop;
            no_op;
        }
    }
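The four match kinds used in these tables (exact, lpm, ternary, range) can be modeled as simple predicates. The Python sketch below is illustrative only: the function names are hypothetical, the range bounds are assumed to be inclusive, and `lookup_lpm` covers just a single lpm key rather than the multi-field keys of `tab1` and `tab2`.

```python
def match_exact(value, key):
    """exact: the field must equal the key."""
    return value == key


def match_lpm(value, prefix, prefix_len, width=32):
    """lpm: the top prefix_len bits of the field must equal the prefix."""
    mask = ((1 << prefix_len) - 1) << (width - prefix_len)
    return (value & mask) == (prefix & mask)


def match_ternary(value, key, mask):
    """ternary: only the bits set in mask are compared."""
    return (value & mask) == (key & mask)


def match_range(value, low, high):
    """range: the field must lie in [low, high] (inclusive bounds assumed)."""
    return low <= value <= high


def lookup_lpm(entries, value, width=32):
    """Return the action of the longest matching prefix, or None on a miss.

    entries is a list of (prefix, prefix_len, action_name) tuples.
    """
    best = None
    for prefix, prefix_len, action in entries:
        if match_lpm(value, prefix, prefix_len, width):
            if best is None or prefix_len > best[0]:
                best = (prefix_len, action)
    return None if best is None else best[1]
```

For example, with entries for 10.0.0.0/8 and 10.1.0.0/16, the address 10.1.2.3 matches both and the /16 entry wins.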

SLIDE 4

P4 code example

Actions

    action route_ipv4(dst_mac, src_mac, vid) {
        modify_field(ethernet.dstAddr, dst_mac);
        modify_field(ethernet.srcAddr, src_mac);
        modify_field(vlan_tag.vid, vid);
        add_to_field(ipv4.ttl, -1);
    }

    action no_op() { }
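What `route_ipv4` does to a packet can be sketched in Python as plain field updates. The sketch uses the field names from the `ethernet_t` declaration (`dstAddr`, `srcAddr`). It assumes that values are truncated to the field width, so decrementing a TTL of 0 wraps to 255, and that `vlan_tag.vid` is the 12-bit 802.1Q VLAN ID, since `vlan_tag` is not declared on the slides.

```python
MAC_MASK = (1 << 48) - 1   # ethernet.dstAddr / srcAddr are 48 bits wide
VID_MASK = (1 << 12) - 1   # assumed 12-bit 802.1Q VLAN ID
TTL_MASK = (1 << 8) - 1    # ipv4.ttl is 8 bits wide


def route_ipv4(pkt, dst_mac, src_mac, vid):
    """Model of the route_ipv4 action on a dict-of-dicts packet."""
    pkt["ethernet"]["dstAddr"] = dst_mac & MAC_MASK   # modify_field
    pkt["ethernet"]["srcAddr"] = src_mac & MAC_MASK   # modify_field
    pkt["vlan_tag"]["vid"] = vid & VID_MASK           # modify_field
    # add_to_field(ipv4.ttl, -1); wrap-around at the field width is assumed.
    pkt["ipv4"]["ttl"] = (pkt["ipv4"]["ttl"] - 1) & TTL_MASK
    return pkt


def no_op(pkt):
    """Model of the empty no_op action: the packet is left unchanged."""
    return pkt
```

Note that neither the action on the slide nor this model updates `ipv4.hdrChecksum` after the TTL change.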

Control flow

    control ingress {
        apply(tab1);
        apply(tab2) {
            miss { apply(tab3); }
        }
        if (valid(vlan_tag[0])) {
            apply(port_vlan) {
                hit { … }
            }
        }
        ...
    }

    control egress { ... }
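The order in which the ingress control applies tables can be traced with a short Python model. Everything about the packet and table representation is an assumption of this sketch: each table is a hypothetical callable that returns `True` on a hit, and header validity is a set of names. The parts elided on the slide (`…` and `...`) are left empty here as well.

```python
def ingress(pkt, tables):
    """Model of the ingress control; returns the (table, result) trace.

    tables maps a table name to a callable(pkt) -> bool (True on a hit).
    pkt["valid"] is a set of valid header instance names.
    """
    trace = []

    def apply(name):
        hit = bool(tables[name](pkt))
        trace.append((name, "hit" if hit else "miss"))
        return hit

    apply("tab1")
    if not apply("tab2"):
        apply("tab3")            # miss { apply(tab3); }
    if "vlan_tag[0]" in pkt["valid"]:
        if apply("port_vlan"):
            pass                 # hit-branch body is elided on the slide
    return trace
```

A packet without a valid `vlan_tag[0]` never reaches `port_vlan`, and `tab3` is consulted only when `tab2` misses.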

SLIDE 5

P4 pipeline

[Figure: pipeline block diagram. Blocks: Input, Parser, M+A Table, Queues/Buffers, M+A Table, Deparser, Output. Stage labels: parser, match_action, deparser.]
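The three labeled stages (parser, match_action, deparser) compose like ordinary functions. The Python sketch below is a deliberately small model under stated assumptions: it handles only the Ethernet header, uses a hypothetical exact-match table that rewrites the destination MAC, and leaves out the queues and buffers shown in the figure.

```python
def parser(packet):
    """Split the packet into an editable Ethernet header and the payload."""
    return {"ethernet": bytearray(packet[:14])}, packet[14:]


def match_action(headers, mac_table):
    """Exact match on the destination MAC; on a hit, rewrite it."""
    dst = bytes(headers["ethernet"][0:6])
    new_dst = mac_table.get(dst)          # None models a table miss
    if new_dst is not None:
        headers["ethernet"][0:6] = new_dst
    return headers


def deparser(headers, payload):
    """Serialize the (possibly modified) headers back in front of the payload."""
    return bytes(headers["ethernet"]) + payload


def pipeline(packet, mac_table):
    """parser -> match_action -> deparser, as labeled in the figure."""
    headers, payload = parser(packet)
    headers = match_action(headers, mac_table)
    return deparser(headers, payload)
```

On a table miss the deparser reproduces the input packet byte for byte, which is a useful sanity check for any parser and deparser pair.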

SLIDE 6

Parser in VHDL

  • Two configurations:

■ full - Ethernet, 2× VLAN, 2× MPLS, IPv4/IPv6 (with 2× extension headers), TCP/UDP, ICMP/ICMPv6
■ simple L2 - Ethernet, IPv4/IPv6 (with 2× extension headers), TCP/UDP, ICMP/ICMPv6

  • Search of best implementation

■ All possible placements of pipeline among protocol analyzers
■ Two data bus widths: 256 and 512 bits
■ O0 - no optimization
■ O1 - offset width
■ O2 - + offset alignment
■ O3 - reworked P4 code
■ O4 - all combined
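The size of this search space can be estimated with a short enumeration. The Python sketch below rests on an interpretation that the slide does not spell out: "placement of pipeline" is read as an independent choice, at each boundary between two consecutive protocol analyzers, of whether to insert a pipeline register. The analyzer list in the usage note is hypothetical.

```python
from itertools import product

BUS_WIDTHS = (256, 512)
OPT_LEVELS = ("O0", "O1", "O2", "O3", "O4")


def design_space(analyzers, bus_widths=BUS_WIDTHS, opt_levels=OPT_LEVELS):
    """Yield every candidate parser configuration.

    Assumption: each boundary between consecutive analyzers either has
    a pipeline register (True) or not (False).
    """
    boundaries = len(analyzers) - 1
    for registers in product((False, True), repeat=boundaries):
        for width, opt in product(bus_widths, opt_levels):
            yield {"registers": registers, "bus_width": width, "opt": opt}
```

Under this reading, a hypothetical chain of five analyzers such as `["ethernet", "vlan", "mpls", "ip", "l4"]` has 2^4 register placements, and with two bus widths and five optimization levels that gives 160 candidates to synthesize and compare.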

SLIDE 7

Pareto sets
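A Pareto set keeps only the design points that no other point beats on every objective at once. The Python sketch below shows the filtering step for two objectives that are both minimized. The objectives (latency and resource usage) and the sample numbers in the usage note are assumptions made for illustration, not results from the slides.

```python
def dominates(q, p):
    """True if q is no worse than p on every objective and differs from p."""
    return q != p and all(qi <= pi for qi, pi in zip(q, p))


def pareto_front(points):
    """Return the non-dominated points, sorted by the first objective.

    Each point is a tuple of objectives to minimize, for example
    (latency, resources).
    """
    front = [p for p in points
             if not any(dominates(q, p) for q in points)]
    return sorted(front)
```

With made-up points `(10, 500)`, `(12, 400)`, `(11, 600)`, `(15, 400)` and `(9, 900)`, the points `(11, 600)` and `(15, 400)` drop out because `(10, 500)` and `(12, 400)` respectively are at least as good on both objectives.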

SLIDE 8

Use Case: 100G In-band Network Telemetry

  • Collection and reporting of network state by inserting, analyzing, and stripping INT headers
  • P4 to FPGA HLS used to generate firmware for the Netcope 100G programmable NIC
  • No need to use HDL (hardware description language)
  • Ultra-fast time-to-market and wire-speed traffic processing at 100 Gbps
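To give a sense of what wire speed at 100 Gbps demands, the arithmetic can be worked through in Python. The packet rate uses the standard Ethernet per-frame overhead of 20 bytes on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap). The clock figure is a rough estimate that assumes one full bus word is consumed every cycle; it is not a number from the slides.

```python
LINE_RATE_BPS = 100_000_000_000   # 100 Gbps
WIRE_OVERHEAD_BYTES = 20          # preamble + SFD (8) + inter-frame gap (12)


def packets_per_second(frame_bytes, line_rate_bps=LINE_RATE_BPS):
    """Maximum frame rate for back-to-back frames of the given size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_on_wire


def min_clock_mhz(bus_width_bits, line_rate_bps=LINE_RATE_BPS):
    """Rough clock needed to move line-rate data over a bus of this width.

    Idealized: assumes every cycle carries a completely filled bus word.
    """
    return line_rate_bps / bus_width_bits / 1e6
```

For minimum-size 64-byte frames this gives about 148.8 million packets per second, and the idealized clock is about 195 MHz for a 512-bit bus or about 391 MHz for a 256-bit bus.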

SLIDE 9

Thank you for your attention!

Pavel Benáček, Lukáš Richter, Viktor Puš, Michal Kekely

Try it out at np4.netcope.com!