P4, FPGA, HLS Domain-specific language - network data forwarding - - PowerPoint PPT Presentation
P4, FPGA, HLS
- Domain-specific language - network data forwarding
  ■ Standardized and maintained by the P4 Language Consortium @ p4.org
  ■ Independent of network protocols (VLAN, VxLAN, INT, …) and target architectures (ASIC, FPGA, NPU, CPU)
  ■ Match + Action user-defined packet processing
  ■ Custom hardware accelerators as externs/instructions
- FPGA as a P4 target
  ■ ASIC-like performance, deterministic, low-latency processing
  ■ Higher performance per watt than NPU or CPU
  ■ Flexibility beyond ASICs (new instructions or externs)
- HLS for P4 to FPGA
  ■ Unlocks NPU-like programming with ASIC-like performance
  ■ >10x higher productivity of HLS vs HDL
P4 code example
Headers
header_type ethernet_t {
    fields {
        dstAddr   : 48;
        srcAddr   : 48;
        etherType : 16;
    }
}
header ethernet_t ethernet;

header_type ipv4_t {
    fields {
        version        : 4;
        ihl            : 4;
        diffserv       : 8;
        totalLen       : 16;
        identification : 16;
        flags          : 3;
        fragOffset     : 13;
        ttl            : 8;
        protocol       : 8;
        hdrChecksum    : 16;
        srcAddr        : 32;
        dstAddr        : 32;
    }
}
header ipv4_t ipv4;
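To make the bit-width notation concrete, here is a minimal Python sketch that slices the same fields out of raw packet bytes. The field names and widths are copied from the slide; the `extract` and `header_bits` helpers are hypothetical names introduced only for this illustration, and big-endian (network) bit order is assumed.

```python
# Field layouts copied from the slide: (name, width in bits), in wire order.
ETHERNET_T = [("dstAddr", 48), ("srcAddr", 48), ("etherType", 16)]
IPV4_T = [
    ("version", 4), ("ihl", 4), ("diffserv", 8), ("totalLen", 16),
    ("identification", 16), ("flags", 3), ("fragOffset", 13),
    ("ttl", 8), ("protocol", 8), ("hdrChecksum", 16),
    ("srcAddr", 32), ("dstAddr", 32),
]


def header_bits(layout):
    """Total header width in bits."""
    return sum(width for _, width in layout)


def extract(layout, data, offset=0):
    """Hypothetical helper: read the fields of `layout` from `data`,
    starting at byte `offset`. Returns (fields, next byte offset)."""
    total = header_bits(layout)
    if total % 8 != 0:
        raise ValueError("header is not byte aligned")
    nbytes = total // 8
    chunk = data[offset:offset + nbytes]
    if len(chunk) < nbytes:
        raise ValueError("packet too short")
    value = int.from_bytes(chunk, "big")  # assumes network bit order
    fields = {}
    remaining = total
    for name, width in layout:
        remaining -= width  # bits still to the right of this field
        fields[name] = (value >> remaining) & ((1 << width) - 1)
    return fields, offset + nbytes
```

Calling `extract(ETHERNET_T, frame)` and then `extract(IPV4_T, frame, offset)` with the returned offset mirrors two consecutive `extract(...)` statements in a P4 parser.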
Parse graph
#define ETHT_IPV4 0x0800
#define ETHT_IPV6 0x86dd

parser start {
    extract(ethernet);
    return select(latest.etherType) {
        ETHT_IPV4 : parse_ipv4;
        ETHT_IPV6 : parse_ipv6;
    }
}

parser parse_ipv4 {
    extract(ipv4);
    return ingress;
}

parser parse_ipv6 {
    extract(ipv6);
    return ingress;
}
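The parse graph is a small state machine: each state extracts one header and picks the next state. A rough Python model of it follows. The state names and EtherType constants come from the slide; the header lengths (IPv4 without options, a 40-byte fixed IPv6 header) and the choice to raise an error for an unlisted EtherType are assumptions of this sketch, since the slide shows neither an `ipv6` header definition nor a default case.

```python
ETHT_IPV4 = 0x0800
ETHT_IPV6 = 0x86DD

# Header lengths in bytes. The ipv6 value is an assumption (fixed header only).
HEADER_LEN = {"ethernet": 14, "ipv4": 20, "ipv6": 40}

# Which header each parser state extracts.
STATE_HEADER = {"start": "ethernet", "parse_ipv4": "ipv4", "parse_ipv6": "ipv6"}


def next_state(state, packet):
    """Transition function of the parse graph."""
    if state == "start":
        ether_type = int.from_bytes(packet[12:14], "big")
        if ether_type == ETHT_IPV4:
            return "parse_ipv4"
        if ether_type == ETHT_IPV6:
            return "parse_ipv6"
        # No default case on the slide; treating this as an error is an assumption.
        raise ValueError("unhandled etherType 0x%04x" % ether_type)
    return "ingress"


def parse(packet):
    """Walk the graph and return [(header name, byte offset), ...]."""
    extracted = []
    offset = 0
    state = "start"
    while state != "ingress":
        header = STATE_HEADER[state]
        if len(packet) < offset + HEADER_LEN[header]:
            raise ValueError("packet too short for " + header)
        extracted.append((header, offset))
        state = next_state(state, packet)
        offset += HEADER_LEN[header]
    return extracted
```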
Tables
table tab1 {
    reads {
        ipv4.identification : exact;
        ipv4.srcAddr        : lpm;
        ipv4.totalLen       : range;
        ipv4.dstAddr        : lpm;
    }
    actions {
        route_ipv4;
        no_op;
    }
    max_size : 128;
}

table tab2 {
    reads {
        ipv4.dstAddr        : lpm;
        ethernet.srcAddr    : range;
        ipv4.identification : ternary;
    }
    actions {
        drop;
        no_op;
    }
}
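The four match kinds used in these tables (`exact`, `lpm`, `range`, `ternary`) can be read as simple predicates on a field value. The Python sketch below is illustrative only: the function names and the entry format are hypothetical, it covers one field at a time, and it does not model how a target resolves priorities when several match kinds are combined in one table.

```python
def match_exact(value, key):
    """exact: the field must equal the key."""
    return value == key


def match_ternary(value, key, mask):
    """ternary: compare only the bits that are set in the mask."""
    return (value & mask) == (key & mask)


def match_range(value, low, high):
    """range: inclusive lower and upper bound."""
    return low <= value <= high


def match_lpm(value, prefix, prefix_len, width=32):
    """lpm: the top `prefix_len` bits of the field must equal the prefix."""
    mask = ((1 << prefix_len) - 1) << (width - prefix_len)
    return (value & mask) == (prefix & mask)


def lpm_lookup(entries, value, width=32):
    """entries: list of (prefix, prefix_len, action name).
    Returns the action of the longest matching prefix, or None on a miss."""
    best = None
    for prefix, prefix_len, action in entries:
        if match_lpm(value, prefix, prefix_len, width):
            if best is None or prefix_len > best[0]:
                best = (prefix_len, action)
    return best[1] if best is not None else None
```

For example, with entries for 10.0.0.0/8 and 10.1.0.0/16, a lookup of 10.1.2.3 selects the /16 entry because it is the longer of the two matching prefixes.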
P4 code example
Actions
action route_ipv4(dst_mac, src_mac, vid) {
    modify_field(ethernet.dstAddr, dst_mac);
    modify_field(ethernet.srcAddr, src_mac);
    modify_field(vlan_tag.vid, vid);
    add_to_field(ipv4.ttl, -1);
}

action no_op() {
}
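As a rough Python model of what `route_ipv4` does to a packet, the sketch below treats the parsed headers as nested dictionaries. The dictionary layout is hypothetical, the field names `dstAddr` and `srcAddr` follow the header definitions from the Headers slide, and wrapping the TTL to 8 bits is an assumption about how a target treats the unsigned field.

```python
def route_ipv4(headers, dst_mac, src_mac, vid):
    """Model of the route_ipv4 action: rewrite MACs, set VLAN ID, decrement TTL."""
    headers["ethernet"]["dstAddr"] = dst_mac              # modify_field
    headers["ethernet"]["srcAddr"] = src_mac              # modify_field
    headers["vlan_tag"]["vid"] = vid                      # modify_field
    headers["ipv4"]["ttl"] = (headers["ipv4"]["ttl"] - 1) & 0xFF  # add_to_field(..., -1)


def no_op(headers):
    """Model of the empty action: leaves the packet unchanged."""
    return None
```

The action parameters (`dst_mac`, `src_mac`, `vid`) are the per-entry data that the control plane stores alongside a table entry.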
Control flow
control ingress {
    apply(tab1);
    apply(tab2) {
        miss {
            apply(tab3);
        }
    }
    if (valid(vlan_tag[0])) {
        apply(port_vlan) {
            hit {
                …
            }
        }
    }
    ...
}

control egress {
    ...
}
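The hit/miss structure of the ingress control can be mimicked in a few lines of Python. This is only a sketch of the visible part of the control block: tables are modelled as hypothetical callables that return an action or `None` on a miss, header validity is modelled as key presence, and the elided parts of the slide (the `hit` body and the trailing `...`) are left out rather than guessed.

```python
def apply(table, headers):
    """Run one table: execute the selected action and report hit (True) or miss (False)."""
    action = table(headers)
    if action is None:
        return False
    action(headers)
    return True


def ingress(headers, tables):
    """Model of the visible ingress flow; returns the list of tables applied."""
    trace = ["tab1"]
    apply(tables["tab1"], headers)

    trace.append("tab2")
    if not apply(tables["tab2"], headers):    # miss { apply(tab3); }
        trace.append("tab3")
        apply(tables["tab3"], headers)

    if "vlan_tag" in headers:                 # if (valid(vlan_tag[0]))
        trace.append("port_vlan")
        apply(tables["port_vlan"], headers)   # hit body is elided on the slide
    return trace
```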
P4 pipeline
[Figure: P4 pipeline block diagram with the blocks INPUT, PARSER, M+A Table, Queues/Buffers, M+A Table, DEPARSER, OUTPUT. The stages are labeled parser, match_action, and deparser.]
Parser in VHDL
- Two configurations:
  ■ full - Ethernet, 2×VLAN, 2×MPLS, IPv4/IPv6 (with 2×extension headers), TCP/UDP, ICMP/ICMPv6
  ■ simple L2 - Ethernet, IPv4/IPv6 (with 2×extension headers), TCP/UDP, ICMP/ICMPv6
- Search of best implementation
  ■ All possible placements of pipeline among protocol analyzers
  ■ Two data bus widths: 256 and 512 bits
  ■ O0 - no optimization
  ■ O1 - offset width
  ■ O2 - +offset alignment
  ■ O3 - reworked P4 code
  ■ O4 - all combined
Pareto sets
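A Pareto set keeps only the design points that no other point beats on every objective at once. The slide's plots are not recoverable here, so the Python sketch below uses made-up numbers and assumes two objectives that are both minimized (for instance resource usage and latency); the function name and the data are hypothetical.

```python
def dominates(a, b):
    """True if point `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_set(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (resources, latency) pairs, purely for illustration.
candidates = [(100, 50), (120, 40), (130, 45), (90, 70), (100, 60)]
front = pareto_set(candidates)
```

Each remaining point represents a different trade-off; choosing among them depends on which objective matters more for the target design.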
Use Case: 100G In-band Network Telemetry
- Collection and reporting of network state by inserting, analyzing, and stripping INT headers
- P4 to FPGA HLS used to generate firmware for the Netcope 100G programmable NIC
- No need to use HDL (hardware description language)
- Ultra-fast time-to-market and wire-speed traffic processing at 100 Gbps