HIERARCHICAL QOS HARDWARE OFFLOAD: Yossi Kuperman, Maxim Mikityanskiy



SLIDE 1

Yossi Kuperman, Maxim Mikityanskiy, 2020

HIERARCHICAL QOS HARDWARE OFFLOAD

SLIDE 2

AGENDA

Hierarchical Token Bucket

  • Brief description of HTB and its issues

HTB offload solution

  • Modifications to HTB to solve the issues and offload the logic

Current status

  • Known challenges and status of development and submission

SLIDE 3

HTB

Hierarchical Token Bucket

  • Shaping occurs in leaf nodes
  • Child nodes borrow tokens from parents

[Figure: HTB class tree with root, inner, and leaf nodes; labels: rate, ceil]

Classification:

# tc filter add dev eth0 parent 1:0 protocol ip flower ip_proto tcp dst_port 80 classid 1:10
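The borrowing rule above can be illustrated with a minimal Python sketch. It is a deliberately simplified model, not the kernel implementation: time is reduced to a single "tick", and the names `HtbClass`, `try_send`, `tokens`, and `ctokens` are assumptions made for illustration. A leaf spends its own `rate` budget first, then borrows from an ancestor, and never exceeds its `ceil`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HtbClass:
    name: str
    rate: int                      # guaranteed bytes per tick
    ceil: int                      # upper bound per tick, borrowed bytes included
    parent: Optional["HtbClass"] = None
    tokens: int = 0                # what is left of `rate` in this tick
    ctokens: int = 0               # what is left of `ceil` in this tick

    def refill(self) -> None:
        """Start a new tick with full rate and ceil budgets."""
        self.tokens, self.ctokens = self.rate, self.ceil


def try_send(leaf: HtbClass, size: int) -> bool:
    """Send `size` bytes from `leaf`, borrowing from an ancestor if needed."""
    # Walk up the tree until some class can pay from its own rate budget.
    lender = leaf
    while lender is not None:
        if lender.ctokens < size:
            return False           # ceil reached on the way up: cannot send
        if lender.tokens >= size:
            break                  # this class can pay
        lender = lender.parent     # out of own tokens: ask one level up
    if lender is None:
        return False               # nobody on the path has spare tokens

    # Charge ceil budgets from the leaf to the root,
    # and rate budgets from the lender to the root.
    node, charge_rate = leaf, False
    while node is not None:
        node.ctokens -= size
        charge_rate = charge_rate or node is lender
        if charge_rate:
            node.tokens -= size
        node = node.parent
    return True


root = HtbClass("root", rate=100, ceil=100)
leaf = HtbClass("1:10", rate=30, ceil=50, parent=root)
for cls in (root, leaf):
    cls.refill()

# Offer ten 10-byte packets within one tick and count the accepted ones.
sent = sum(try_send(leaf, 10) for _ in range(10))
print(sent, leaf.tokens, root.tokens)
```

In this run the leaf gets five packets through: three paid from its own rate of 30, two borrowed from the root, after which its ceil of 50 is exhausted.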

SLIDE 4

HTB DRAWBACKS

Single HTB instance, single lock, not aware of multi-queue netdevs

  • 1. Contention by flow classification
  • 2. Contention by handling packets

[Figure: dev_queue_xmit() feeds a single HTB qdisc, followed by three TXQs, each with its own TX lock, and the driver]

SLIDE 5

SOLUTION FOR CLASSIFICATION

  • Classification takes place at the clsact hook
  • HTB skips classification if skb->priority “points” to a class

For example, replace:

# tc filter add dev eth0 parent 1:0 protocol ip flower ip_proto tcp dst_port 80 classid 1:10

with an equivalent filter using the skbedit action:

# tc filter add dev eth0 egress protocol ip flower ip_proto tcp dst_port 80 action skbedit priority 1:10

  • Thread-safe and lock-free classification
  • Flow classification still takes place in software
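To make "priority points to a class" concrete, here is a small Python sketch of how a class ID such as 1:10 maps to the 32-bit value stored in skb->priority. tc reads both halves as hexadecimal, with the major number in the upper 16 bits and the minor number in the lower 16 bits. The function names and the `leaf_classes` table are assumptions for illustration; the lookup only mimics the idea that a matching priority lets HTB skip its own filters.

```python
from typing import Dict, Optional


def parse_classid(text: str) -> int:
    """Convert a tc class ID such as "1:10" into its 32-bit handle."""
    major, _, minor = text.partition(":")
    # Both halves are hexadecimal; an empty minor ("1:") means 0.
    return (int(major, 16) << 16) | int(minor or "0", 16)


def classify(skb_priority: int, leaf_classes: Dict[int, str]) -> Optional[str]:
    """Return the leaf class that skb_priority points to, or None."""
    return leaf_classes.get(skb_priority)


# Hypothetical class table: handle -> human-readable label.
leaf_classes = {parse_classid("1:10"): "web", parse_classid("1:20"): "bulk"}

priority = parse_classid("1:10")   # what "skbedit priority 1:10" would store
print(hex(priority), classify(priority, leaf_classes))
```

Because the lookup is a plain read of per-packet state, no shared qdisc lock is needed at this step, which is the point of moving classification to clsact.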

SLIDE 6

REMOVING THE LOCK CONTENTION

HTB will present itself as mq/mqprio does

  • Create simple qdisc (FIFO) per TX queue
  • Only when offload mode is set

HTB serves as the root qdisc

  • Aggregate statistics and report to user
  • Delegate the requests to the driver

HTB code is no longer part of the data-path

[Figure: dev_queue_xmit() feeds one FIFO qdisc per TXQ, each TXQ with its own TX lock, followed by the driver]
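A rough Python analogy of this layout: each TX queue owns a FIFO with its own lock, and the root object only sums the per-queue counters when asked for statistics. `FifoQdisc` and `RootQdisc` are hypothetical names, and Python threads stand in for CPUs; this is a sketch of the structure, not of kernel code.

```python
import threading
from collections import deque


class FifoQdisc:
    """Toy per-TX-queue FIFO with its own lock and counters."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.queue = deque()
        self.packets = 0
        self.bytes = 0

    def enqueue(self, size: int) -> None:
        with self.lock:            # only senders on this TX queue contend here
            self.queue.append(size)
            self.packets += 1
            self.bytes += size


class RootQdisc:
    """Toy root qdisc: one FIFO per TX queue, no lock of its own on the data path."""

    def __init__(self, num_tx_queues: int) -> None:
        self.children = [FifoQdisc() for _ in range(num_tx_queues)]

    def stats(self) -> dict:
        # Aggregate the per-queue counters for reporting to the user.
        return {
            "packets": sum(q.packets for q in self.children),
            "bytes": sum(q.bytes for q in self.children),
        }


root = RootQdisc(num_tx_queues=3)


def sender(txq: int) -> None:
    for _ in range(1000):
        root.children[txq].enqueue(100)


threads = [threading.Thread(target=sender, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

totals = root.stats()
print(totals)
```

Each sender touches only the lock of its own queue, so the three senders never wait on each other.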

SLIDE 7

HARDWARE OFFLOAD

HTB uses ndo_setup_tc to provide the QoS tree structure to the driver, which recreates it in the NIC

Streams no longer have to fight for a single lock

  • 1. HTB registers as a multi-queue qdisc (like mq) and creates qdiscs per queue
  • 2. Each leaf class is backed by a hardware queue
  • 3. Clsact happens before ndo_select_queue, so the driver can pick a queue corresponding to the class
  • 4. Rate limiting is performed by the hardware
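As a sketch of what "providing the tree to the driver" could look like, the Python below flattens a class tree into a list of per-node requests, parents before children, and assigns one queue index to every leaf. The request layout, the operation names `add_inner` and `add_leaf`, and the sequential queue numbering are all assumptions for illustration; the slides do not describe the actual ndo_setup_tc payload.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    classid: str
    rate_mbit: int
    ceil_mbit: int
    children: List["Node"] = field(default_factory=list)


def describe_for_driver(root: Node) -> List[dict]:
    """Flatten the tree into requests, visiting parents before their children."""
    requests: List[dict] = []
    next_txq = 0
    stack = [(root, None)]          # (node, parent classid)
    while stack:
        node, parent = stack.pop()
        request = {
            "op": "add_inner" if node.children else "add_leaf",
            "classid": node.classid,
            "parent": parent,
            "rate_mbit": node.rate_mbit,
            "ceil_mbit": node.ceil_mbit,
            "txq": None,
        }
        if node.children:
            # Push in reverse so children are visited in their listed order.
            for child in reversed(node.children):
                stack.append((child, node.classid))
        else:
            request["txq"] = next_txq    # each leaf is backed by its own queue
            next_txq += 1
        requests.append(request)
    return requests


tree = Node("1:1", 1000, 1000, [
    Node("1:2", 600, 1000, [Node("1:10", 400, 1000), Node("1:20", 200, 1000)]),
    Node("1:30", 400, 1000),
])
requests = describe_for_driver(tree)
for r in requests:
    print(r)
```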

SLIDE 8

HARDWARE OFFLOAD

[Figure: the HTB tree (root, inner, four leaves) next to the same tree in the NIC; four SQs, one per leaf]

SLIDE 9

PACKET FLOW

  • 1. clsact sets skb->priority to a leaf class ID
  • 2. ndo_select_queue looks at skb->priority and picks the TX queue
  • 3. The SKB is enqueued into the per-queue qdisc of that TX queue
  • 4. The SKB is dequeued from the per-queue qdisc
  • 5. The driver puts the SKB into the hardware Send Queue
  • 6. The NIC does the shaping and transmits the packet
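The steps above can be traced with a toy Python pipeline. Everything here is a stand-in: `Skb`, `clsact_egress`, `select_queue`, and the `class_to_txq` table are hypothetical names, falling back to queue 0 for unclassified traffic is an assumption, and step 6 (shaping in the NIC) is not modeled.

```python
from collections import deque
from dataclasses import dataclass

WEB_CLASS = 0x00010010        # class 1:10 as a 32-bit handle (major 0x1, minor 0x10)


@dataclass
class Skb:
    dst_port: int
    priority: int = 0


def clsact_egress(skb: Skb) -> None:
    """Step 1: the filter sets skb.priority to a leaf class ID."""
    if skb.dst_port == 80:
        skb.priority = WEB_CLASS


def select_queue(skb: Skb, class_to_txq: dict, default_txq: int = 0) -> int:
    """Step 2: map skb.priority to a TX queue (assumed fallback: queue 0)."""
    return class_to_txq.get(skb.priority, default_txq)


def transmit(skb: Skb, class_to_txq: dict, qdiscs: list, send_queues: list) -> int:
    clsact_egress(skb)
    txq = select_queue(skb, class_to_txq)
    qdiscs[txq].append(skb)              # step 3: enqueue into the per-queue qdisc
    out = qdiscs[txq].popleft()          # step 4: dequeue from the per-queue qdisc
    send_queues[txq].append(out)         # step 5: driver posts to the hardware SQ
    return txq


class_to_txq = {WEB_CLASS: 1}
qdiscs = [deque(), deque()]
send_queues = [[], []]

web_txq = transmit(Skb(dst_port=80), class_to_txq, qdiscs, send_queues)
ssh_txq = transmit(Skb(dst_port=22), class_to_txq, qdiscs, send_queues)
print(web_txq, ssh_txq)
```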

SLIDE 10

HARDWARE OFFLOAD ADVANTAGES

  • No contention on a single lock: different traffic classes don’t interfere with each other, which allows for better throughput
  • Rate limiting logic is offloaded to the NIC, reducing CPU load

SLIDE 11

KNOWN CHALLENGES

  • When offloaded, qdiscs of leaf classes are applied before the HTB logic
  • QoS TX queues have to be preallocated on alloc_etherdev_mqs
  • Hardware queues are created and destroyed on demand
  • real_num_tx_queues is changed by the driver when leaf classes change
  • Deleting a leaf class may lead to gaps in TX queue numbering
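The numbering gap is easy to see in a small Python model. The sketch shows one possible way to keep the queue indices dense: move the leaf that owns the highest-numbered queue into the freed slot. The slides do not say which approach the driver takes, so `delete_leaf` and the `class_to_txq` table are purely illustrative.

```python
def delete_leaf(class_to_txq: dict, classid: str) -> dict:
    """Remove a leaf and refill its queue slot so indices stay 0..n-1."""
    remaining = dict(class_to_txq)
    freed = remaining.pop(classid)
    if remaining:
        # Leaf currently using the highest-numbered queue.
        last_class = max(remaining, key=remaining.get)
        if remaining[last_class] > freed:
            remaining[last_class] = freed    # move it into the gap
    return remaining


class_to_txq = {"1:10": 0, "1:20": 1, "1:30": 2}

# Naive deletion leaves queues {1, 2} in use: index 0 is a gap.
naive = {c: q for c, q in class_to_txq.items() if c != "1:10"}

# Compacting deletion leaves queues {0, 1} in use.
compact = delete_leaf(class_to_txq, "1:10")
real_num_tx_queues = len(compact)
print(naive, compact, real_num_tx_queues)
```

With the compacting variant the number of active queues can shrink by one after each deletion, at the cost of remapping one surviving class to a different queue.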

SLIDE 12

CURRENT STATUS

  • PoC patches for mlx5 (using sysfs for configuration)
  • RFC was posted to the netdev mailing list, showing the HTB offload interface

SLIDE 13