A Configurable High-Throughput Linear Sorter System

Jorge Ortiz
Information and Telecommunication Technology Center
2335 Irving Hill Road, Lawrence, KS
jorgeo@ku.edu

David Andrews
Computer Science and Computer Engineering
The University of Arkansas
504 J.B. Hunt Building, Fayetteville, AR
dandrews@uark.edu

Introduction

• Sorting is an important system function
• Popular sorting algorithms are not efficient or fast in hardware implementations
• Linear sorters are ideal for hardware, but sort at a rate of 1 value per cycle
• Sorting networks are better at throughput, but with high area and latency cost
• Need a better solution for high-throughput, low-latency sorting

Contributions
• Expanding the linear sorter implementation and making it versatile, reconfigurable and better suited for streaming input and output
• Parallelizing the linear sorter for increased throughput
• Implementing the high-throughput linear sorter, and outmatching the performance of current linear sorter approaches

Background

• Software quicksort, mergesort and heapsort use divide-and-conquer techniques to achieve efficiency
• Hardware sorting is plagued with overhead from data movements, synchronization, bookkeeping and memory accesses
• Need better use of concurrent data comparisons and swaps, rather than the extended execution of multiple assembly instructions used by the software counterpart

Sorting Networks
• Swap comparators sort pairs of values
• Sink lowest value, then operate on remaining n-1 items
• Receive parallel data at inputs
• High #PE and latency, resort with each new insertion
[Figure: Bubble Sort network with parallel inputs 3, 2, 5, 4, 1]
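The bubble-sort network described on this slide can be sketched in software as a fixed sequence of swap comparators. This is only an illustrative model: the function names `compare_swap` and `bubble_network` are hypothetical, and sinking the lowest value toward index 0 is an assumption about the orientation of the figure. In hardware the comparators in a stage may run concurrently; the sketch simply applies them in order and counts them.

```python
def compare_swap(values, i, j):
    """Swap comparator: leave the smaller value at index i, the larger at j."""
    if values[i] > values[j]:
        values[i], values[j] = values[j], values[i]


def bubble_network(inputs):
    """Apply a bubble-sort comparator network to a list of parallel inputs.

    Returns the sorted values and the number of comparators used.
    """
    values = list(inputs)
    n = len(values)
    comparators = 0
    for done in range(n - 1):
        # Sink the lowest remaining value to position `done`,
        # then repeat on the remaining n - 1 - done items.
        for i in range(n - 1, done, -1):
            compare_swap(values, i - 1, i)
            comparators += 1
    return values, comparators


result, count = bubble_network([3, 2, 5, 4, 1])
print(result, count)
```

For n inputs this network uses n(n-1)/2 comparators, which is the area cost the slide refers to, and the whole network has to be re-evaluated whenever a new value is added.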

Linear Sorters
• Sorted insertions
• Forwards incoming value to all nodes
• Each node shifts autonomously depending on neighbors' values
• Single clock latency, small logic & regular structure
• Streaming input & output
• Serial input, need higher throughput

Insertion example (values arrive serially in the order 3, 2, 5, 4, 1):
Input 3: sorter holds 3
Input 2: sorter holds 2 3
Input 5: sorter holds 2 3 5
Input 4: sorter holds 2 3 4 5
Input 1: sorter holds 1 2 3 4 5
Output: 1 2 3 4 5
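The insertion behavior on this slide can be modeled with a small sketch in which every node sees the incoming value and decides its next state from its own value and its left neighbor's value. The function name `insert`, the use of `None` for an empty node, and the ascending order are assumptions made for illustration; the slides do not give the node logic in this form.

```python
def insert(nodes, incoming):
    """One clock cycle of a linear sorter: compute every node's next value.

    `nodes` is a list in ascending order; None marks an empty node.
    All decisions read the old state, mimicking nodes updating in parallel.
    """
    def greater(value):
        # An empty node behaves like +infinity in the comparison.
        return value is None or value > incoming

    next_state = []
    for i, own in enumerate(nodes):
        if i > 0 and greater(nodes[i - 1]):
            next_state.append(nodes[i - 1])  # shift right: take left neighbor's value
        elif greater(own):
            next_state.append(incoming)      # this node is the insertion point
        else:
            next_state.append(own)           # smaller values stay in place
    return next_state


state = [None] * 5
for value in [3, 2, 5, 4, 1]:
    state = insert(state, value)
    print(value, state)
```

Each call corresponds to one clock cycle, which is why the structure accepts only one value per cycle. In this sketch a value shifted out of the last node of a full sorter is simply dropped.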

Configurable Linear Sorter

• Increase versatility for linear sorters
• Configurable:
  ◦ Linear sorter depth
  ◦ Sorting direction
  ◦ Sort on tags (for example, timestamps) rather than data
  ◦ User-defined data and tag size

Increase functionality for linear sorters:
1. Detect full conditions
2. Buffer input while full
3. Retrieve output serially for streaming
4. Delete top value, freeing nodes
5. Augment with left shift functionality
6. Test tags before deleting them
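A behavioral sketch of these configuration options and added functions might look as follows. Everything here is hypothetical: the class name `ConfigurableSorter`, its method names, and the use of a software list and queue stand in for the hardware nodes and input buffer, and data and tag bit widths are not modeled.

```python
from collections import deque


class ConfigurableSorter:
    """Software model of a linear sorter that sorts (tag, data) pairs by tag."""

    def __init__(self, depth, descending=False):
        self.depth = depth            # configurable number of nodes
        self.descending = descending  # configurable sorting direction
        self.nodes = []               # (tag, data) pairs, kept sorted by tag
        self.pending = deque()        # inputs buffered while the sorter is full

    def is_full(self):
        return len(self.nodes) >= self.depth

    def push(self, tag, data):
        """Accept an input; it waits in the buffer if the sorter is full."""
        self.pending.append((tag, data))
        self._drain()

    def _drain(self):
        # Move buffered inputs into the sorter while nodes are free.
        while self.pending and not self.is_full():
            self.nodes.append(self.pending.popleft())
            self.nodes.sort(key=lambda item: item[0], reverse=self.descending)

    def pop_if(self, tag_test):
        """Delete and return the top entry only if its tag passes `tag_test`."""
        if self.nodes and tag_test(self.nodes[0][0]):
            top = self.nodes.pop(0)   # remaining entries shift left
            self._drain()             # a freed node can take a buffered input
            return top
        return None


sorter = ConfigurableSorter(depth=3)
for tag, data in [(30, "c"), (10, "a"), (20, "b"), (5, "z")]:
    sorter.push(tag, data)
print(sorter.is_full(), list(sorter.pending))
print(sorter.pop_if(lambda tag: tag <= 15))
```

With timestamps as tags, a test such as `tag <= now` lets the consumer retrieve entries serially only once they are due, which is one way to read "test tags before deleting them".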

Extended Linear Sorter System

Cycle   Input   Node 1 to Node 6 (contents)
0       5
1       7       5
2       6       5 7
3       2       5 6 7
4       1       2 5 6 7
5       9       1 2 5 6 7
6       3       1 2 5 6 7 9
7       8       2 3 5 6 7 9
8       4       3 5 6 7 8 9
9               4 5 6 7 8 9
10              4 5 6 7 8 9
11              5 6 7 8 9
12              6 7 8 9
13              7 8 9
14              8 9
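The trace above can be replayed with a short simulation, reading each row as a cycle number, an incoming value, and the node contents at the start of that cycle. The per-cycle operations in `OPS` are inferred from the table and are an assumption, as are the names `step` and `replay` and the choice to delete the top value before inserting when both happen in the same cycle.

```python
import bisect

DEPTH = 6  # six nodes, as in the table

# Hypothetical (incoming value or None, delete-top flag) for cycles 0..13.
OPS = [
    (5, False), (7, False), (6, False), (2, False), (1, False), (9, False),
    (3, True), (8, True), (4, True),   # sorter is full: delete top, insert new value
    (None, False),                     # idle cycle: contents unchanged
    (None, True), (None, True), (None, True), (None, True),  # drain output
]


def step(contents, incoming, delete_top):
    """Return the sorter contents after one cycle."""
    next_contents = list(contents)
    if delete_top and next_contents:
        next_contents.pop(0)                    # remaining values shift left
    if incoming is not None and len(next_contents) < DEPTH:
        bisect.insort(next_contents, incoming)  # sorted insertion
    return next_contents


def replay(ops):
    """Return the contents at the start of every cycle, beginning empty."""
    history = [[]]
    for incoming, delete_top in ops:
        history.append(step(history[-1], incoming, delete_top))
    return history


for cycle, contents in enumerate(replay(OPS)):
    print(cycle, contents)
```

Under these assumptions the sorter accepts a new value in the same cycle in which the top value is deleted, so input does not stall once all six nodes are occupied.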
