The Theory Behind the z/Architecture Sort Assist Instructions


  1. The Theory Behind the z/Architecture Sort Assist Instructions. SHARE in San Jose, August 10 - 15, 2008, Session 8121. Michael Stack, NEON Enterprise Software, Inc.

  2. Outline: A Brief Overview of Sorting; Tournament Tree Selection Sort with Replacement; A Note on Binary Tree Implementation; Offset Value Codes; Tournament Tree Replacement / Selection Sort with Offset Value Codes

  3. Outline (cont'd): The Hardware Assist Instructions; What Was Omitted?; References; Appendix: Proofs of the Unequal Code Theorem and the Equal Code Theorem

  4. Consumer Warning No. 1: While the operation of CFC and UPT is not difficult to memorize, learning why they work (this session) and how to use them (next session) can be a major challenge (it was for me, anyway). These two sessions should help you get started, but don't expect to fully understand the details on first encounter.

  5. Consumer Warning No. 2: There are two theorems in this session: the Unequal Code Theorem and the Equal Code Theorem. Studying the logic of the proofs is probably the best way to learn how and why Offset Value Codes work, so the proofs are included as an appendix for later study.

  6. Consumer Warning No. 3: The sort terminology in this presentation follows both Knuth [4] and Iyer [3] (see References). This presentation is based mostly on the paper by Iyer [3], without which it could not have been prepared.

  7. A Brief Overview of Sorting

  8. Overview of Sorting: Multiple sort methods are used for sorting in a DBMS. Slow sorts are O(N^2). Fast sorts are O(N log_2 N). The fastest sorts, O(N), are distribution sorts such as radix sort (good if keys are not too long). For more about "Big O" notation, see http://www.nist.gov/dads/HTML/bigOnotation.html

  9. Overview of Sorting: Knuth [4] identifies five sort categories: Insertion (Straight, Shellsort); Exchange (Bubble, Quicksort); Selection (Straight, Tree, Heapsort); Merge (Straight, Two-Way, List); Distribution (Radix List). Our focus will be on a variation of selection sort called tournament tree based replacement/selection sort.

  10. Overview of Sorting: In what follows, it is assumed WLOG (without loss of generality) that we are sorting in ascending sequence. This means a key of lower value "wins" over a key of higher value. For descending sequence, some changes must be made. Also, we assume no duplicate keys.

  11. Tournament Tree Selection Sort with Replacement: The Theory Behind UPT

  12. Tournament / Selection Sort: Introduction. In the examples in this section, we will use 16 numbers chosen at random by Knuth on March 19, 1963:
      503 087 512 061 908 170 897 275 653 426 154 509 612 677 765 703
      Our first example will show Straight Selection. Then we will show Quadratic Selection, an easy improvement.

  13. Tournament / Selection Sort: Straight Selection
      503 087 512 061 908 170 897 275 653 426 154 509 612 677 765 703
      ---------------------------------------------------------------
      503 087 512 061 908 170 897 275 653 426 154 509 612 677 765 703
      061 087 512 503 908 170 897 275 653 426 154 509 612 677 765 703
      061 087 512 503 908 170 897 275 653 426 154 509 612 677 765 703
      061 087 154 503 908 170 897 275 653 426 512 509 612 677 765 703
      061 087 154 170 908 503 897 275 653 426 512 509 612 677 765 703
      061 087 154 170 275 503 897 908 653 426 512 509 612 677 765 703
      ...
      For each key, K_i, we scan right for smaller keys. After comparing with all other keys, we exchange with the smallest. We repeat this for each key from K_1 to K_(N-1).
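
The passes traced on this slide can be sketched in a few lines of Python. This is only an illustration of straight selection as described here; the names `straight_selection_sort`, `trace`, and `KNUTH_KEYS` are made up for the sketch and do not come from the presentation.

```python
def straight_selection_sort(keys):
    """Return a sorted copy of keys and a trace of the list after each pass."""
    a = list(keys)
    trace = [list(a)]                      # row 0: the unsorted input
    for i in range(len(a) - 1):            # K_1 .. K_(N-1)
        smallest = i
        for j in range(i + 1, len(a)):     # scan right for smaller keys
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]   # exchange with the smallest
        trace.append(list(a))
    return a, trace


# The 16 keys used throughout this section (leading zeros dropped).
KNUTH_KEYS = [503, 87, 512, 61, 908, 170, 897, 275,
              653, 426, 154, 509, 612, 677, 765, 703]

sorted_keys, trace = straight_selection_sort(KNUTH_KEYS)
for row in trace[:6]:
    print(" ".join(f"{k:03d}" for k in row))
```

The first six printed rows should line up with the six rows shown under the dashed line on the slide.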

  14. Tournament / Selection Sort: Straight Selection. With all those comparisons, is it any wonder that Straight Selection takes 2.5N^2 + 3N log_2 N units of running time (according to Knuth [4])? In fact, every algorithm for finding the maximum of N elements, based on comparing pairs of elements, must make at least N-1 comparisons. Happily, that rule applies only to the first step (that's important!).
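
As a rough cross-check on where the quadratic term comes from (this count is a standard property of straight selection, not a figure taken from the slides): pass i compares the current candidate against the N - i keys to its right, so the total number of key comparisons is

```latex
\sum_{i=1}^{N-1} (N - i) \;=\; \frac{N(N-1)}{2},
\qquad\text{so for } N = 16:\quad \frac{16 \cdot 15}{2} = 120 \text{ comparisons.}
```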

  15. Tournament / Selection Sort: Quadratic Selection. We can improve on this by "remembering" the comparisons. For example, we can first group the N keys into sqrt(N) groups of sqrt(N) elements:
      503 087 512 061 | 908 170 897 275 | 653 426 154 509 | 612 677 765 703
      Then we need only compare the "winners" from each group, picking a new winner at each pass.

  16. Tournament / Selection Sort: Quadratic Selection. After a winner is chosen, we replace its value with a very large number, in this case INF = +infinity (+∞), which can never "win". A very important point: at each level we are dealing with pointers to the records to be sorted, not the records themselves.

  17. Tournament / Selection Sort: Quadratic Selection
      Winners: 061 170 154 612
      Groups:  503 087 512 061 | 908 170 897 275 | 653 426 154 509 | 612 677 765 703
      (Here, sqrt(N) = 4, so there are 4 groups of 4 keys each.)

  18. Tournament / Selection Sort: Quadratic Selection
      Winners: 061 170 154 612
      Groups:  503 087 512 061 | 908 170 897 275 | 653 426 154 509 | 612 677 765 703
      Winners: 087 170 154 612
      Groups:  503 087 512 INF | 908 170 897 275 | 653 426 154 509 | 612 677 765 703

  19. Tournament / Selection Sort: Quadratic Selection
      Winners: 061 170 154 612
      Groups:  503 087 512 061 | 908 170 897 275 | 653 426 154 509 | 612 677 765 703
      Winners: 087 170 154 612
      Groups:  503 087 512 INF | 908 170 897 275 | 653 426 154 509 | 612 677 765 703
      Winners: 503 170 154 612
      Groups:  503 INF 512 INF | 908 170 897 275 | 653 426 154 509 | 612 677 765 703

  20. Tournament / Selection Sort: Quadratic Selection
      Winners: 061 170 154 612
      Groups:  503 087 512 061 | 908 170 897 275 | 653 426 154 509 | 612 677 765 703
      Winners: 087 170 154 612
      Groups:  503 087 512 INF | 908 170 897 275 | 653 426 154 509 | 612 677 765 703
      Winners: 503 170 154 612
      Groups:  503 INF 512 INF | 908 170 897 275 | 653 426 154 509 | 612 677 765 703
      Winners: 503 170 426 612
      Groups:  503 INF 512 INF | 908 170 897 275 | 653 426 INF 509 | 612 677 765 703

  21. Tournament / Selection Sort: Quadratic Selection. The advantage of quadratic selection is that only the group from which the previous winner was taken needs to be re-checked. This can be extended to cubic and quartic selection. The ultimate is "tree selection".
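
A minimal sketch of quadratic selection, assuming groups of about sqrt(N) keys and Python's floating-point infinity standing in for the "very large number" INF. The function and variable names are hypothetical, and the sketch copies keys into groups for brevity where a real sort would hold pointers to records.

```python
import math

INF = float("inf")   # assumption: stands in for the slides' INF, can never "win"


def quadratic_selection_sort(keys):
    """Sort by repeatedly comparing only the current winner of each group."""
    n = len(keys)
    size = max(1, math.isqrt(n))                      # about sqrt(N) keys per group
    groups = [list(keys[i:i + size]) for i in range(0, n, size)]

    def group_winner(group):
        # Index (a "pointer") of the smallest key in one group.
        return min(range(len(group)), key=group.__getitem__)

    winners = [group_winner(g) for g in groups]
    out = []
    for _ in range(n):
        # Compare only the group winners to find the overall winner.
        best = min(range(len(groups)), key=lambda g: groups[g][winners[g]])
        out.append(groups[best][winners[best]])
        groups[best][winners[best]] = INF             # replace the winner by INF
        winners[best] = group_winner(groups[best])    # re-check only that group
    return out


KNUTH_KEYS = [503, 87, 512, 61, 908, 170, 897, 275,
              653, 426, 154, 509, 612, 677, 765, 703]
result = quadratic_selection_sort(KNUTH_KEYS)
print(result[:3])
```

The first three keys produced are the three winners removed in the preceding slides (061, 087, 154).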

  22. Tournament / Selection Sort: Tree Selection - "Winner Tree". Here is an example of tree selection showing a "winner" tree and path. Only the leaf nodes have keys; the internal (upper) nodes are just pointers. The dashed line separates leaf nodes from internal nodes.
                                    061
                    061                             154
            061             170             154             612
        087     061     170     275     426     154     612     703
      ---------------------------------------------------------------
      503 087 512 061 908 170 897 275 653 426 154 509 612 677 765 703

  23. Tournament / Selection Sort: Tree Selection - "Winner Tree". How the winner tree is created: Each pair of keys on the bottom is compared. A pointer to the winner (lower key) is placed in the row just above them. This is repeated at each level until the winning key emerges as the root.
                                    061
                    061                             154
            061             170             154             612
        087     061     170     275     426     154     612     703
      ---------------------------------------------------------------
      503 087 512 061 908 170 897 275 653 426 154 509 612 677 765 703
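
The construction just described can be sketched as follows. The array layout (root at index 1, children of node k at 2k and 2k+1, leaves in the upper half) is an assumption made for this illustration, not necessarily the layout the presentation discusses under binary tree implementation; `build_winner_tree` and `levels` are hypothetical names.

```python
def build_winner_tree(keys):
    """Build a winner tree over keys; internal nodes hold indexes into keys."""
    n = len(keys)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two number of keys")
    # tree[n + i] is the leaf for keys[i]; tree[0] is unused.
    tree = [None] * n + list(range(n))
    for node in range(n - 1, 0, -1):          # bottom row first, root last
        left, right = tree[2 * node], tree[2 * node + 1]
        # The lower key wins; store a pointer to it, not the key itself.
        tree[node] = left if keys[left] < keys[right] else right
    return tree


def levels(tree, keys):
    """Keys pointed at by the internal nodes, one list per level, root first."""
    rows, start = [], 1
    while start < len(keys):
        rows.append([keys[tree[node]] for node in range(start, 2 * start)])
        start *= 2
    return rows


KNUTH_KEYS = [503, 87, 512, 61, 908, 170, 897, 275,
              653, 426, 154, 509, 612, 677, 765, 703]
rows = levels(build_winner_tree(KNUTH_KEYS), KNUTH_KEYS)
for row in rows:
    print(" ".join(f"{k:03d}" for k in row))
```

Printed root first, the four rows correspond to the four internal levels drawn above the dashed line.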

  24. Tournament / Selection Sort: Tree Selection - "Winner Tree". When the "winner" is removed, it is replaced by a "large key" (INF here). At each level, only one comparison is needed to select a new winner, so log_2 N comparisons for each key, and N log_2 N comparisons, all told.
                                    087
                    087                             154
            087             170             154             612
        087     512     170     275     426     154     612     703
      ---------------------------------------------------------------
      503 087 512 INF 908 170 897 275 653 426 154 509 612 677 765 703
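
A sketch of the whole tree selection loop, under the same assumed array layout (root at index 1, leaves in the upper half) and with floating-point infinity standing in for INF. It also counts the comparisons made while replaying the path from the replaced leaf to the root, to illustrate the log_2 N figure; all names are hypothetical.

```python
INF = float("inf")   # assumption: the "large key" that can never win


def tree_selection_sort(keys):
    """Return (sorted keys, comparisons made while replaying after each removal)."""
    keys = list(keys)
    n = len(keys)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two number of keys")
    tree = [0] * n + list(range(n))           # leaves point at their own key

    def play(node):
        # One comparison: the lower key's pointer moves up to this node.
        left, right = tree[2 * node], tree[2 * node + 1]
        tree[node] = left if keys[left] <= keys[right] else right

    for node in range(n - 1, 0, -1):          # initial build: N-1 comparisons
        play(node)

    out, replay_comparisons = [], 0
    for _ in range(n):
        winner = tree[1]                      # pointer held at the root
        out.append(keys[winner])
        keys[winner] = INF                    # replace the winner by INF
        node = (n + winner) // 2              # parent of the winner's leaf
        while node >= 1:                      # one comparison per level
            play(node)
            replay_comparisons += 1
            node //= 2
    return out, replay_comparisons


KNUTH_KEYS = [503, 87, 512, 61, 908, 170, 897, 275,
              653, 426, 154, 509, 612, 677, 765, 703]
ordered, replays = tree_selection_sort(KNUTH_KEYS)
print(ordered[:2], replays)
```

For the 16 keys here, each removal replays 4 = log_2 16 nodes, so the replay count comes to 16 x 4 = 64; the N-1 comparisons of the initial build are counted separately.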
