engineering a sort function
play

Engineering a Sort Function Jim Royer CIS 351 February 4, 2019 - PowerPoint PPT Presentation

Engineering a Sort Function Jim Royer CIS 351 February 4, 2019 Royer (CIS 351) Engineering a Sort Function February 4, 2019 1 / 16 Bentley and MacIlroy, 1993 Engineering a Sort Function JON L. BENTLEY M. DOUGLAS McILROY AT&T Bell


  1. Engineering a Sort Function Jim Royer CIS 351 February 4, 2019 Royer (CIS 351) Engineering a Sort Function February 4, 2019 1 / 16

  2. Bentley and MacIlroy, 1993 Engineering a Sort Function JON L. BENTLEY M. DOUGLAS McILROY AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, U.S.A. SUMMARY We recount the history of a new qsort function for a C library. Our function is clearer, faster and more robust than existing sorts. It chooses partitioning elements by a new sampling scheme; it partitions by a novel solution to Dijkstra’s Dutch National Flag problem; and it swaps efficiently. Its behavior was assessed with timing and debugging testbeds, and with a program to certify performance. The design techniques apply in domains beyond sorting. From: Software Practice and Experience , Vol. 23 (1993) 1249–1265. http://www.skidmore.edu/~meckmann/2009Spring/cs206/papers/spe862jb.pdf Royer (CIS 351) Engineering a Sort Function February 4, 2019 2 / 16

  3. Why rewrite Unix’s quicksort? In ancient days of yore ( ≈ 1991): The old quicksort ( qsort ) had be in use for ≈ 20 years and was stable and usually fast. A colleague found that qsort ran in Θ ( n 2 ) time inputs with certain structures, e.g., on pipe-organ arrays of 2 n integers: 1,2,3,4,...,n,n,...,4,3,2,1 . They found that all the then competitors of qsort could also be driven to Θ ( n 2 ) on certain reasonable inputs. “Users complain when easy inputs don’t sort quickly.” So it was time for a new systems-level quicksort. Royer (CIS 351) Engineering a Sort Function February 4, 2019 3 / 16

  4. Bentley and MacIlroy’s version of CLRS’s Quicksort void iqsort0(int *a, int n) { int i, j; if (n <= 1) return; for (i = 1, j = 0; i < n; i++) if (a[i] < a[0]) swap(++j, i, a); swap(0, j, a); iqsort0(a, j); iqsort0(a+j+1, n-j-1); } Program 2. A toy Quicksort, unfit for general use more efficient (and more familiar) partitioning method uses On nearly sorted arrays, the above makes ≈ n 2 2 many comparisons! Royer (CIS 351) Engineering a Sort Function February 4, 2019 4 / 16

  5. The qsort interface (for the moment) void qsort(char *a, int n, int es, int (*cmp)()); Parameters = the array’s starting location *a n = the number of elements = the size (in bytes) of each element es cmp = the comparison function For “ char* ” think “byte addresses.” Royer (CIS 351) Engineering a Sort Function February 4, 2019 5 / 16

  6. Insertion sort using the qsort interface isort void isort(char *a, int n, int es, int (*cmp)()) { char *pi, *pj; for (pi = a + es; pi < a + n*es; pi += es) for (pj = pi; pj > a && cmp(pj-es, pj) > 0; pj -= es) swap(pj, pj-es, es); } function swap(i,j,n) , defined in Program 1, interchanges n -byte fields pointed to A simple, straightforward, and troublesome swap void swap(char *i, char *j, int n) { do { char c = *i; *i++ = *j; *j++ = c; } while (--n > 0); } Royer (CIS 351) Engineering a Sort Function February 4, 2019 6 / 16

  7. Bentley and MacIlroy’s starting quicksort void qsort1(char *a, int n, int es, int (*cmp)()) { int j; char *pi, *pj, *pn; if (n <= 1) return; pi = a + (rand() % n) * es; swap(a, pi, es); pi = a; pj = pn = a + n * es; for (;;) { do pi += es; while (pi < pn && cmp(pi, a) < 0); do pj -= es; while (cmp(pj, a) > 0); if (pj < pi) break; swap(pi, pj, es); } swap(a, pj, es); j = (pj - a) / es; qsort1(a, j, es, cmp); qsort1(a + (j+1)*es, n-j-1, es, cmp); } Program 4. A simple qsort the cost of about forty common sorting operations. Table I shows the cost of Royer (CIS 351) Engineering a Sort Function February 4, 2019 7 / 16

  8. qsort1 ’s invariants As the partition process is running T ≤ T ? ≥ T 0 i j n-1 around the element a[0] , which we abbreviate as T . Increment When the partitioning is done ≤ T T ≥ T 0 j i n-1 recursively on the subarrays a[0..j-1] and a[j+1..n-1] Royer (CIS 351) Engineering a Sort Function February 4, 2019 8 / 16

  9. Cost of basic operations (on a VAX 8550) Table I On a modern CPU the actual CPU Microseconds times will be vastly smaller. Int Operations 0.20 i1 = i2 + i3 0.20 My guess that the proportions i1 = i2 - i3 are roughly the same. Pointer Operations 0.17 p1 -= es 0.16 Note that under “Swap p1 += es Control Structures Functions” both lines are about 0.32 if (p1 == p2) i1++ swapping 4-byte int s. 0.26 while (p1 < p2) i1++ Comparison Functions MIX (standard cost models): 2.37 i1 = intcomp(&i2, &i3) 3.67 overhead ≈ comparisons < swaps i1 = floatcomp(&f2, &f3) 3.90 i1 = dblcomp(&d2, &d3) 8.74 i1 = strcmp(s2, s3) qsort (what is going on here): Swap Functions overhead < swaps ∗ < comparisons 11.50 swap(p1, p2, 4) 0.84 t = *i1, *i1 = *i2, *i2 = t ∗ done right inline swaps for integer-sized objects and a function Royer (CIS 351) Engineering a Sort Function February 4, 2019 9 / 16

  10. Getting rid of the randomness — Median of three (Systems folk do not care for randomized algorithms.) a:b C n = expected number of comparisons for a size- n input < > when pivoting around a b:c b:c random elm: C n ≈ 1.386 n log 2 n < > < > when pivoting around the abc cba a:c a:c median of 3 random elms: C n ≈ 1.188 n log 2 n < > < > Program 5 makes 8/3 acb cab bac bca comparisons (on average). static char *med3(char *a, char *b, char *c, int (*cmp)()) { return cmp(a, b) < 0 ? (cmp(b, c) < 0 ? b : cmp(a, c) < 0 ? c : a) : (cmp(b, c) > 0 ? b : cmp(a, c) > 0 ? c : a); } Program 5. Decision tree and program for median of three Royer (CIS 351) Engineering a Sort Function February 4, 2019 10 / 16

  11. Getting rid of the randomness — Ninther ninther = a median of three medians, at most 12 comparisons pm = a + (n/2)*es; /* Small arrays, middle element */ if (n > 7) { pl = a; pn = a + (n-1)*es; if (n > 40) { /* Big arrays, pseudomedian of 9 */ s = (n/8)*es; pl = med3(pl, pl+s, pl+2*s, cmp); pm = med3(pm-s, pm, pm+s, cmp); pn = med3(pn-2*s, pn-s, pn, cmp); } pm = med3(pl, pm, pn, cmp); /* Mid-size, med of 3 */ } Results of experiments on a modified program 2: Behaved well on non-random inputs On random input arrays C n ≈ 1.362 n log 2 n − 1.41 n Royer (CIS 351) Engineering a Sort Function February 4, 2019 11 / 16

  12. Take advantage of repeated elements void iqsort2(int *x, int n) During partitioning { int a, b, c, d, l, h, s, v; = < ? > = if (n <= 1) return; v = x[rand() % n]; a b c d a = b = 0; c = d = n-1; After partitioning for (;;) { while (b <= c && x[b] <= v) { if (x[b] == v) iswap(a++, b, x); = < > = b++; } a c b d while (c >= b && x[c] >= v) { After copying if (x[c] == v) iswap(d--, c, x); c--; < > = } if (b > c) break; iswap(b++, c--, x); } s = min(a, b-a); for(l = 0, h = b-s; s; s--) iswap(l++, h++, x); s = min(d-c, n-1-d); for(l = b, h = n-s; s; s--) iswap(l++, h++, x); iqsort2(x, b-a); iqsort2(x + n-(d-c), d-c); } Program 6. An integer qsort with split-end partitioning Quicksort with split-end partitioning (Program 7) is about twice as fast as the Royer (CIS 351) Engineering a Sort Function February 4, 2019 12 / 16

  13. Other Tricks Used A more efficient swap macro. On small arrays ( n < 7), use insertion sort in place of quicksort. Avoid recursive calls when n = 1. Added a vecswap function to do the copying of the equal elements to the middle. Tuned the constants via testing and measurement. Not Used Replaced the recursions with stacks. A fancier insertion sort. Sentinels on the array ends. Fancier sampling schemes. Special tuning for different machines/compilers. Royer (CIS 351) Engineering a Sort Function February 4, 2019 13 / 16

  14. Final code, part 1 void qsort(char *a, size_t n, size_t es, int (*cmp)()) { char *pa, *pb, *pc, *pd, *pl, *pm, *pn, *pv; int r, swaptype; WORD t, v; size_t s; SWAPINIT(a, es); if (n < 7) { /* Insertion sort on smallest arrays */ for (pm = a + es; pm < a + n*es; pm += es) for (pl = pm; pl > a && cmp(pl-es, pl) > 0; pl -= es) swap(pl, pl-es); return; } pm = a + (n/2)*es; /* Small arrays, middle element */ if (n > 7) { pl = a; pn = a + (n-1)*es; if (n > 40) { /* Big arrays, pseudomedian of 9 */ s = (n/8)*es; pl = med3(pl, pl+s, pl+2*s, cmp); pm = med3(pm-s, pm, pm+s, cmp); pn = med3(pn-2*s, pn-s, pn, cmp); } pm = med3(pl, pm, pn, cmp); /* Mid-size, med of 3 */ } PVINIT(pv, pm); /* pv points to partition value */ pa = pb = a; Royer (CIS 351) Engineering a Sort Function February 4, 2019 14 / 16

  15. Final code, part 2 } PVINIT(pv, pm); /* pv points to partition value */ pa = pb = a; pc = pd = a + (n-1)*es; for (;;) { while (pb <= pc && (r = cmp(pb, pv)) <= 0) { if (r == 0) { swap(pa, pb); pa += es; } pb += es; } while (pc >= pb && (r = cmp(pc, pv)) >= 0) { if (r == 0) { swap(pc, pd); pd -= es; } pc -= es; } if (pb > pc) break; swap(pb, pc); pb += es; pc -= es; } pn = a + n*es; s = min(pa-a, pb-pa ); vecswap(a, pb-s, s); s = min(pd-pc, pn-pd-es); vecswap(pb, pn-s, s); if ((s = pb-pa) > es) qsort(a, s/es, es, cmp); if ((s = pd-pc) > es) qsort(pn-s, s/es, es, cmp); } Royer (CIS 351) Engineering a Sort Function February 4, 2019 15 / 16

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend