Statistics 540: Sorting

Announcement

Easy scheme: shuffle and sort (do with cards)
- Complexity: exponential random time infinite worst case time
Bubble sort
- n² worst case time
- n² average case time
Selection sort
- n² worst case time
- n² average case time
Quick sort (an ACM algorithm of the century)
- best case: n log(n)
- worst cast: n²
- Which is more accurate?
- average: n log(n) also! (Provable ONLY if you actually randomize)
Merge sort
- best case: n log(n)
- worst cast: n log(n)
- But SLOWER than quick sort? Now we are into empirical checking
Provable lower bound
- Each "if" statement can divide the class of all permutations into two bins
- There are n! bins to start with
- Thus it takes log(n!) yes-no questions to identify what the starting position is.
- So at least n log(n) if-then-else statements, plus other statements
Hash sort with merge sort in each bin
- best expected case: O(n)
- Yikes! better than the lower bound?
- Notice a "hash" into n cells is the same as log(n) yes-no decisions
- worst case n log(n)
Which to use? (see perl code)

Last modified: Tue Oct 30 08:26:14 2001