Lecture 7 - QuickSort

4. QuickSort (Section 5.4.5)

The idea behind QuickSort is to replace the MERGE operation in step 3 with the very much faster JOIN operation. If this could be done, while still splitting the list more or less in half in step (1), we would have an algorithm that is clearly faster than Merge Sort. How can we change the way we do step (1) - SPLIT L into pieces - so that we can use JOIN instead of MERGE? This is the secret to QuickSort.

Recall: when is it safe to JOIN two sorted lists L1 and L2?

Answer: When everything in L1 is smaller than everything in L2.

So that's how we cut our list in two. Pick a number, CUTOFF, and put all the elements less than CUTOFF into L1 and all the ones bigger than CUTOFF into L2.

How shall we choose the CUTOFF value? Does it matter?

Answer: We must keep 3 things in mind when we choose CUTOFF:

its value must be fairly cheap to compute.

It must not be bigger than, or smaller than, all the values in the given list. Why? If it were, then one of the pieces would be identical to L and we'd be in an infinite loop.

Ideally, roughly half the values in L will be smaller than CUTOFF. Cutting-in-half will make QuickSort very efficient.

Considering (2) and (3), the very best choice for CUTOFF is the median value in the given list. By definition the median of a list is the value in the list which is larger than exactly half the values in the list.

Let's illustrate quicksort with the list (above) [56,29,35,42,15,41,75,21], assuming that the median is chosen for splitting. At this point I'll also add one little wrinkle - QuickSort actually splits the list into three pieces:

L1 = {elements with keys strictly less than CUTOFF}
LC = {elements with keys equal to CUTOFF}
L2 = {elements with keys strictly greater than CUTOFF}

L1 and L2 are sorted recursively. Then we join L2 to LC and then join the result to L1.

L = [56,29,35,42,15,41,75,21]
LC = [35]
L1 = [15,21,29] which happens to be sorted by a fluke.
L2 = [56,42,41,75]

Recursively sort L1:

L = [15,21,29]
LC = [21]
L1 = [15]. This is a singleton, so next recursive call will return it unchanged.
L2 = [29] (ditto).
JOIN: L1' LC L2' = [15,21,29]

We obtain L1' = [15,21,29].

Recursively sort L2.

L = [56,42,41,75]
LC = [42]
L1 = [41]. This is a singleton, so next recursive call will return it unchanged.
L2 = [56,75]. Recursively sort L2.
- L = [56,75]
- LC = [56]
- L1 = [], so next recursive call will return it unchanged.
- L2 = [75], singleton, recursive call won't change it.
- JOIN: L1' LC L2' = [56,75]
L2' = [56,75]
JOIN: L1' LC L2' = [41,42,56,75]

We obtain L2' = [41,42,56,75]

JOIN L1' LC L2' = [15,21,29, 35 ,41,42,56,75]

Quicksort is very efficient if you use the median as the CUTOFF value, because the list is always split into two equal size pieces. But now we must consider, how efficiently can we compute the median of a list? There is no obvious way to do this quickly - in fact, all the obvious methods require you to sort the list! This obviously is no good - we are trying to use the median to do the sorting, so we can't sort the list in order to compute the median! There are some complex algorithms that do better than this, but in practice people do not use the median, they use something else that is easier to compute:

The first element in the list.
The mean (or average).
The ``median of 3''. The way this is computed is to extract three elements from the list and use the middle of these 3 values as the CUTOFF.

What can go wrong? It can happen with these techniques that one of the pieces is empty - the CUTOFF value is removed from the list, and the other piece contains all the other values. For example, suppose you were unlucky and always used the smallest element in the list as the CUTOFF. L1 has all the elements smaller than this - there are none! In this case, called the worst case, QuickSort is about the same efficiency as Insertion Sort.

QuickSort is usually used with arrays. In this case, there is a special version of QuickSort that sorts the array in place i.e. without using any extra memory (normally you would create L1 and L2 by copying the values out of L into them, so you'd need enough space for two whole copies of L). This is done by a complex method of shuffling around the values within the array: the basic idea is not too difficult to understand, but the code itself is very tricky - if you're interested see pages 178-179 in the text.