Jump to content

Selection algorithm

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Dcoetzee (talk | contribs) at 22:26, 25 March 2004 (Initial page with intro, sorting-and-extract approach, min/max algorithms, and brief mention of O(n) worst-case selection algorithm, to expand upon later). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)

In computer science, a selection algorithm is a mechanical method of finding the kth smallest number in a list. This includes the simple common cases of finding the minimum, maximum, and median elements.

One simple and widely used selection algorithm is to use a sort algorithm on the list, and then extract the kth element. This is particularly useful when we wish to make many different selections from a single list, in which case only one initial, expensive sort is needed, followed by many cheap extraction operations. When we only need to perform one selection, however, or when we need to strongly mutate the list between selections, this method can be costly, typically requiring at least O(n log n) time, where n is the length of the list.

Worst-case linear algorithms to find minimums or maximums are obvious; we keep two variables, one referring to the index of the minimum/maximum element seen so far, and one holding its value. As we scan through the list, we update these whenever we encounter a more extreme element:

minimum(a)
    minIndex = 1
    minValue = a[1]
    for i = 2 to length(a)
        if a[i] < minValue
            minIndex = i
            minValue = a[i]

maximum(a)
    maxIndex = 1
    maxValue = a[1]
    for i = 2 to length(a)
        if a[i] > maxValue
            maxIndex = i
            maxValue = a[i]

Note that there may be multiple minimum or maximum elements. Because the comparisons above are strict, these algorithms finds the minimum element with minimum index. By using non-strict comparisons (<= and >=) instead, we would find the minimum element with maximum index; this approach also has some small performance benefits.

Finding a worst-case linear algorithm for the general case of selecting the kth largest element is a much more difficult problem, but one does exist, and was published by Blum, Floyd, Pratt, Rivest, and Tarjan in their paper Time bounds for selection. It uses concepts based on those used in the Quicksort sort algorithm, along with an innovation of its own.

Describe algorithm here later