Close
Register
Close Window

BM40A1500 Data Structures and Algorithms

Chapter 4 Recursion and Binary Trees

Show Source |    | About   «  4.11. Heaps and Priority Queues   ::   Contents   ::   5.1. Algorithm Design Principles  »

4.12. Heapsort

4.12.1. Heapsort

A good sorting algorithm can be devised based on a tree structure more suited to the purpose. In particular, we would like the tree to be balanced, space efficient, and fast. The algorithm should take advantage of the fact that sorting is a special-purpose application in that all of the values to be stored are available at the start. This means that we do not necessarily need to insert one value at a time into the tree structure.

Heapsort is based on the heap data structure. Heapsort has all of the advantages just listed. The complete binary tree is balanced, its array representation is space efficient, and we can load all values into the tree at once, taking advantage of the efficient buildheap function. The asymptotic performance of Heapsort when all of the records have unique key values is \(\Theta(n \log n)\) in the best, average, and worst cases. .. It is not as fast as Quicksort in the average case (by a constant .. factor), but Heapsort has special properties that will make it .. particularly useful for .. external sorting algorithms, .. used when sorting data sets too large to fit in main memory.

Settings

Proficient Saving... Error Saving
Server Error
Resubmit

A complete implementation is as follows.

def heapsort(A):
    # The heap constructor invokes the buildheap method
    H = MaxHeap(A)
    # Now sort
    for i in range(len(A)):
        H.removemax(); # Removemax places max at end of heap

Later in the course, you will learn sorting algorithms that are a bit faster than Heapsort in the average case. These include, for example, Quicksort that is faster than Heapsort by a constant factor (technically both are still Theta(n log n)). However, Heapsort has one special advantage over the other commonly used sorting algorithms. Building the heap is relatively cheap, requiring Theta(n) time. Removing the maximum-valued record from the heap requires Theta(log n) time in the worst case. Thus, if we wish to find the k records with the largest key values in an array, we can do so in time Theta(n + k log n). If k is small, this is a substantial improvement over the time required to find the k largest-valued records using one of the other sorting methods described earlier (many of which would require sorting all of the array first).

Another special case arises when all of the records being sorted have the same key value. This represents the best case for Heapsort. This is because removing the smallest value requires only constant time, since the value swapped to the top is never pushed down the heap.

   «  4.11. Heaps and Priority Queues   ::   Contents   ::   5.1. Algorithm Design Principles  »

nsf
Close Window