You are visitor number ????? since 29 February 2004

Go to Howard Kaikow's home page

Copyright © 2003-2005 by Howard Kaikow. All rights reserved.

Date: 15 July 2005(Revision of 3 September 2004 version of this document)

Author: Howard Kaikow <kaikow@standards.com>

Subject: Sort Performance Comparison program - Information


Purpose of the program

This document provides information about my Sort Performance Comparison Program.

I will use this document to communicate certain information about the program

In addition, you should read the general information document for the program.

Technical details

Screen resolution

Padding

Timer resolution

Timing

Sorting algorithms

Bubble sort

Counting sort

Heap sort

Insertion sort

Merge sort

Quick sort

Selection sort

Shell sort


Technical details

Screen resolution

The VB 6 Forms used by the program were originally designed for version 1.0.0 using a screen resolution of 800 x 600, and have been viewed using resolutions of 1024 x 768 and 1280 x 1024. I am no longer using a resolution of 800 x 600 for this program.

Padding

I had to make some decisions about the type of data to be sorted. I chose to use only the Double, Long (32-bit signed integer) and String data types.

Prior to version 1.2.0, this program used only all numeric data for the String data type. Version 1.2.0 includes an internal list of words from which non-numeric data will be generated. Such data is not padded, or aligned, and is passed to the sort routines as is.

The program still supports using numeric string data which raises the following issues.

I decided to pad, and align, the String data to avoid those problems. Making the strings longer slows down the string sorts, but does not invalidate the comparison among the string sorting methods.

When using numeric string data, certain sorting algorithms are particularly affected.

A VB 6 ListBox has a Sort property, but the sort is only alphanumeric. It is necessary to pad/align numeric data before sending the data to the VB 6 ListBox. The data is padded/aligned when created by the program, and the time to do so is not counted in the time to do the sort. However, Double and Long data need to be padded/aligned during the sort. This imposes a penalty when using the Double or Long data type with the VB 6 ListBox sort method.

The counting sort algorithm will work only with numeric integer data. In order to offer a String option for the counting sort algorithm, I had to create a variation of the algorithm that treats strings as numbers. For all but one of the counting sort algorithms, this requires padding during the sorting process.

Timer resolution

The program uses the QueryPerformanceFrequency API to detect whether a high resolution timer is available. The high resolution timer is used via the QueryPerformanceCounter API. Otherwise, the GetTickCount API is used for timing.

When you start the program, before you choose any options, the ListBox near the top of the Form will indicate which timer is being used.

Timing

There are three components that need to be timed:

  1. Retrieval of the data from the data source

    The program generates the data and then writes the unsorted data to the data source. The program then determines the time needed to retrieve the data from the data source and to save the data in a one-dimensional array of the requested data type. This time is considered to be 0 if the source is an Array.

  2. Performing the sort

    The program sorts a copy of the original unsorted data so the original unsorted data may be reused. Depending on the sample size, the order of the source data, the data type, and the sorting method, the time to do the sort may be from less than one millisecond to, nearly, forever.

  3. Copying the sorted data to the target

    The time needed to copy the data to the target depends on the sort method used. For example, if the target is a worksheet, this timing component is considered to be 0 if the Excel Worksheet Sort is used. Otherwise the sorted data needs to be moved from the Excel worksheet to the selected source and the timing component will not be 0.

Sorting algorithms

There is much published literature addressing sorting algorithms. I am not claiming that any of the implementations of any algorithms in the program are the best for any particular algorithm. In some cases, I intentionally included some inefficiencies. However, even if they are inefficient, they still demonstrate how easy it is to do better than the following:

The source of the description of various sorting algorithms is usually a journal article or a book. When I know, I will cite the source for a particular implementation.

For each algorithm implementation, there are separate variations for each of the following, where xxx identifies the algorithm:

In version 1.2.0, I added variations of xxxS and pxxxS to accommodate non-numeric strings.

The sorting code is divided into a number of modules. The compiled code includes:.

modSortDR

On 29 December 2003, I found David Ring's article "A comparison of sorting algorithms". David has given me permission to use his code so that it can be compared with other implementations of the algorithms of interest. David's code provides the xxxL, pxxxD and pxxxS variations. I created the xxxD and xxxS variations using the same algorithm used by David's xxxL variation. David has sent me an update to the code in his article.

modSortHK

Code that I wrote.

I have removed code from modSortHK that used algorithms included in the other modules. There may be duplication among the other modules.

modSortCRS

The algorithms, and modifications of those algorithms, in this module are based on the algorithm descriptions in "Algorithms in C, Third Edition," by Robert Sedgewick, Addison-Wesley, 1998.

In some code, I use the mechanism for determining the pivot in the article by Jon L. Bentley and M. Douglas McIlroy, Engineering a Sort Function, Software-Practice and Experience, vol. 23(11), 1249-1265 (November 1993).

modSortRS

The algorithms, and modifications of those algorithms, in this module are based on the code presented with the algorithm descriptions in "Ready-to-Run Visual Basic Algorithms", Second Edition, by Rod Stephens, John Wiley, 1998.

The book explicitly states that the code can be used, but the code is only available on the CD-ROM with the book. The code is not available at either the publisher's or author's web site.

The code in the book is used for the xxxL variation. I created the xxxD,xxxS, pxxxD and pxxxS variations using the same algorithm used by the xxxL variation.

modSortKB

The algorithms in this module are based on the code in the Microsoft Knowledge Base article Q169617, dated 29 April 2001. I modified the code as follows:

  • Removed code for testing timing.

  • Renamed procedures

  • The code in the article uses the Variant data type. The code in this module uses a separate procedure for each data type, and adds the pointer variations for each algorithm.

Bubble sort

Implementation

Comments

KB

Based on KB article Q169617.

S6.4

Based on Program 6.4 in Robert Sedgewick's book.

S6.4#

Based on a modification, by Howard Kaikow, of Program 6.4 in Robert Sedgewick's book.

RS

xxxL is in Rod Stephens book. xxxD, xxxS, pxxxD and pxxxS were created by Howard Kaikow based on xxxL.

Counting sort

Pointer variations are not provided for the counting sort algorithm.

Implementation

Comments

HK

Concocted by Howard Kaikow.

S6.17

Based on Program 6.17 in Robert Sedgewick's book.

S6.17#

Based on a modification, by Howard Kaikow, of Program 6.17 in Robert Sedgewick's book.

RS

xxxL is in Rod Stephens book. xxxD and xxxS, pxxxD were created by Howard Kaikow based on xxxL.

Heap sort

Implementation

Comments

RS#

Created by Howard Kaikow by modifying RS in a minor way.

S9.7

Based on Program 9.7 in Robert Sedgewick's book.

DR

xxxL, pxxxD and pxxxS were created by David Ring. xxxD and xxxS were created by Howard Kaikow based on xxxL.

RS

xxxL is in Rod Stephens book. xxxD, xxxS, pxxxD and pxxxS were created by Howard Kaikow based on xxxL.

Insertion sort

Implementation

Comments

HK

Concocted by Howard Kaikow.

S6.3

Based on Program 6.3 in Robert Sedgewick's book.

DR

xxxL, pxxxD and pxxxS were created by David Ring. xxxD and xxxS were created by Howard Kaikow based on xxxL.

RS

xxxL is in Rod Stephens book. xxxD, xxxS, pxxxD and pxxxS were created by Howard Kaikow based on xxxL.

Merge sort

Implementation

Comments

RS#

Created by Howard Kaikow by modifying RS to use Insertion sort, rather than Selection sort, to sort small segments.

S8.3

Based on Program 8.3 in Robert Sedgewick's book.

S8.4

Based on Program 8.4 in Robert Sedgewick's book.

S8.4#

Based on a modification, by Howard Kaikow, of Program 8.4 in Robert Sedgewick's book.

S8.5

Based on Program 8.5 in Robert Sedgewick's book.

DR

xxxL, pxxxD and pxxxS were created by David Ring. xxxD and xxxS were created by Howard Kaikow based on xxxL.

RS

xxxL is in Rod Stephens book. xxxD, xxxS, pxxxD and pxxxS were created by Howard Kaikow based on xxxL.

Quick sort

Implementation

Comments

HK

Concocted by Howard Kaikow.

HK#

Concocted by Howard Kaikow by modifying HK.

HK## Concocted by Howard Kaikow by modifying HK#.
S7.1

Based on Program 7.1 in Robert Sedgewick's book.

S7.1=

Based on Program 7.1 in Robert Sedgewick's book. Some code was moved inline, no algorithmic change.

S7.1#

Based on a modification, by Howard Kaikow, of Program 7.1 in Robert Sedgewick's book. Algorithmic change.

S7.1#=

Based on S7.1#. Some code was moved inline, no algorithmic change from S7.1#.

S7.4

Based on Program 7.4 in Robert Sedgewick's book.

S7.4=

Based on Program 7.4 in Robert Sedgewick's book. Some code was moved inline, no algorithmic change.

S7.4#

Based on a modification, by Howard Kaikow, of Program 7.4 in Robert Sedgewick's book. Algorithmic change.

S7.4#=

Based on S7.4#. Some code was moved inline, no algorithmic change.

S7.5

Based on Program 7.5 in Robert Sedgewick's book.

S7.5#

Based on a modification, by Howard Kaikow, of S7.5. Algorithmic change.

S7.5#M

Based on a modification, by Howard Kaikow, of S7.5#. Algorithmic change.

S7.5#M9

Based on a modification, by Howard Kaikow, of S7.5#. Algorithmic change.

S7.5#M9=

Based on a modification, by Howard Kaikow, of S7.5#M9. Some code was moved inline, no algorithmic change.

RS

xxxL is in Rod Stephens book. xxxD, xxxS, pxxxD and pxxxS were created by Howard Kaikow based on xxxL.

Selection sort

Implementation

Comments

S6.2

Based on Program 6.2 in Robert Sedgewick's book.

S6.2#

Based on a modification, by Howard Kaikow, of Program 6.2 in Robert Sedgewick's book.

DR

xxxL, pxxxD and pxxxS were created by David Ring. xxxD and xxxS were created by Howard Kaikow based on xxxL.

RS

xxxL is in Rod Stephens book. xxxD, xxxS, pxxxD and pxxxS were created by Howard Kaikow based on xxxL.

Shell sort

Implementation

Comments

KB

Based on KB article Q169617.

S6.5

Based on Program 6.5 in Robert Sedgewick's book.

HK

Concocted by Howard Kaikow.

DR

xxxL, pxxxD and pxxxS were created by David Ring. xxxD and xxxS were created by Howard Kaikow based on xxxL.