You are visitor number
since 29 February 2004
Go to Howard Kaikow's home page
Copyright © 2003-2007 by Howard Kaikow. All rights reserved.
Date: 16 July 2007
Author: Howard Kaikow <kaikow@standards.com>
Subject: Sort Performance Comparison program - General description
The program is intended to facilitate the comparison of several well-known sorting methods so one can choose among the methods and, for Microsoft Office VBA programmers, choose an alternative to Microsoft Word's WordBasic.SortArray, Microsoft Word's document sort, and Microsoft Excel's worksheet sort.
Please note that by making this program available, I am not indicating any intent to recommend algorithms or to discuss the details of any algorithms.
Terminating the program
Selecting sort methods
Number of data items
Sorting algorithms
Data generation
Data types
Source of the unsorted data
Target for the sorted data
I have often wondered why Microsoft did not include a VBA method to sort an array, e.g. why was there not a Word VBA method included to replace WordBasic's SortArray? In newsgroups and web forums, many people have asked questions such as: "How do I sort an array in VBA?". Often the response has been one of the following:
For Microsoft Word ("Word"):
Use the SortArray method of the WordBasic object.
Write, or obtain, a VB/VBA sorting procedure.
Obtain a DLL that includes a sorting procedure.
Copy the array into a Word document and use the Sort method of Word's Range/Selection object, then copy the sorted data into the array.
Start an instance of Microsoft Excel. Copy the array into an Excel worksheet and use the Sort method of Excel's Range object, then copy the sorted data into the array.
For Microsoft Excel ("Excel"):
Copy the array into an Excel worksheet and use the Sort method of Excel's Range object, then copy the sorted data into the array.
Write, or obtain, a VB/VBA sorting procedure.
Obtain a DLL that includes a sorting procedure.
Start an instance of Word. Sort using the SortArray method of the WordBasic object, then copy the sorted data into the array.
Start an instance of Word. Copy the array into a Word document and use the Sort method of Word's Range/Selection object, then copy the sorted data into the array.
In the case of Word, it seemed obvious that using WordBasic.SortArray would be best. After all, the WordBasic.SortArray method is compiled into Word, and surely, Microsoft must have used a reasonable sorting algorithm, so how could one expect to write a sorting procedure in VB/VBA that would outperform WordBasic.SortArray?
Up until a few years ago, I too believed that there was no point in trying to do better than using WordBasic.SortArray, but then I got curious and did some experimentation. I was shocked to learn how badly WordBasic.SortArray performed. It is rather trivial to outdo the performance of WordBasic.SortArray. Along the way, I also did some experimentation with Excel.
I have developed a Microsoft Visual Basic 6 ("VB 6") program to facilitate comparing the performance of various sorting methods.
Download the Sort Performance Comparison Program
Click here to view more information for this program.
Note: It is very important to read the documentation for this program before running the program.
There is no installation necessary, just directly run the program. You should first read the remainder of this document before running the program, especially, Terminating the program and Selecting sort methods.
In order to run this program, you will need to have the following installed:
In order to use the following sort methods, it will be necessary to have the needed Microsoft Excel or Microsoft Word, or both, from Microsoft Office 97, Microsoft Office 2000, Microsoft Office XP, or Microsoft Office 2003.
- SortArray method of the WordBasic object in Word VBA.
- Sort method of the Range/Selection objects in Word VBA.
- Sort method of the Range object in Excel VBA.
The program attempts to detect errors caused by not having the needed Microsoft Excel or Microsoft Word, but this is unsupported use of this program. Version 1.2.1, or later, of the program is needed to try to trap the errors properly.
Windows 2000, or later.
Either Microsoft Visual Basic 6 or the Microsoft Visual Basic 6 run-time files: If you have neither, then you can download the Visual Basic 6 run-time files by following the instructions in VBRun60.exe Installs Visual Basic 6.0 Run-Time Files.
As you must be running Microsoft Windows 2000, or later, you should already have the Visual Basic 6 run-time files installed.
When you run the program, a Visual Basic 6 Form will be displayed that looks something like the following.
The ListBox near the top of the Form lists the current action being performed by the program and provides certain other information.
The Form displays the following controls (not all the controls are displayed for each data type):
Select Sorting Algorithm combo box
Choosing Algorithm Comparison from the combo box presents a list of several sorting algorithms. Choosing any of the other options in the combo box presents a list of one, or more, implementations of the selected sorting algorithm.
Note that the methods listed with Algorithm Comparison are not necessarily the most efficient for each algorithm.
List of sorting methods
Use the check boxes to select, one, or more, sorting methods . Number of data items
Enter the number of data items, the sample size, to be sorted. The program will generate the data.
Order of source data
Choose how to order the data to be sorted.
Use Integer data
If checked, the program uses only integer data.
Maximum integer value Choose the maximum integer value to be used when you have elected to use integer data. Choose data type
Choose the data type.
Case insensitive sort If checked, the program will perform a case insensitive sort for the String data type. Use non-numeric strings If not checked, strings will be numbers. If checked, strings will be data randomly chosen from an internal word list. Display data
If checked, both the unsorted and sorted data will be displayed in ListBoxes for the last selected sort method that was run successfully.
Reuse data
If checked, the data used for the previous set of selected sort methods will be reused, but only if the data type, the number of data items, whether non-numeric, and the order of source data have not been modified.
Number of samples
The number of times the selected sort methods should be run. If Reuse data is checked, the same data will be used for each run.
Select Source
Select the source of the unsorted data . Select Target
Select where to put the sorted data. Set randomize seed If checked, each time you load the program, the same seed will be used to generate the same random number sequence when you first select the Start button Verify sort If the target is an array, the correctness of the sort will be verified. Clear
Unselects all selected sort methods.
Select All
Selects all listed sort methods. Exit
Terminate the program. Start
Run using the selected options.
In order to allow for more accurate timing estimates, the program cannot include code to facilitate your interrupting program execution. The program needs to run to completion and self-terminate. If you have a need to terminate the program, you may do so by using the Windows Task Manager (CTRL + ALT + DEL) to terminate the program.
If the program is allowed to run to completion, the program should clean up after itself. If you interrupt program execution, and you have selected any of the options that utilize Excel or Word, you should also terminate the WinWord and Excel tasks created by the program. The tasks will also vanish upon rebooting.
Some of the sorting algorithms are very efficient, but a few, such as Bubble Sort, are very inefficient.
Click here to view some examples of performance.
The sample size that may be used is limited by:
The amount of available memory on the system.
The amount of available paging file space.
The sample size may not be more than 32 768 with the Visual Basic 6 ListBox Sort method. This restriction is due to the ListCount property of a ListBox control. There are ways to program around this restriction, but I have not yet done so.
Restrictions imposed by Excel or Word:
The sample size may not be more than 32 768 with the SortArray method of the WordBasic object.
The sample size may not be more than 65 536 with the Sort method of Excel's Range object.
Although I have not yet seen this documented, it appears that the sample size may not be more than 16 382 with the Sort method of Word's Range/Selection objects.
The Sort method of Word's Range/Selection objects does not work with a sample size of 1
Certain characteristics of particular sorting algorithms may result in OUT OF STACK SPACE or OUT OF MEMORY errors.
The implementation details of the program. For example, in a real application, one may choose to sort the data in place. However, in order to re-use data with more than one sorting method, the program needs to save the data and make a copy to pass to the sorting code.
The size of the internal word list used by this program.
Other programs that may be running.
The program includes implementations of the following sorting algorithms:
- Bubble sort
- Counting sort
- Heap sort
- Insertion sort
- Merge sort
- Quicksort
- Selection sort
- Shell sort
The program provides a means to compare the performance of the algorithms with each other, with various implementations of the same algorithm, and with the following:
- SortArray method of the WordBasic object in Word VBA.
- Sort method of the Range/Selection objects in Word VBA.
- Visual Basic 6 ListBox with the Sort property set to True.
- Sort method of the Range object in Excel VBA.
If the target of the sort is an array, the program will verify whether the sort was done correctly.
The program will not attempt to verify the correctness of the following as I have yet to see documentation of how those methods sort:
- SortArray method of the WordBasic object in Word VBA.
- Sort method of the Range/Selection objects in Word VBA.
- Visual Basic 6 ListBox with the Sort property set to True.
- Sort method of the Range object in Excel VBA.
The program will generate the data used and allows you to specify the order of the data that is to be sorted.
Random |
The VB Rnd function is used to generate random data. The data will be non-numeric strings if Use non-numeric strings is checked. If Use non-numeric strings is not checked, the data will be numbers. The data will be all integers if Use integer data is checked. |
Partially sorted data |
The data will be consecutive integers starting with 0. The number of integers will be equal to the sample size. The first data item will be swapped with a randomly (using the VB Rnd function) chosen data item. |
Shuffle |
The data will be consecutive integers starting with 0. The number of integers will be equal to the sample size. The data will be shuffled. |
Forward |
The data will be consecutive integers, in ascending order, starting with 0. The number of integers will be equal to the sample size. |
Reverse |
The data will be consecutive integers, in descending order, starting with one less than the sample size. The number of integers will be equal to the sample size. |
Constant |
The data will be a constant value. |
You may choose the maximum integer value to be used when Use integer data is checked.
I have chosen to test the sorting of only the String, Double and Long (32-bit signed integer) data types.
The program assumes that the goal is to sort a vector (one-dimensional array) of data and that the source for the data may be any of the following:
The simulated source should be deleted when the program terminates normally.
The sorted data is not of much use unless the data can be saved in the intended target. The target may be any of the following:
The program creates the data target and copies the sorted data to the desired target.
The simulated target should be deleted when the program terminates normally.
If you choose to run one sample, the result may look something like the following.
You can already see how badly WordBasic.SortArray performs compared with all other methods listed.
If you have elected to run more than one sample, the result may look something like the following.
When running more than one sample, the time (milliseconds) reported is the arithmetic average (truncated to an integer). In addition, the total time (truncated to an integer) and the number of samples run for each sort method are reported.
If you choose to display the data, the result may look something like the following.
The sorted data displayed is from the last sorting method that was run, in this case, quicksort. If the sorting algorithms are correct, the result should have been the same no matter which algorithms were used.
The program will create, or append to, a log file named SortPerformanceLog.txt. The file will be located in the same directory in which you installed the program.
The primary purpose of the log file is to record the options used and results for debugging the program.
If the Verify sort box is checked, upon a failure to verify correctly, a file will be recorded in the same directory as the log file. The file will have a name that reflects the date and time the file was created, e.g., SortData-38006.635775463.txt. In combination with the log file, the SortData file allows for debugging of, if any, sort failures.
You may delete these files after running the program, unless you need to report a problem, in which case, I will need both files sent to me in a manner that I will specify as needed.