Python Sorting Algorithms

Python Sorting Algorithms

in

For the final lab assignment in my data structures class we were tasked with comparing some sorting algorithms in Python. The comparisons were a few versions of quicksort with different partitioning functions as well as merge sort. The data sets were both ordered and randomly generated lists of integers of varying sizes.

I am a big fan of using docstrings for functions and modules as well as type hints for variables. I also think documenting a function’s input and output is very helpful. Some elements can be so complicated that this can help a user to understand what is going on.

def verify_sorted(array: list, comparison_count: int, swap_count: int) -> None:
  """This function verifies that the array is sorted in ascending order. 
  Input: array, comparison_count, swap_count
  Output: None"""

  sorted = True

  for i in range(len(array) - 1):
    if array[i] > array[i + 1]:
      sorted = False
      break
    else:
      sorted = True

  if sorted:
    print("Comparison count: ", comparison_count)
    print("Swap count: ", swap_count)
    print()
    
  else:
    print('Error: the array is not sorted.')

This is one of the first functions I wrote for the assignment. I learned from a friend how important test-driven development is and wanted to try it out. This function is based on a red-green-refactor setup. I wrote this function to verify that the array was sorted and also print the number of comparisons and swaps that were made within each sorting algorithm.

Here is an interesting graph based on the algorithms vs the random data sets:

Comparisons

I also did a timing comparison of sorting the whole directory of input files on different systems. I used the timeit module in Python to do this. I ran the timing comparisons on my Linux machine, my Apple Macbook Pro 2018, and my NVIDIA Jetson Nano.

The specs for each machine are as follows:

  • NZXT Linux Machine:
    • Intel® Core™ i9-10900K CPU @ 3.70GHz × 20
    • 32 GB RAM
  • Apple Macbook Pro 2018:
    • 2.2 GHz Quad-Core Intel Core i7
    • 16 GB RAM
  • NVIDIA Jetson Nano:
    • 4 GB RAM
    • Clock speed: 1.43 GHz

Since Python is mostly single core, I have a feeling that the processor clock speed is the main factor in the comparisons.

Excel