For the final lab assignment in my data structures class we were tasked with comparing some sorting algorithms in Python. The comparisons were a few versions of quicksort with different partitioning functions as well as merge sort. The data sets were both ordered and randomly generated lists of integers of varying sizes.
I am a big fan of using docstrings for functions and modules as well as type hints for variables. I also think documenting a function’s input and output is very helpful. Some elements can be so complicated that this can help a user to understand what is going on.
def verify_sorted(array: list, comparison_count: int, swap_count: int) -> None:
"""This function verifies that the array is sorted in ascending order.
Input: array, comparison_count, swap_count
Output: None"""
sorted = True
for i in range(len(array) - 1):
if array[i] > array[i + 1]:
sorted = False
break
else:
sorted = True
if sorted:
print("Comparison count: ", comparison_count)
print("Swap count: ", swap_count)
print()
else:
print('Error: the array is not sorted.')
This is one of the first functions I wrote for the assignment. I learned from a friend how important test-driven development is and wanted to try it out. This function is based on a red-green-refactor setup. I wrote this function to verify that the array was sorted and also print the number of comparisons and swaps that were made within each sorting algorithm.
Here is an interesting graph based on the algorithms vs the random data sets:
I also did a timing comparison of sorting the whole directory of input files on different systems. I used the timeit module in Python to do this. I ran the timing comparisons on my Linux machine, my Apple Macbook Pro 2018, and my NVIDIA Jetson Nano.
The specs for each machine are as follows:
- NZXT Linux Machine:
- Intel® Core™ i9-10900K CPU @ 3.70GHz × 20
- 32 GB RAM
- Apple Macbook Pro 2018:
- 2.2 GHz Quad-Core Intel Core i7
- 16 GB RAM
- NVIDIA Jetson Nano:
- 4 GB RAM
- Clock speed: 1.43 GHz
Since Python is mostly single core, I have a feeling that the processor clock speed is the main factor in the comparisons.