Sorting Algorithms

Important terms:

Comparison-based Sorting: Sort elements by comparing them with each other.
Non-comparison-based Sorting: Sorts do NOT compare elements with one another.

Selection Sort

Is a comparison-based sorting algorithm.

You basically iterate through the array and find the smallest/largest element. If the smallest/largest element is found, exchange it with the first element. Keep doing this until you reach at the end of list.

TLDR; Find smallest element and exchange it with element in the first position; find the second smallest element and exchange it in the second position, etc. It’s a find and replace algorithm.

Pseudo Code:

def selectionSort( lst ):
  for i, e in enumerate(lst):
    minElem = min( range(i, len(lst)), key=list.__getitem__ )
    lst[i], lst[minElem] = lst[minElem], e

  return lst

Best Case: $O(n^2)$ (if input array is already sorted)
Average Case: $O(n^2)$
Worst Case: $O(n^2)$

Insertion Sort

Is a comparison-based sorting algorithm that moves elements one at a time into the correct position.

It is useful for small datasets or partially sorted datasets. It is stable–it maintains the relative order of equal elements (ex: won’t overwrite the original order of the dataset).

Steps:

You start sorting from the second element (you would assume the first is already sorted).
The current element will be compared to the preceding elements in the array. If the current element is less than the one it is being compared to, shift the current element to the left.
Repeat step 2 for each element in the array.
Once the final element is sorted, the entire array will be sorted.

Pseudo Code:

Best Case: $O(n)$ (if input array is already sorted)
Average Case: $O(n^2)$
Worst Case: $O(n^2)$

Mergesort

A divide-and-conquer algorithm.

It is useful for large datasets.

The premise of this sorting algorithm is splitting the list into two halves, recursively sort each half, and then merge the two sorted sublists together.

Pseudo Code:

def mergeSort( lst ):
  if len( lst ) <= 1:
    return m;

  middle = len(m) / 2;
  left = m[:middle];
  right = m[middle:]

  left = merge( left );
  right = merge( right );
  
  return list( merge(left, right) );

Best Case: $O(n)$
Average Case: $O(nlogn)$
Worst Case: $O(nlogn)$

Quicksort

An efficient divide-and-conquer algorithm.

It is useful for large datasets.

Ways to select pivot:

Last element - selecting last element of the array
First element - selecting first element of the array
Random element - selecting a random element inside of array
Median - Picking the median value of array

NOTE: If quicksort chooses either the smallest or largest element as the pivot, quicksort will perform poorly and will give a time complexity of O(n^2). This pivot selection will lead to unbalanced partitions.

Steps:

Select a pivot element. Pointers i and j are made. Pointer i will be initialized to -1 (indicates position before the start of the array); pointer j will be initialized at the beginning of the array (index 0)
Compare each element pointed to by j with the pivot. If element at index j is less than the pivot, increment i and then swap elements found at index i and j.
After pointer j is finished iterating, increment i and swap the pivot with the element found at index i. This will place the pivot in its correct sorted position.
The array will be rearranged so all elements smaller than the pivot come before it; all elements greater than pivot will come after it. After partitioning, the pivot will be in the last position.
Recursively apply quicksort to the sub-arrays to the left and right of the pivot.
Once sub-arrays are sorted, the entire array will be sorted.

TLDR; -> Choose an appropriate pivot -> Divide elements (except for pivot) into two partitions (first partition will have elements less than pivot; second parition will have elements greater than pivot). -> Apply quicksort on both partitions -> After paritions have been sorted, join the pivot and partitions together.

Pseudo Code:

Best Case: $O(nlogn)$ pivot divides array into two equal halves.
Average Case: $O(nlogn)$ pivot is not perfectly in the middle but is reasonably positioned.
Worst Case: $O(n^2)$ pivot ends up being the smallest or largest element in the array.