Introduction
In many practical situations we must arrange or process data quickly so that results are available when they are needed. Sorting is a fundamental operation used to order data so that other algorithms (searching, merging, statistical computations) run efficiently. Different sorting algorithms provide different trade-offs between speed, memory usage and implementation simplicity. This chapter explains the notion of complexity and examines the time complexities of common sorting algorithms, emphasising clear definitions, examples and the practical implications of each complexity.
Complexity
Complexity describes how the resources required by an algorithm grow with the size of the input. In algorithms we usually measure two resources:
- Time complexity: an estimate of the number of elementary operations (or steps) the algorithm performs as a function of the input size n. Time complexity is expressed using asymptotic notation so that details of hardware, compiler or constant factors do not obscure the growth rate.
- Space complexity: the total additional memory the algorithm requires in terms of n, excluding the input itself unless specified.
Asymptotic notation commonly used is:
- Big-O (O): an upper bound on growth rate. If f(n) = O(g(n)) then for sufficiently large n, f(n) ≤ c·g(n) for some constant c > 0 (a worked example follows this list).
- Big-Theta (Θ): a tight bound. If f(n) = Θ(g(n)) then f(n) grows at the same rate as g(n) up to constant factors.
- Big-Omega (Ω): a lower bound. If f(n) = Ω(g(n)) then for sufficiently large n, f(n) ≥ c·g(n) for some constant c.
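As a worked example of these definitions, take f(n) = 3n² + 2n. For all n ≥ 1 we have 3n² + 2n ≤ 5n², so f(n) = O(n²) with c = 5; and since 3n² + 2n ≥ 3n² for all n, also f(n) = Ω(n²), hence f(n) = Θ(n²).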
Common growth classes and examples
- O(1): constant time. The running time does not depend on input size. Example: accessing an element by index in an array.
- O(log n): logarithmic time. Example: binary search on a sorted array or balanced binary search tree operations (e.g., search in an AVL or red-black tree); a short binary search sketch follows this list.
- O(n): linear time. Running time grows proportionally to n. Example: linear scan (finding maximum) or linear search in an unsorted array.
- O(n log n): n log n time. Typical of efficient comparison-based sorting algorithms such as merge sort and average-case quicksort.
- O(n²): quadratic time. Running time grows as the square of n. Typical of simple sorting algorithms that use nested loops, such as bubble sort, selection sort and insertion sort (average/worst cases).
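To make the O(log n) class concrete, here is a minimal Python sketch of binary search (all code in this chapter is illustrative, not taken from any particular library). Each iteration halves the remaining search range, so the loop runs at most about log₂ n times.

```python
def binary_search(a, target):
    """Return an index of target in the sorted list a, or -1 if absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2      # midpoint of the remaining range
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1          # target can only be in the right half
        else:
            hi = mid - 1          # target can only be in the left half
    return -1
```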
Time complexity of common sorting algorithms
Sorting algorithms are often categorised by whether they are comparison-based and by their algorithmic strategy (e.g., divide and conquer, simple exchange, selection). For each algorithm we list time complexities for best, average and worst cases, and note properties that affect practical choice (stability, in-place, typical space usage).
Bubble sort
Bubble sort repeatedly compares adjacent elements and swaps them if they are in the wrong order. After each pass the largest (or smallest) remaining element reaches its correct position.
- Best case: O(n) - when the array is already sorted and an optimisation checks for no swaps in a pass.
- Average case: O(n²)
- Worst case: O(n²)
- Properties: stable, in-place, simple to implement but inefficient for large n.
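A minimal Python sketch of bubble sort with the no-swap early exit mentioned above; the early exit is what yields the O(n) best case:

```python
def bubble_sort(a):
    """Sort the list a in place by swapping adjacent out-of-order pairs."""
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):      # the last i elements are already in place
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:                 # no swaps: array is sorted, O(n) best case
            break
```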
Selection sort
Selection sort finds the minimum (or maximum) element from the unsorted prefix and places it at the next position in the sorted prefix. It repeats this process n times.
- Best case: O(n²)
- Average case: O(n²)
- Worst case: O(n²)
- Properties: in-place, not stable by default, performs only O(n) swaps (at most n - 1), which can be advantageous when writes are expensive.
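A minimal sketch of selection sort; note that the swap in the outer loop runs at most n - 1 times:

```python
def selection_sort(a):
    """Sort the list a in place, performing at most n - 1 swaps."""
    n = len(a)
    for i in range(n - 1):
        m = i
        for j in range(i + 1, n):       # find the minimum of the unsorted suffix
            if a[j] < a[m]:
                m = j
        if m != i:
            a[i], a[m] = a[m], a[i]     # at most one swap per outer iteration
```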
Insertion sort
Insertion sort builds the sorted array one element at a time by inserting each new element into its correct position among previously sorted elements.
- Best case: O(n) - when the array is already nearly sorted; only one comparison per element.
- Average case: O(n²)
- Worst case: O(n²)
- Properties: stable, in-place, very efficient for small n or nearly sorted data and often used as the base case in recursive sorts.
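A minimal sketch of insertion sort. The optional lo/hi bounds are an illustrative addition so that the same routine could also serve as the small-subarray base case of a recursive sort, as mentioned above:

```python
def insertion_sort(a, lo=0, hi=None):
    """Sort a[lo:hi] in place; nearly sorted input needs ~1 comparison per element."""
    if hi is None:
        hi = len(a)
    for i in range(lo + 1, hi):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:   # shift larger elements one slot right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key                  # insert key into its correct position
```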
Quick sort
Quick sort uses divide and conquer: choose a pivot, partition the array so elements less than pivot come before it and greater elements come after it, then recursively sort the partitions.
- Best case: O(n log n) - when partitions are balanced.
- Average case: O(n log n)
- Worst case: O(n²) - occurs when partitions are extremely unbalanced, e.g., choosing the smallest or largest element as pivot on already sorted data.
- Properties: typically in-place (using partitioning), not stable, average space complexity O(log n) due to recursion depth. Using a random pivot or a median-of-three pivot makes the worst case unlikely in practice.
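A sketch of quicksort using the Lomuto partition scheme with a random pivot; the partition scheme is one common choice among several, and the random pivot implements the safeguard mentioned above:

```python
import random

def quick_sort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place by partitioning around a random pivot."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quick_sort(a, lo, p - 1)    # sort the elements smaller than the pivot
        quick_sort(a, p + 1, hi)    # sort the elements greater than or equal to it

def partition(a, lo, hi):
    """Lomuto partition: return the final index of a randomly chosen pivot."""
    r = random.randint(lo, hi)
    a[r], a[hi] = a[hi], a[r]       # move the random pivot to the end
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]       # place the pivot between the two parts
    return i
```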
Merge sort
Merge sort also uses divide and conquer: split the array into halves, recursively sort each half, then merge the two sorted halves into a single sorted array.
- Best case: O(n log n)
- Average case: O(n log n)
- Worst case: O(n log n)
- Properties: stable (when implemented carefully), not in-place in its standard form because merging requires O(n) extra space; stable merging variants and in-place merges exist but are more complex.
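A sketch of standard top-down merge sort that returns a new list; the ≤ comparison in merge is what makes it stable, and the auxiliary output list is the O(n) extra space noted above:

```python
def merge_sort(a):
    """Return a new sorted list built by recursive halving and merging."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

def merge(left, right):
    """Merge two sorted lists into one, using O(n) extra space."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:     # <= takes the left element on ties: stability
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])            # one side is exhausted; append the rest
    out.extend(right[j:])
    return out
```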
Heap sort
Heap sort builds a binary heap from the array, then repeatedly extracts the maximum (or minimum) and places it at the end of the array to produce a sorted sequence.
- Best, average and worst cases: O(n log n)
- Properties: in-place (requires only O(1) extra space), not stable by default, good worst-case time; typically slower than well-implemented quicksort on average due to cache behaviour.
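A sketch of in-place heapsort with an iterative sift-down, so the only extra space is a handful of variables (O(1)):

```python
def heap_sort(a):
    """Sort the list a in place: build a max-heap, then move each max to the end."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # bottom-up heap construction
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]       # extract the maximum to its final slot
        sift_down(a, 0, end)              # restore the heap on the shrunken prefix

def sift_down(a, i, n):
    """Move a[i] down until the max-heap property holds within a[:n]."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest
```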
Non-comparison sorts: Counting, Radix and Bucket
When keys are integers in a limited range or have fixed-length representations, non-comparison sorts can run in linear time, beating the Ω(n log n) lower bound that applies to comparison-based sorting.
- Counting sort: O(n + k) time, where k is the range of key values; stable, not comparison-based, requires O(n + k) extra space for the count and output arrays.
- Radix sort: O(d·(n + b)) time, where d is the number of digits and b the base, often simplified to O(n·d) when b is fixed; useful for fixed-length keys and typically implemented with counting sort as a stable subroutine.
- Bucket sort: expected O(n + k) when the input is uniformly distributed and k is the number of buckets; performance depends on the distribution and the bucket implementation.
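A sketch of stable counting sort, assuming integer keys in the range 0..k-1; the prefix-sum pass computes the first output index for each key value, and the left-to-right final pass preserves the original order of equal keys:

```python
def counting_sort(a, k):
    """Return a stably sorted copy of a, whose elements are ints in range(k)."""
    count = [0] * k
    for x in a:
        count[x] += 1               # histogram of key values
    pos = [0] * k
    total = 0
    for v in range(k):              # prefix sums: first output index per key
        pos[v] = total
        total += count[v]
    out = [None] * len(a)
    for x in a:                     # scanning left to right keeps the sort stable
        out[pos[x]] = x
        pos[x] += 1
    return out
```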
Practical comparison and selection criteria
Choosing a sorting algorithm depends on several factors. The following points help decide which algorithm to use in a given situation.
- Input size: For small n, simple algorithms (insertion sort) can be faster due to low overhead.
- Data distribution: Nearly sorted data benefits insertion sort and some adaptive algorithms; certain data ranges permit counting or radix sort.
- Stability requirement: If equal keys must preserve original order, choose a stable algorithm (merge sort, insertion sort, counting sort when stable).
- Memory constraints: If extra space is limited, prefer in-place sorts (quick sort, heap sort, selection sort).
- Worst-case guarantees: If a guaranteed upper bound is required, prefer algorithms with O(n log n) worst case such as merge sort or heap sort.
- Parallelism and external sorting: Merge sort is easy to parallelise and to adapt for external sorting (large data on disks) because of its sequential merges.
Examples and small illustrations
- To make the behaviour concrete, consider a small example array: [5, 2, 9, 1, 5, 6].
- Insertion sort will scan from left to right and insert each new element into its correct place among previously sorted elements; it performs few operations if the array is nearly sorted.
- Quick sort with a good pivot will partition the array into smaller subproblems and sort them recursively; with a bad pivot it may do many unnecessary comparisons and swaps leading to the quadratic worst case.
- Merge sort will split the array repeatedly until subarrays of length 1 are obtained, then merge back in sorted order; its number of comparisons is bounded by O(n log n) for all inputs.
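These behaviours can be checked directly with Python's built-in sorted (an implementation of Timsort, mentioned in the conclusion); the pairs list below is a hypothetical illustration of stability using a duplicated key, like the two 5s in the example array:

```python
data = [5, 2, 9, 1, 5, 6]
print(sorted(data))                         # [1, 2, 5, 5, 6, 9]

# Stability: the two entries with key 5 keep their original relative order.
pairs = [(5, 'a'), (2, 'b'), (5, 'c'), (1, 'd')]
print(sorted(pairs, key=lambda p: p[0]))    # [(1, 'd'), (2, 'b'), (5, 'a'), (5, 'c')]
```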
Conclusion
Time complexity is the primary tool for comparing algorithms as input size grows. Understanding best, average and worst cases, together with properties such as stability, in-place behaviour and space requirements, is essential for selecting an appropriate sorting algorithm. There is no single "best" algorithm for all situations: choose according to input characteristics, resource constraints and correctness requirements. Practically, hybrid approaches (for example, quicksort with insertion sort for small subproblems, or timsort used in standard libraries) combine strengths of multiple algorithms to achieve robust performance.
Keywords: sorting, time complexity, space complexity, Big-O, bubble sort, selection sort, insertion sort, quick sort, merge sort, heap sort, counting sort, radix sort, stability, in-place, divide and conquer.