In many coding interviews and real-world applications, finding the top ( k ) largest elements in an array is a common problem. This tutorial will guide you through three popular methods to solve this problem: Sorting, Min-Heap, and the Quick select algorithm. We’ll focus on the Min-Heap approach due to its efficiency and practical use in interviews.
Problem Statement
Given an array of integers and an integer ( k ), find the top ( k ) largest elements in the array.
Method 1: Sorting
Description: The simplest way to solve this problem is to sort the array and then select the top ( k ) elements.
Steps:
- Sort the array in descending order.
- Return the first ( k ) elements.
Time Complexity: ( O(n \log n) )
Python Implementation:
def top_k_largest_elements_sort(arr, k):
return sorted(arr, reverse=True)[:k]
# Example usage:
array = [3, 2, 1, 5, 6, 4]
k = 2
print(top_k_largest_elements_sort(array, k)) # Output: [6, 5]
Method 2: Min-Heap
Description: A more efficient way is to use a min-heap of size ( k ). The min-heap keeps track of the top ( k ) largest elements as we iterate through the array.
Steps:
- Initialize an empty min-heap.
- Iterate through each element in the array:
- If the heap size is less than ( k ), push the element into the heap.
- Otherwise, compare the element with the smallest element in the heap (the root of the heap). If the element is larger, replace the smallest element with the current element.
- At the end, the heap contains the top ( k ) largest elements. Sort the heap in descending order to get the final result.
Time Complexity: ( O(n \log k) )
Python Implementation:
import heapq
def top_k_largest_elements(arr, k):
if k <= 0:
return []
min_heap = []
for num in arr:
if len(min_heap) < k:
heapq.heappush(min_heap, num)
else:
if num > min_heap[0]:
heapq.heapreplace(min_heap, num)
return sorted(min_heap, reverse=True)
# Example usage:
array = [3, 2, 1, 5, 6, 4]
k = 2
print(top_k_largest_elements(array, k)) # Output: [6, 5]
Method 3: Quickselect Algorithm
Description: Quickselect is a selection algorithm to find the ( k )th largest element in an unordered list. It works by partially sorting the array.
Steps:
- Choose a pivot element.
- Partition the array into two halves: elements less than the pivot and elements greater than the pivot.
- Recursively apply the same process to the half that contains the ( k )th largest element.
Time Complexity:
- Average: ( O(n) )
- Worst-case: ( O(n^2) ) (though rare)
Python Implementation:
import random
def quickselect(arr, k):
if not arr:
return []
pivot = random.choice(arr)
lows = [el for el in arr if el < pivot]
highs = [el for el in arr if el > pivot]
pivots = [el for el in arr if el == pivot]
if k < len(highs):
return quickselect(highs, k)
elif k < len(highs) + len(pivots):
return pivots[0]
else:
return quickselect(lows, k - len(highs) - len(pivots))
def top_k_largest_elements_quickselect(arr, k):
n = len(arr)
result = []
for i in range(n - k, n):
result.append(quickselect(arr, i))
return sorted(result, reverse=True)
# Example usage:
array = [3, 2, 1, 5, 6, 4]
k = 2
print(top_k_largest_elements_quickselect(array, k)) # Output: [6, 5]
Conclusion
Among the three methods, using a Min-Heap is often the most efficient and straightforward approach, especially for large datasets and when ( k ) is much smaller than ( n ). It provides a good balance between simplicity and performance, making it an excellent choice for coding interviews.
Example Usage of Min-Heap Approach
Here’s a practical example to demonstrate how to use the Min-Heap approach:
array = [3, 2, 1, 5, 6, 4]
k = 2
print(top_k_largest_elements(array, k)) # Output: [6, 5]
In this example, the array [3, 2, 1, 5, 6, 4] is processed to find the top 2 largest elements. The Min-Heap method efficiently finds [6, 5] as the top 2 largest elements.
By mastering these methods, you’ll be well-equipped to tackle similar problems in coding interviews and practical applications.