The first step to solving Leetcode problems in interviews is to know commonly used data structures and algorithms by heart - being able to write a bug-free DFS implementation without having to think about it, for example. This frees up mental capacity to focus on the problem itself, and on figuring out which of the algorithms you know has preconditions and outputs that best fit that problem.
This cheatsheet helps by providing reference implementations to memorize.
Two Pointers
Preconditions: one or more arrays or linked lists
Outcome: finding a set of elements that meets certain criteria (pair / triplet / subarray), iterating over parts of the array/list simultaneously or over an array/string from both ends, or modifying the array
def is_palindrome(s):
if not s:
return True
left, right = 0, len(s) - 1
while left <= right:
if s[left] != s[right]:
return False
left += 1
right -= 1
    return True
Sliding Window
Preconditions: array, string or linked list, expanding/contracting a continuous window of elements
Outcome: calculate something based on contiguous subarrays or sublists of a given size k; k may vary depending on problem constraints, e.g. optimizing a quantity or finding a length
def longest_substring_with_k_distinct(s, k):
window_start = 0
max_length = 0
char_frequency = {}
for window_end in range(len(s)):
right_char = s[window_end]
char_frequency[right_char] = char_frequency.get(right_char, 0) + 1
# Shrink window until we have at most k distinct characters
while len(char_frequency) > k:
left_char = s[window_start]
char_frequency[left_char] -= 1
if char_frequency[left_char] == 0:
del char_frequency[left_char]
window_start += 1
max_length = max(max_length, window_end - window_start + 1)
    return max_length
Fast and slow pointers
Precondition: sequence to iterate over, fast and slow pointers eventually meet, or fast pointer reaches the end
Output: cycle finding and start of a cycle in a linked list, finding middle of a sequence or the nth element from the end
# Cycle detection in a linked list
def has_cycle(head):
if not head or not head.next:
return False
slow, fast = head, head
while fast and fast.next:
slow = slow.next # Move one step
fast = fast.next.next # Move two steps
if slow == fast: # Found cycle
return True
    return False # No cycle found

# Finding middle of a linked list
def find_middle(head):
if not head:
return None
slow = fast = head
while fast and fast.next:
slow = slow.next # Move one step
fast = fast.next.next # Move two steps
    return slow # When fast reaches end, slow is at middle
Binary Search
- Time complexity: O(log n)
- Space complexity: O(1)
This is a standard binary search that returns the index of a specific element in a sorted array, and -1 if it is not found.
def binary_search(arr, target):
l, r = 0, len(arr) - 1
while l <= r:
        # (l + r) // 2 can overflow in languages with fixed-width integers; this form avoids it
m = l + (r - l) // 2
if arr[m] < target:
l = m+1
elif arr[m] == target:
return m
else:
r = m - 1
    return -1
However, most binary search problems are more complex: for example, finding the smallest number greater than or equal to the target. This requires a more flexible template.
Precondition: there is a predicate function p that satisfies a monotonic condition; the search space can be divided into two regions such that for all elements in one region, it returns false and for all elements in the other region, it returns true.
Outcome: the index of the minimum element that satisfies the predicate
def lower_bound_binary_search(search_space) -> int:
    left, right = min(search_space), max(search_space) # could be [0, n], [1, n] etc., depending on the problem
while left < right:
mid = left + (right - left) // 2
if p(mid):
right = mid
else:
left = mid + 1
    return left
The template above uses a half-open interval pattern [left, right) (inclusive left, exclusive right). This means:
- The algorithm never accesses the element at index right during iteration
- When left == right, the loop terminates, and we return left as our answer
This pattern provides several advantages:
- Clean handling of edge cases: we can initialize right = len(array) without risk of out-of-bounds access. This is appropriate for problems like finding the insertion position in a sorted array, where the insertion position may be at the end
- Consistent return value: left always points to our answer when the loop exits
To use this template:
- Correctly initialize left and right (i.e., the search space), ensuring the target is within that range
- The left index returned is the smallest index of an element that satisfies p. Important: you may have to verify that this element actually satisfies the search condition before returning it, as shown in the sketch below
- Design the predicate p; this is the hardest part
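For instance, here is a minimal sketch of the template instantiated for the earlier example of finding the smallest element greater than or equal to a target; the function name and the -1 convention for "not found" are illustrative assumptions.
# sketch: index of the smallest element >= target in a sorted array
def lower_bound(arr, target):
    left, right = 0, len(arr) # half-open: right = len(arr) means "no such element"
    while left < right:
        mid = left + (right - left) // 2
        # predicate p: is arr[mid] already >= target?
        if arr[mid] >= target:
            right = mid
        else:
            left = mid + 1
    # verify before returning: left == len(arr) means nothing satisfies p
    return left if left < len(arr) else -1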
Graph Algorithms
Let v be the number of vertices in the graph, and e the number of edges.
Here, graph is an adjacency list. For example: {0: [2, 3], 1: [2]} means node 0 is connected to nodes 2 and 3, and node 1 is connected to node 2.
A tree is a connected graph of n nodes without cycles; it has n-1 edges.
Breadth-first search (BFS)
- Time complexity: O(v + e)
- Space complexity: O(v)
Precondition: Need to explore all neighbors before moving outward, solution is near the starting node
Outcome: traversal starting from the source node in order of distance, level order of a tree, shortest path in an unweighted graph
from collections import deque
def bfs(graph, start):
# start is a single node, but could also be a list of starting sources for a multi-source BFS
q = deque([start])
visited = set([start])
while q:
node = q.popleft()
# process node if needed
for nei in graph[node]:
if nei not in visited:
visited.add(nei)
                q.append(nei)
Depth-first Search (DFS)
- Time complexity: O(v + e)
- Space complexity: O(v)
Precondition: graph / tree, want to explore as far as possible along each branch before backtracking and solution is far from the start node, depth of path fits in the recursion stack
Outcome: graph exploration as far as possible down a path before backtracking, can be tweaked for topological sorting and cycle detection
visited = set()
def dfs(graph, node):
visited.add(node)
# process node if needed
for nei in graph[node]:
if nei not in visited:
            dfs(graph, nei)
There is a deep connection between backtracking and DFS: backtracking is an application of DFS on the state tree of a problem.
Cycle Finding On Undirected Graphs
DFS’s exploration as far down each path as possible is a natural fit for detecting cycles.
visited = set()
def has_cycle(graph, cur, parent):
if cur in visited:
return True
visited.add(cur)
for nei in graph[cur]:
# in an undirected graph, nodes u and v are connected with two edges: from u to v and v to u
        # prevent the back edge (i.e., where we came from) from being falsely reported as a cycle
if nei == parent:
continue
if has_cycle(graph, nei, cur):
return True
return False
# To start the DFS, pass a parent value that isn't a real node, e.g. -1
Cycle Finding On Directed Graphs
The definition of a cycle differs from the undirected case, so the algorithm must distinguish between fully processed (visited) nodes and nodes on the current path being explored.
UNVISITED = 0
EXPLORING = 1
VISITED = 2
# assuming the graph has n nodes
status = [UNVISITED for _ in range(n)]
def has_cycle(graph, cur):
if status[cur] == EXPLORING:
return True # we found a node that is already on our current path
status[cur] = EXPLORING
for nei in graph[cur]:
if status[nei] != VISITED and has_cycle(graph, nei):
return True
status[cur] = VISITED
    return False
Topological Sort
A topological sort is an ordering of vertices on a directed acyclic graph such that if there is a directed edge from u to v, u appears in the topological ordering before v. This is useful for problems that involve checking whether a set of task dependencies can be satisfied, and in which order the tasks should be done.
There are two algorithms for this.
The first method uses a DFS postorder traversal. By definition, reversing the postorder DFS traversal is a valid topological order.
# this is virtually identical to using DFS for cycle finding, just with a few lines to compute the postorder traversal
UNVISITED = 0
EXPLORING = 1
VISITED = 2
# returns the topological sort, or [] if the graph has a cycle
def topological_sort(graph):
# assuming the graph has n nodes
status = [UNVISITED for _ in range(n)]
postorder = []
def has_cycle(cur):
if status[cur] == EXPLORING:
return True # we found a node that is already on our current path
status[cur] = EXPLORING
for nei in graph[cur]:
            if status[nei] != VISITED and has_cycle(nei):
return True
status[cur] = VISITED
postorder.append(cur)
return False
for vertex in graph.keys():
if status[vertex] == UNVISITED and has_cycle(vertex):
return []
    return list(reversed(postorder))
The second method is Kahn’s algorithm, which uses BFS to incrementally build the topological order from nodes that have an in-degree of 0 (i.e., they have no dependencies).
from collections import deque
def topological_sort_bfs(graph):
# Calculate in-degree for each node
in_degree = {node: 0 for node in graph}
for node in graph:
for neighbor in graph[node]:
in_degree[neighbor] = in_degree.get(neighbor, 0) + 1
# Start with nodes that have no dependencies (in-degree = 0)
queue = deque([node for node in in_degree if in_degree[node] == 0])
result = []
# Process nodes in topological order
while queue:
node = queue.popleft()
result.append(node)
# Remove this node's influence
for neighbor in graph[node]:
in_degree[neighbor] -= 1
# If a node has no more dependencies, add it to queue
if in_degree[neighbor] == 0:
queue.append(neighbor)
# If result doesn't include all nodes, there's a cycle
if len(result) != len(graph):
return [] # Graph has a cycle
    return result
Dijkstra’s Shortest Path Algorithm
- Preconditions: weighted graph with non-negative edges, source node
- Output: shortest distance from source to every other node
- Time complexity: O((v + e) log v)
import heapq
def dijkstra(graph, source):
dist = [float("inf") for _ in range(len(graph))]
dist[source] = 0
pq = [(0, source)]
while pq:
cur_dist, node = heapq.heappop(pq)
        if cur_dist > dist[node]:
continue
for nei, weight in graph[node].items():
            new_dist = cur_dist + weight
if new_dist < dist[nei]:
dist[nei] = new_dist
heapq.heappush(pq, (new_dist, nei))
    return dist
Union Find
This is a relatively obscure data structure that probably won’t come up in interviews, but it can still be worth studying for certain graph problems if there is time.
- Time complexity: O(α(n)), effectively amortized constant time, when path compression and union by rank / size are used
- Space complexity: O(n)
Precondition: undirected graph, efficient lookup for which set an element belongs to, efficient union
Outcome: cycle detection, counting components, can be processed incrementally
# assuming there are n nodes / sets at the start
parent = [i for i in range(n+1)]
# this keeps track of the height of the tree
rank = [0] * (n+1)
# find the representative element for an item, i.e which set it belongs to
def find(x):
    if parent[x] != x:
        parent[x] = find(parent[x]) # path compression
    return parent[x]
def union(n1, n2):
n1_root, n2_root = find(n1), find(n2)
if n1_root == n2_root:
return False # both elements are already in the same set
    # swap roots if needed so that rank[n1_root] <= rank[n2_root]
if rank[n1_root] > rank[n2_root]:
n1_root, n2_root = n2_root, n1_root
# make smaller set (n1_root) point to larger one
parent[n1_root] = n2_root
# update the ranks, which only need to change if n1 and n2 had equal ranks already
if rank[n1_root] == rank[n2_root]:
rank[n2_root] += 1
    return True
Sorting
Mergesort
Precondition: able to use O(n) extra memory, cannot tolerate an O(n²) worst case; mergesort is also a stable sort
Outcome: Sorted list
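The original gives no reference implementation here; the following is a minimal top-down mergesort sketch (the function name is my own):
def mergesort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = mergesort(arr[:mid])
    right = mergesort(arr[mid:])
    # merge the two sorted halves
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]: # <= keeps equal elements in order, i.e. a stable sort
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged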
Quicksort
Precondition: can tolerate O(n²) worst case performance for better average performance, in-place sorting, unstable sort
Outcome: Sorted array with typically better cache performance than merge sort
def quicksort(arr, low=None, high=None):
# Initialize default values for first call
if low is None:
low = 0
if high is None:
high = len(arr) - 1
if low < high:
# Partition the array and get pivot index
pivot_index = partition(arr, low, high)
# Recursively sort both halves
quicksort(arr, low, pivot_index - 1)
quicksort(arr, pivot_index + 1, high)
return arr
def partition(arr, low, high):
# Choose rightmost element as pivot
pivot = arr[high]
# Index of smaller element
i = low - 1
for j in range(low, high):
# If current element <= pivot
if arr[j] <= pivot:
# Increment index of smaller element
i += 1
arr[i], arr[j] = arr[j], arr[i]
# Place pivot in correct position
arr[i + 1], arr[high] = arr[high], arr[i + 1]
# Return pivot's index
    return i + 1
Quickselect
Precondition: Need to find the k-th smallest/largest element in an unsorted array
Outcome: The k-th element without sorting the entire array
- Time complexity: Average O(n), worst case O(n²)
- Space complexity: O(log n) average case for recursion stack
def quickselect(arr, k, low=None, high=None):
"""
Find the k-th smallest element in arr (k is 1-indexed)
"""
# Initialize default values for first call
if low is None:
low = 0
if high is None:
high = len(arr) - 1
# Base case
if low == high:
return arr[low]
# Partition using the same function as quicksort
pivot_index = partition(arr, low, high)
# Determine which side of partition to search
if k - 1 == pivot_index:
# Found the k-th smallest element
return arr[pivot_index]
elif k - 1 < pivot_index:
# The k-th element is in the left subarray
return quickselect(arr, k, low, pivot_index - 1)
else:
# The k-th element is in the right subarray
return quickselect(arr, k, pivot_index + 1, high)
# Note: this reuses the exact same partition function defined in the quicksort section above
Backtracking
Precondition: Problem can be solved by trying partial solutions and abandoning them when they can’t lead to valid solutions
Outcome: All possible valid solutions to a problem, efficiently solving combinatorial problems (permutations, subsets, combinations)
def backtrack_template(input_data):
result = []
def backtrack(current_state, remaining_choices):
# Base case: reached a solution
if is_solution(current_state):
result.append(current_state.copy()) # Always make a copy!
return
# Try each available choice
for choice in remaining_choices:
# Skip invalid choices
if not is_valid(current_state, choice):
continue
# Make the choice
current_state.append(choice)
# Recursive call
backtrack(current_state, get_next_choices(remaining_choices, choice))
# Undo the choice (backtrack)
current_state.pop()
backtrack([], input_data)
    return result
Choose / Don’t Choose Pattern
# returns all possible subsets
def subsets(self, nums: List[int]) -> List[List[int]]:
ans = []
def backtrack(i, cur):
if i == len(nums):
ans.append(list(cur))
return
# don't choose this element
backtrack(i+1, cur)
# choose this element
cur.append(nums[i])
backtrack(i+1, cur)
cur.pop()
backtrack(0, [])
    return ans
Subset / Combination Pattern
Another way to generate solutions is to explore all solutions that start with the first element, then the second element, and so on. This is very useful for subset / combination problems.
def subsets(self, nums: List[int]) -> List[List[int]]:
ans = []
def backtrack(start, cur):
ans.append(list(cur))
for i in range(start, len(nums)):
            # if there is a need to remove duplicate solutions (nums must be sorted first):
# if i > start and nums[i] == nums[i-1]:
# continue
cur.append(nums[i])
backtrack(i+1, cur)
cur.pop()
backtrack(0, [])
    return ans
Dynamic Programming
Preconditions:
- Overlapping subproblems: The problem can be broken down into subproblems which are reused multiple times
- Optimal substructure: The optimal solution to the problem can be constructed from optimal solutions of its subproblems
Outcome: Efficient solutions to problems that would require exponential time with naive approaches
Two approaches:
- Top-down (memoization): Recursive with caching
- Bottom-up (tabulation): Iterative building from base cases
# Top-down example: Fibonacci with memoization
memo = {}
def fibonacci_memo(n):
if n in memo:
return memo[n]
if n <= 1:
return n
    memo[n] = fibonacci_memo(n-1) + fibonacci_memo(n-2)
    return memo[n]

# Bottom-up example: Longest Increasing Subsequence
def longest_increasing_subsequence(nums):
if not nums:
return 0
n = len(nums)
# dp[i] = length of LIS ending at index i
dp = [1] * n
for i in range(1, n):
for j in range(i):
if nums[i] > nums[j]:
dp[i] = max(dp[i], dp[j] + 1)
    return max(dp)
Data Structures
You may have to use multiple data structures simultaneously to solve a problem, for example an LRU cache (a hash map combined with a doubly linked list).
Linked List
Preconditions: a pointer to the node to manipulate in the sequence; this could be the head or a reference to a node in the middle of the list
Outcome: O(1) insertion and deletion of elements given a pointer to the relevant node
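No code is given in the original; below is a minimal sketch of these O(1) operations on a singly linked list (the class and function names are illustrative):
class ListNode:
    def __init__(self, val=0, nxt=None):
        self.val = val
        self.next = nxt

# O(1): splice a new node in right after the given node
def insert_after(node, val):
    node.next = ListNode(val, node.next)

# O(1): unlink the node right after the given node
def delete_after(node):
    if node.next:
        node.next = node.next.next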
Heap
Preconditions: interested in repeatedly finding the k out of n elements that meet some condition (largest, smallest etc.), need faster than O(n log n) running time, streaming data
Output: kth largest / smallest element in a stream, the k elements of interest, median (see the sketch below)
- Time complexity: O(n) to turn an existing array in-place into a heap, O(log n) for pop / push operations
- Space complexity: O(n) for the elements; the array’s indices implicitly store the heap tree structure
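As an illustration (not in the original), Python’s heapq module maintains a min-heap in a plain list; a common pattern for the k largest elements is to cap a min-heap at size k:
import heapq
# k largest elements in O(n log k) by keeping a min-heap of size k
def k_largest(nums, k):
    heap = []
    for x in nums:
        heapq.heappush(heap, x)
        if len(heap) > k:
            heapq.heappop(heap) # evict the smallest of the current top k
    return heap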
Stack
Preconditions: first in, last out (LIFO) access
Outcome: reversing a list of elements, processing items in reverse order of insertion
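A Python list already provides these operations; a trivial sketch (not in the original):
stack = []
stack.append(1) # push
stack.append(2)
top = stack[-1] # peek: 2
stack.pop() # pop: 2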
Queue
Precondition: first in, first out (FIFO) access
Outcome: processing elements in their order of arrival, as in BFS
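collections.deque gives O(1) operations at both ends, unlike popping from the front of a list; a trivial sketch (not in the original):
from collections import deque
q = deque()
q.append(1) # enqueue at the back
q.append(2)
front = q.popleft() # dequeue from the front: 1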
Trie
Preconditions: multiple values that share sequential subvalues (e.g. string prefixes), O(k) lookup time where k is the length of the key
Outcome: Efficient storage of all values, ability to find all values that share a common sequential prefix
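No reference implementation is given in the original; here is a minimal trie sketch for strings (class and method names are my own):
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_word = False # True if a stored word ends here

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    # returns the node matching the prefix, or None
    def _walk(self, prefix):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return None
            node = node.children[ch]
        return node

    def search(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._walk(prefix) is not None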