Longest-Common-Subsequence Problem
https://leetcode.com/problems/longest-common-subsequence/
Definition
Given a sequence , another sequence is a subsequence of if there exists a strictly increasing sequence such that .
Algorithm
- Optimal substructure: the LCS of two sequences contains within it an LCS of prefixes of the two sequences.
- Recursive solution
- If , find LCS of and
- If , find the LCS of and , and that of and . Pick the longer LCS.
Approach 1: Memoization
- Discuss whether
text1[p1]is included in the optimal solution or not. - If it is, we can find the first occurrence of this char in
text2and the problem is reduced. - Time complexity: , as there are possible pairs of strings, and solving each of them can take time, as we’re searching the occurrence of a char in a string of length .
from functools import lru_cache
m, n = len(text1), len(text2)
@lru_cache()
def helper(p1, p2):
# Base case: empty strings
if p1 == m or p2 == n:
return 0
# Option 1: We don't include text1[p1] in the solution.
option_1 = helper(p1 + 1, p2)
# Option 2: We include text1[p1] in the solution, as long as a match for it
# in text2 at or after p2 exists.
first_occurence = text2.find(text1[p1], p2)
if first_occurence != -1:
option_2 = 1 + helper(p1 + 1, first_occurence + 1)
else:
option_2 = 0
# Return the best option.
return max(option_1, option_2)
return helper(0, 0)Approach 2: Improved Memoization
A more intuitive way of structuring the subproblems.
- If the current char is the same, consider them a match. This must be part of the optimal solution.
- If not, either skip a char in
text1or skip a char intext2.
Time complexity becomes , as there are combinations of substrings (or suffixes, more specifically.)
from functools import lru_cache
m, n = len(text1), len(text2)
@lru_cache(maxsize=None)
def memo_solve(p1, p2):
# Base case: If either string is now empty, we can't match up anymore
# characters.
if p1 == m or p2 == n:
return 0
# Recursive case 1.
if text1[p1] == text2[p2]:
return 1 + memo_solve(p1 + 1, p2 + 1)
# Recursive case 2.
else:
return max(memo_solve(p1, p2 + 1), memo_solve(p1 + 1, p2))
return memo_solve(0, 0)Approach 3: Bottom-Up DP
- Starting from the final char on both strings. Loop in reversed order.
- First letter is the same? add
1to length of LCS - First letter not the same? Have to kick a letter out, either from
str1orstr2. Then take the max of sub problems.
# Make a grid of 0's with len(text2) + 1 columns and len(text1) + 1 rows.
dp = [[0] * (len(text2) + 1) for _ in range(len(text1) + 1)]
# Iterate up each column, starting from the last one.
for col in reversed(range(len(text2))):
for row in reversed(range(len(text1))):
# If the corresponding characters for this cell are the same...
if text2[col] == text1[row]:
dp[row][col] = 1 + dp[row + 1][col + 1]
# Otherwise they must be different...
else:
dp[row][col] = max(dp[row + 1][col], dp[row][col + 1])
# The original problem's answer is in dp[0][0]. Return it.
return dp[0][0]Approach 4: Bottom-Up DP with Space Optimization
- Note that we’re only referencing the previously calculated column and the current column that we’re calculating.
if len(text2) < len(text1):
text1, text2 = text2, text1
# The previous column starts with all 0's and like before is 1
# more than the length of the first word.
previous = [0] * (len(text1) + 1)
# Iterate up each column, starting from the last one.
for col in reversed(range(len(text2))):
# Create a new array to represent the current column.
current = [0] * (len(text1) + 1)
for row in reversed(range(len(text1))):
if text2[col] == text1[row]:
current[row] = 1 + previous[row + 1] # previous col is to the right
else:
current[row] = max(previous[row], current[row + 1])
# The current column becomes the previous one.
previous = current
# The original problem's answer is in previous[0]. Return it.
return previous[0]An slightly adjusted version:
if len(text2) < len(text1):
text1, text2 = text2, text1
previous = [0] * (len(text1) + 1)
current = [0] * (len(text1) + 1)
for col in reversed(range(len(text2))):
for row in reversed(range(len(text1))):
if text2[col] == text1[row]:
current[row] = 1 + previous[row + 1]
else:
current[row] = max(previous[row], current[row + 1])
previous, current = current, previous
return previous[0]