What are the five steps for solving any dynamic programming problem?

Definition (state the meaning of dp[i] or dp[i][j] in one sentence), base case, transition/induction rule (how the current state depends on smaller ones), termination (which cell is the answer), and optimization (compress space by spotting which cells the transition actually touches). If you cannot complete step 1, do not write the array — the other four steps have nothing to stand on.

What is the difference between dynamic programming and recursion?

They compute the same recurrence in opposite directions. Plain recursion runs top-down from large to small and stores nothing, so overlapping subproblems get recomputed exponentially. DP runs bottom-up from small to large and stores every subresult in a table. Memoized recursion is the bridge: top-down like recursion, but with a cache like DP. LC 120 Triangle demonstrates all three in one problem.

When should you use greedy instead of dynamic programming?

Use greedy only when each locally optimal choice provably forces the global optimum, so you never need to remember alternative sub-solutions. Jump Game is the canonical case: instead of an O(n^2) DP over every reachable path, you track the single farthest index reachable so far and answer in O(n). If a counterexample can make a locally good choice lose globally, greedy is wrong and you fall back to DP.

Leetcode S11 — Dynamic Programming & Greedy

Q: How do you recognize a two-sequence DP problem?

When the input is two strings or arrays and the answer depends on aligning them, model it as a grid dp[i][j] where i indexes the first sequence and j the second. The transition almost always reads three neighbours — diagonal (dp[i-1][j-1]), up (dp[i-1][j]), and left (dp[i][j-1]). Edit Distance, Distinct Subsequences, and Interleaving String are all this same grid with different transition rules.

Dynamic programming is where interview loops go to die — not because the idea is hard, but because candidates reach for a dp[] array before they can say what dp[i] means. This session rebuilds DP from its actual foundation, which is nothing more exotic than mathematical induction: a base case plus a rule that grows small answers into big ones. We work one problem through the full ladder from brute-force recursion to a one-line bottom-up table (LC 120 Triangle), then sweep the whole family — sequence DP, two-sequence grid DP, string DP — and finish where DP quietly loses to greedy on Jump Game and cut-rope.

⚡ Quick Takeaways

DP is induction with a memo — a base case plus a transition that builds answers from smaller to larger, storing every subresult so nothing is recomputed. Recursion is the same recurrence run top-down without the cache.
Five steps, no exceptions — definition, base case, transition, answer, optimization. Skip step 1 — what does dp[i] mean? — and every later step collapses into guesswork.
The Triangle ladder is the whole course — LC 120 in four solutions (2ⁿ DFS → memoized recursion → O(n²) bottom-up → 1D rolling) shows exactly how recursion becomes DP.
Rolling state is free space — most 1D DP compresses O(n)→O(1) and 2D compresses O(mn)→O(n) the moment you notice which cells the transition actually reads.
Two sequences make a grid — Edit Distance, Distinct Subsequences and Interleaving String are all dp[i][j] over a matrix whose transition reads diagonal / up / left.
Greedy wins when local optima chain — Jump Game drops O(n²) DP to O(n) by tracking the single farthest reachable index instead of remembering every path.

tldr

Never write dp[] until you can state dp[i]'s meaning in one sentence. Solve every DP by induction — base case, transition from smaller subproblems, answer cell — then optimize space by seeing which cells the transition touches. Reach for greedy only when each locally optimal choice provably forces the global optimum; Jump Game and cut-rope are the canonical "greedy quietly wins" problems.

The DP Mindset: Induction with a Memo

The single most useful reframe in this session: dynamic programming is mathematical induction that remembers its work. Induction needs two things — a base case, and a rule that derives step n from steps before it. DP adds one thing on top: it writes each answer into a table so overlapping subproblems are solved once, not exponentially many times. That "store to trade space for time" is the entire trick.

Turn that into a checklist you run on every problem:

Definition — say precisely what dp[i] or dp[i][j] means. This is 80% of the difficulty; the rest is bookkeeping.
Base case / start — the smallest states you can fill directly.
Transition / induction rule — how the current state depends on earlier ones. Does it look back one step, a few steps, or all previous states? That question shapes the inner loop.
Termination / answer — which cell (or the max/min over cells) is the final answer.
Optimization — compress space once the dependency pattern is visible.

DP and recursion are the same recurrence run in opposite directions, and interviewers love to hear you name the distinction:

	Direction	Stores results?	Overlap cost
Plain recursion	top-down (large → small)	no	recomputes, often exponential
Memoized recursion	top-down + cache	yes	each subproblem once
Bottom-up DP	bottom-up (small → large)	yes	each subproblem once

When should the shape of a problem even suggest DP? Watch for these keyword tells: max/min, longest/shortest, count the number of ways, or a yes/no / true-false feasibility question over a sequence. One honest caveat the notes stress: DP hands you the optimal value, not the plan that achieves it. If a problem wants the actual assignment — the specific cuts, the specific sentence — you either backtrack the filled table or drop back to DFS. Hold that thought; it decides Word Break II later.

Fibonacci: One Recurrence, Three Bodies

Fibonacci (and its twin, Climbing Stairs — ways(n) = ways(n-1) + ways(n-2)) is the smallest possible DP and the cleanest demo of the optimization step. The recurrence is fixed: f(n) = f(n-1) + f(n-2), base cases f(0)=0, f(1)=1. Only the implementation changes.

Naïve recursion is a two-line function that is also a performance trap: f(5) recomputes f(3) twice, f(2) three times, and the call tree balloons to O(2ⁿ). An O(n) table fixes that — fill dp[0..n] left to right. But look at the transition: dp[i] only ever reads dp[i-1] and dp[i-2]. Two variables suffice, so we reduce the dimension to O(1) space:

Java — Fibonacci, O(n) time / O(1) space

public long fib(int n) {
    if (n < 0) throw new IllegalArgumentException();
    if (n <= 1) return n;                // base cases f(0)=0, f(1)=1
    long pp = 0, p = 1, ans = 1;    // window over the two predecessors
    for (int i = 2; i <= n; i++) {
        ans = p + pp;                     // f(i) = f(i-1) + f(i-2)
        pp = p;                           // slide the window forward one step
        p = ans;
    }
    return ans;
}

The lesson generalizes past Fibonacci: DP optimization almost never improves time — it buys back space by collapsing dimensions. Find which prior cells the transition touches, keep only those, and throw the rest of the table away.

The Recursion → Memoization → DP Ladder (LC 120 Triangle)

If you internalize one problem this session, make it LC 120 Triangle: given a triangle of numbers, find the minimum path sum from top to bottom, each step moving to an adjacent number one row down. It is the single clearest demonstration of how a brute-force recursion becomes a one-line DP.

First a warm-up on building rows. LC 118 Pascal's Triangle is pure induction — each interior entry is pre.get(j-1) + pre.get(j), edges are 1. LC 119 Pascal's Triangle II asks for a single row in O(k) space; the trick is to update one list in place from right to left so each cell is overwritten only after its left neighbour has already contributed. Same idea that lets us shrink Triangle's table to 1D.

Now the four solutions to Triangle, in order of enlightenment:

S1 — DFS over all paths. From (i, j) recurse into (i+1, j) and (i+1, j+1), tracking a running sum, updating a global min at the base row. Correct, but the two branches overlap: a node deep in the triangle is reached by many paths, so cost is 1 + 2 + 2² + … = O(2ⁿ), not O(n²).
S2 — divide & conquer recursion. Return min(left, right) + triangle[i][j] instead of threading a sum through parameters. Cleaner, still O(2ⁿ) — same overlapping subproblems.
S3 — memoized recursion (pruning). Add a dp[i][j] cache initialized to -1 and short-circuit: if (dp[i][j] != -1) return dp[i][j];. That one line collapses the exponential tree to O(n²) — every node is computed once. This is top-down DP.
S4 — bottom-up DP. Start from the last row (each node's min-path is itself) and roll upward. Because you always build children before parents, no pruning check is ever needed — the dependency order guarantees freshness.

And bottom-up compresses beautifully to a single 1D array, because row k only needs row k+1:

Java — LC 120, bottom-up 1D DP

public int minimumTotal(List<List<Integer>> triangle) {
    int n = triangle.size();
    int[] dp = new int[n + 1];         // extra slot keeps dp[j+1] in-bounds on the base row
    for (int i = n - 1; i >= 0; i--) {   // build upward from the bottom
        for (int j = 0; j <= i; j++) {    // row i has i+1 elements
            dp[j] = Math.min(dp[j], dp[j + 1]) + triangle.get(i).get(j);
        }
    }
    return dp[0];
}

why bottom-up needs no memo check

Top-down recursion discovers subproblems in an unpredictable order, so it needs the dp[i][j] != -1 guard to avoid recomputation. Bottom-up visits states in dependency order by construction — every value it reads is already final. That is the deeper reason interviewers nudge you from memoized recursion toward a clean loop: the loop is the topological sort of the recursion.

Sequence DP: Longest Runs

The first genuine family is sequence DP, where dp[i] summarizes something about the prefix ending at index i. The cleanest example: the length of the longest ascending run in an unsorted array. Here substring means contiguous and subsequence means order-preserving but gappy — a distinction interviewers deliberately blur.

For the contiguous version, dp[i] = length of the longest ascending run ending at i: if arr[i] > arr[i-1] then dp[i] = dp[i-1] + 1, else it resets to 1; the answer is the running max. Since dp[i] reads only dp[i-1], one variable replaces the array — O(1) space.

The subsequence version is the famous LC 300 Longest Increasing Subsequence, and it earns its own trick. The O(n²) DP (dp[i] = longest LIS ending at i, scanning all j < i) is fine, but the elegant O(n log n) solution maintains a tails array — tails[k] is the smallest possible tail of any increasing subsequence of length k+1 — and binary-searches each new element's insertion point. The array being searched is built by the algorithm itself and stays sorted by construction, a pattern we first met back in binary search as a subroutine.

Sequence DP: Kadane and Kin

Three back-to-back classics that all live in O(1) rolling state once you nail the definition.

LC 53 Maximum Subarray is the archetype (Kadane's algorithm). Define dp[i] = the max subarray sum ending at i. The choice is binary: extend the previous run or start fresh — dp[i] = nums[i] + max(dp[i-1], 0). Answer is max(dp[i]). The max(dp[i-1], 0) is the whole idea: a negative prefix is worse than nothing, so drop it.

LC 152 Maximum Product Subarray looks identical and is a beautiful trap. Multiplication makes sign matter: a large negative product becomes the best product the instant it meets another negative. So you must carry both the running max and the running min:

Java — LC 152, carry max and min

public int maxProduct(int[] nums) {
    if (nums == null || nums.length == 0) return 0;
    int maxPre = nums[0], minPre = nums[0], globalMax = nums[0];
    for (int i = 1; i < nums.length; i++) {
        int a = maxPre * nums[i], b = minPre * nums[i];
        int max = Math.max(nums[i], Math.max(a, b));   // best ending here
        int min = Math.min(nums[i], Math.min(a, b));   // worst ending here — a future negative may flip it
        globalMax = Math.max(globalMax, max);
        maxPre = max;                                // carry both extremes forward
        minPre = min;
    }
    return globalMax;
}

LC 198 House Robber is the same rolling shape wearing a story: you cannot rob two adjacent houses, so dp[i] = max(dp[i-1], dp[i-2] + nums[i]) — skip this house, or take it plus the best up to two back. Two variables (pre, prepre) walk the array in O(1) space. Once you see Fibonacci's window here, the "two-back dependency" pattern becomes instantly recognizable.

Two-Sequence DP: Everything Becomes a Grid

When the input is two sequences and the answer depends on aligning them, the state goes 2D: dp[i][j] over a grid, i indexing the first sequence, j the second. The transition almost always reads three neighbours — diagonal, up, left.

LC 72 Edit Distance is the canonical grid: minimum insert/delete/replace operations to turn word1 into word2. Define dp[i][j] = edit distance between the first i chars of word1 and first j of word2. The first row and column are free (turning a string into "" costs one delete per char, and vice versa). Then each move has a meaning: up = delete, left = insert, diagonal = replace. If the current chars match, the diagonal is free; otherwise take the cheapest of the three and add one.

Java — LC 72, the three-direction grid

public int minDistance(String word1, String word2) {
    int l1 = word1.length(), l2 = word2.length();
    int[][] dp = new int[l1 + 1][l2 + 1];
    for (int i = 0; i <= l1; i++) dp[i][0] = i;   // delete i chars to reach ""
    for (int j = 0; j <= l2; j++) dp[0][j] = j;   // insert j chars from ""

    for (int i = 1; i <= l1; i++) {
        for (int j = 1; j <= l2; j++) {
            if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                dp[i][j] = dp[i - 1][j - 1];        // chars match — carry the diagonal
            } else {
                int replace = dp[i - 1][j - 1];
                int delete  = dp[i - 1][j];
                int insert  = dp[i][j - 1];
                dp[i][j] = Math.min(replace, Math.min(delete, insert)) + 1;
            }
        }
    }
    return dp[l1][l2];
}

Because dp[i][j] depends only on the previous row plus the just-computed cell to its left, the grid optimizes to O(n) space with a single 1D array and one saved prev value — the exact same dimension-collapse we did for Fibonacci.

The grid recurs everywhere:

LC 161 One Edit Distance — a lightweight cousin you solve without a table. Walk to the first differing char, then branch on three cases: equal lengths → the rest after replacing must match; longer word1 → skip one char in word1; longer word2 → skip one in word2. Watch the i+1 out-of-bounds edge.
Minimum steps to a palindrome — for every cut point, reverse the right half and run Edit Distance against the left. The grid is a subroutine inside a scan.
LC 115 Distinct Subsequences — count how many times T appears as a subsequence of S. Grid mem[i][j] = occurrences of T[0..i) in S[0..j); first row all 1 (the empty string is a subsequence exactly once), first column 0. If S[j]==T[i] you sum "use this char" plus "skip it" (mem[i][j] + mem[i+1][j]), else you inherit the skip case.
LC 97 Interleaving String — is s3 an interleaving of s1 and s2? Picture a grid whose axes are s1 and s2; the answer is whether a path exists from (0,0) to (len1, len2). Solvable three ways — BFS over the grid, DFS with a visited set, or the clean dp[i][j] that ORs "came from up matching s1" with "came from left matching s2".

One string outlier from the same session worth a line: LC 187 Repeated DNA Sequences isn't DP at all — it's a rolling hash / bitmask sliding a 10-char window and encoding each nucleotide in two bits, so each window is an int you can look up in O(1). Different tool, same "reuse the previous computation" instinct that powers DP.

Word Break: DP Decides, DFS Enumerates

LC 139 Word Break asks a yes/no question — can s be segmented into dictionary words? — which is a textbook true/false DP. Define dp[i] = "is the prefix of length i breakable?", with dp[0] = true (the empty prefix). Then dp[i] is true if some split point j has dp[j] true and s[j..i) in the dictionary.

Java — LC 139, boolean prefix DP

public boolean wordBreak(String s, List<String> wordDict) {
    Set<String> dict = new HashSet<>(wordDict);   // O(1) lookups, not O(n) on a List
    boolean[] dp = new boolean[s.length() + 1];
    dp[0] = true;                                   // empty prefix is always breakable
    for (int i = 1; i <= s.length(); i++) {
        for (int j = 0; j < i; j++) {
            if (dp[j] && dict.contains(s.substring(j, i))) {
                dp[i] = true;
                break;                             // one valid split is enough
            }
        }
    }
    return dp[s.length()];
}

LC 140 Word Break II changes one word — "return all sentences" — and that word breaks DP. This is exactly the "DP gives the value, not the plan" limit from the opening: a boolean table can't enumerate paths. The fix is DFS with memoized pruning: recurse from each index, append each matching word, and keep a hasResult[] flag so that once you learn an index leads nowhere, you never explore it again. DP prunes the decision; DFS reconstructs the evidence.

When Greedy Beats DP

Greedy is the discipline of committing to the locally best choice and never looking back. Its precondition is strict and worth stating out loud: greedy is valid only when every locally optimal choice provably leads to a global optimum. When that holds, you skip the entire table.

LC 55 Jump Game — can you reach the last index, given each cell's max jump length? Three tiers of solution tell the whole story. Naïve recursion tries every jump, O(branchⁿ). A boolean DP marks each index reachable if any index that can jump to it is reachable, O(n²)-ish. But the greedy insight collapses it to O(n): sweep left to right tracking the farthest index reachable so far; if your position ever exceeds that reach you fail, and if the reach covers the last index you win. You never decide where to jump — only how far you could.

LC 45 Jump Game II asks for the minimum number of jumps. The O(n²) DP (dp[i] = min jumps from i to the end) works, but the O(n) greedy is a disguised BFS by layers: each "jump" is a frontier, and you advance the layer boundary only when the current one is exhausted.

Java — LC 45, greedy BFS-by-layers

public int jump(int[] nums) {
    int steps = 0, curEnd = 0, farthest = 0;
    for (int i = 0; i < nums.length - 1; i++) {   // stop before the last index
        farthest = Math.max(farthest, i + nums[i]);   // widen the current frontier
        if (i == curEnd) {                            // exhausted this jump's reach
            steps++;                                  // commit one more jump
            curEnd = farthest;                        // its reach is the new layer boundary
        }
    }
    return steps;
}

The cut-rope problem (maximize the product of integer parts a rope is cut into) makes the same DP-then-greedy jump. The DP is honest — dp[i] = best product for length i, trying every first cut j <= i/2 as max(j, dp[j]) * max(i-j, dp[i-j]) — at O(n²). But a math observation kills it: to maximize a product with a fixed sum you want factors as equal as possible, and integer-wise threes win. So the greedy answer just peels off 3s (keeping a final 2 or 4 whole), giving O(n):

Handle n = 2 and n = 3 explicitly (must cut at least once → n-1).
While n > 4, subtract 3 and multiply the result by 3.
Multiply whatever remains (2, 3, or 4) at the end.

Other members of the greedy hall of fame from the notes — Candy, Gas Station — follow the same litmus test: prove the local rule forces the global optimum, then trust it. If you can construct a counterexample where a locally good move loses globally, greedy is simply the wrong tool and you retreat to DP.

Problem Checklist

Problem	Pattern	Complexity
Fibonacci / Climbing Stairs	rolling 1D induction	O(n) time, O(1) space
LC 118 / 119 Pascal's Triangle	row-building DP, in-place row	O(k²) / O(k) space
LC 120 Triangle	recursion → memo → bottom-up 1D	O(n²)
Longest Ascending Run	dp[i] = dp[i-1]+1 or reset	O(n), O(1)
LC 300 Longest Increasing Subseq.	tails array + binary search	O(n log n)
LC 53 Maximum Subarray	Kadane, max(dp[i-1],0)	O(n), O(1)
LC 152 Maximum Product Subarray	carry max & min together	O(n), O(1)
LC 198 House Robber	pre / prepre two-back rolling	O(n), O(1)
LC 72 Edit Distance	2D grid, delete/insert/replace	O(mn) → O(n)
LC 161 One Edit Distance	first-diff casework	O(n)
LC 115 Distinct Subsequences	2D count grid	O(mn)
LC 97 Interleaving String	2D path grid (or BFS/DFS)	O(mn)
LC 187 Repeated DNA	rolling hash / 2-bit mask	O(n)
LC 139 Word Break	boolean prefix DP	O(n² · L)
LC 140 Word Break II	DFS + memo (enumerate)	exponential output
Cut Rope (max product)	DP → cut-into-3s math	O(n²) → O(n)
LC 55 Jump Game	greedy farthest reach	O(n)
LC 45 Jump Game II	greedy BFS by layers	O(n)

takeaway

Dynamic programming is induction that keeps a memo: define dp[i] in one sentence, write the base case, express the transition from smaller states, read the answer cell, then compress space by seeing which cells the transition touches. Memoized recursion and bottom-up loops compute the same thing — the loop is just the recursion sorted into dependency order. Two sequences mean a dp[i][j] grid with diagonal/up/left moves. And when each locally optimal choice provably forces the global one, drop the table and go greedy — Jump Game and cut-rope are the tells.

🎯 interview hot-takes

What's the five-step DP recipe? Definition (what dp[i] means), base case, transition, answer cell, space optimization. Nail step 1 first — everything else depends on it.
DP vs recursion — same thing? Same recurrence, opposite directions. Recursion runs top-down and stores nothing (exponential on overlap); DP runs bottom-up and caches every subresult. Memoized recursion is the bridge, and LC 120 Triangle shows all three.
When does greedy beat DP? When every locally optimal choice provably forces the global optimum, so you needn't remember alternatives — Jump Game drops O(n²) to O(n) by tracking the farthest reachable index.
How do you spot two-sequence DP? Two strings/arrays that must be aligned → a dp[i][j] grid whose transition reads diagonal, up, and left. Edit Distance, Distinct Subsequences, Interleaving String are one grid with three rules.
Can DP recover the actual plan, not just the number? Not directly — DP yields the optimal value. To enumerate the actual solutions you backtrack the table or switch to DFS, which is exactly why Word Break II abandons the boolean DP for memoized DFS.