CS 61A / CS 98-52
Mehrdad Niknami
University of California, Berkeley


Brachistochrone Problem

[Figure: the ramp discretized into line segments]

Joking aside... most problems don't have a nice formula, so you'll need algorithms. Let's get our hands dirty!

Remember Riemann sums? This is similar:
  1. Chop up the ramp into line segments (but hold the ends fixed)
  2. Move the anchors around to minimize travel time

Q: How do you do this?
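Spelling out the analogy (my own gloss, not a slide): just as a Riemann sum approximates an area by adding up many small pieces, the travel time down the discretized ramp is approximated as T ≈ Σ_i Δt_i, where Δt_i is the time to slide along segment i. The anchor points are then the variables we move around to make T as small as possible.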

Algorithm

Use Newton-Raphson!
...but wasn't that for finding roots? Not optimizing?

Actually, it's used for both:
  - If F is differentiable, minimizing F reduces to root-finding: F'(x) = f(x) = 0
  - Caveat: must avoid maxima and inflection points
  - Easy in 1-D: only the ± directions to check for increase/decrease
  - Good luck in N-D... infinitely many directions
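A tiny worked example (not from the slides): to minimize F(x) = (x - 2)^2, set f(x) = F'(x) = 2(x - 2) and look for a root of f. The root is x = 2, and since F''(2) = 2 > 0 it is a minimum rather than a maximum or an inflection point.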

Algorithm

Newton-Raphson method for optimization:
  1. Assume F is approximately quadratic [1] (so f = F' is approximately linear)
  2. Guess some x_0 intelligently
  3. Repeatedly solve the linear approximation [2] of f = F':
       f(x_k) - f(x_{k+1}) = f'(x_k) (x_k - x_{k+1}),   f(x_{k+1}) = 0
       =>  x_{k+1} = x_k - f'(x_k)^(-1) f(x_k)
     We ignored F! Avoid maxima and inflection points! (How?)
  4. ...Profit?

[1] Why are quadratics common? Energy/cost are quadratic (K = (1/2) m v^2, P = I^2 R, ...)
[2] You'll see linearization ALL the time in engineering
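A minimal 1-D sketch of the update rule above (my own code, not the deck's; the function names and the stopping tolerance are illustrative):

def newton_minimize_1d(f, df, x0, tol=1e-10, max_iter=100):
    # Minimize F by finding a root of f = F'; df is f' = F''.
    x = x0
    for _ in range(max_iter):
        step = -f(x) / df(x)      # solve the linear approximation of f at x
        x += step
        if abs(step) < tol:       # stop once the update is negligible
            break
    return x

# Example: minimize F(x) = (x - 2)**2, so f(x) = 2*(x - 2) and f'(x) = 2.
print(newton_minimize_1d(lambda x: 2 * (x - 2), lambda x: 2.0, x0=10.0))  # -> 2.0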

Algorithm

Wait, but we have a function of many variables. What do?

A couple of options:
  1. Fully multivariate Newton-Raphson:
       x_{k+1} = x_k - [∇f(x_k)]^(-1) f(x_k)    (here x and f are vectors and ∇f is the Jacobian)
     Taught in EE 219A, 227C, 144/244, etc... (need Math 53 and 54)
  2. Newton coordinate descent
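For reference, a hedged sketch (not from the deck) of what a single fully multivariate step could look like with NumPy, assuming you can supply f and its Jacobian:

import numpy as np

def newton_step_multivariate(f, jacobian, x):
    # One step: solve J(x) dx = -f(x) instead of forming the inverse explicitly.
    dx = np.linalg.solve(jacobian(x), -f(x))
    return x + dx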

Algorithm

Coordinate descent:
  1. Take x_1, use it to minimize F, holding the others fixed
  2. Take y_1, use it to minimize F, holding the others fixed
  3. Take x_2, use it to minimize F, holding the others fixed
  4. Take y_2, use it to minimize F, holding the others fixed
  5. ...
  6. Cycle through again

Doesn't work as often, but it works very well here.

Algorithm

Newton step for minimization:

def newton_minimizer_step(F, coords, h):
    delta = 0.0
    for i in range(1, len(coords) - 1):                   # skip the fixed endpoints
        for j in range(len(coords[i])):                   # each coordinate (x, then y)
            def f(c): return derivative(F, c, i, j, h)    # f = dF/d(coords[i][j])
            def df(c): return derivative(f, c, i, j, h)   # df = second derivative
            step = -f(coords) / df(coords)
            delta += abs(step)
            coords[i][j] += step
    return delta

Side note: Notice a potential bug? What's the fix?
Notice a 33% inefficiency? What's the fix?

Algorithm

Computing derivatives numerically:

def derivative(f, coords, i, j, h):
    x = coords[i][j]
    coords[i][j] = x + h; f2 = f(coords)
    coords[i][j] = x - h; f1 = f(coords)
    coords[i][j] = x                       # restore the original value
    return (f2 - f1) / (2 * h)             # central difference

Why not (f(x + h) - f(x)) / h?
Breaking the intrinsic symmetry reduces accuracy.

~ Words of Wisdom ~
If your problem has {fundamental feature} that your solution doesn't, you've created more problems.
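A quick sanity check of that claim (my own snippet, not from the deck), comparing the one-sided and central formulas on a function whose derivative is known exactly:

import math

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

x, h = 1.0, 1e-4
exact = math.cos(x)                                 # d/dx sin(x) = cos(x)
print(abs(forward_diff(math.sin, x, h) - exact))    # error on the order of h   (~4e-5)
print(abs(central_diff(math.sin, x, h) - exact))    # error on the order of h^2 (~1e-9)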

Algorithm

What is our objective function F to minimize?

def falling_time(coords):
    # coords = [[x1, y1], [x2, y2], ...]
    t, speed = 0.0, 0.0
    prev = None
    for coord in coords:
        if prev != None:
            dy = coord[1] - prev[1]
            d = ((coord[0] - prev[0]) ** 2 + dy ** 2) ** 0.5
            accel = -9.80665 * dy / d            # gravity component along the segment
            for dt in quadratic_roots(accel, speed, -d):
                if dt > 0:
                    speed += accel * dt
                    t += dt
        prev = coord
    return t
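Reading the code: each segment is traversed with constant tangential acceleration accel = -g * dy / d, so the traversal time dt satisfies d = speed * dt + (accel / 2) * dt^2, i.e. (accel / 2) * dt^2 + speed * dt - d = 0. That is why quadratic_roots receives accel directly as its two_a argument: the quadratic's leading coefficient is accel / 2.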

Algorithm

Let's define quadratic_roots...

def quadratic_roots(two_a, b, c):
    # Roots of (two_a / 2) * x**2 + b * x + c = 0.
    D = b * b - 2 * two_a * c        # discriminant b^2 - 4ac, since two_a = 2a
    if D >= 0:
        if D > 0:
            r = D ** 0.5
            roots = [(-b + r) / two_a, (-b - r) / two_a]
        else:
            roots = [-b / two_a]
    else:
        roots = []
    return roots
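A quick usage check (my own, not from the deck):

# Solve x^2 - 3x + 2 = 0 (a = 1, so two_a = 2); the roots are 2 and 1.
print(quadratic_roots(2.0, -3.0, 2.0))   # [2.0, 1.0]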

Algorithm

Aaaaaand put it all together:

def main(n=6):
    (y1, y2) = (1.0, 0.0)
    (x1, x2) = (0.0, 1.0)
    coords = [   # initial guess: straight line
        [x1 + (x2 - x1) * i / n, y1 + (y2 - y1) * i / n]
        for i in range(n + 1)
    ]
    f = falling_time
    h = 0.00001
    while newton_minimizer_step(f, coords, h) > 0.01:
        print(coords)

if __name__ == '__main__':
    main()
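(For reference: the exact brachistochrone is an arc of a cycloid, so as the loop converges the printed coordinates should settle onto a cycloid-like curve between (0, 1) and (1, 0).)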

Algorithm

(Demo)

Analysis

Error analysis:

If x_∞ is the root and ε_k = x_k - x_∞ is the error, then:

  x_{k+1} - x_∞ = (x_k - x_∞) - f(x_k) / f'(x_k)                              (Newton step)
  ε_{k+1} = ε_k - f(x_k) / f'(x_k)                                            (error step)
  ε_{k+1} = ε_k - [f(x_∞) + ε_k f'(x_∞) + (1/2) ε_k^2 f''(x_∞) + ···]
                  / [f'(x_∞) + ε_k f''(x_∞) + ···]                            (Taylor series; f(x_∞) = 0)
  ε_{k+1} = [(1/2) ε_k^2 f''(x_∞) + ···] / [f'(x_∞) + ε_k f''(x_∞) + ···]     (simplify)

As ε_k → 0, the "···" terms are quickly dominated. Therefore:
  - If f'(x_∞) ≈ 0, then ε_{k+1} ∝ ε_k (slow: # of correct digits adds)
  - Otherwise, ε_{k+1} ∝ ε_k^2 (fast: # of correct digits doubles)
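A quick illustration of the digit-doubling behavior (my own snippet, not from the deck): Newton-Raphson on f(x) = x^2 - 2, whose root √2 is simple, so f'(x_∞) ≠ 0:

import math

x = 1.0                                   # starting guess
for k in range(6):
    print(k, x, abs(x - math.sqrt(2)))    # the error roughly squares each step
    x = x - (x * x - 2) / (2 * x)         # Newton step for f(x) = x^2 - 2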

Analysis

Some failure modes:
  - f is flat near the root: too slow
  - f'(x) ≈ 0  =>  shoots off into infinity (n.b. "if x != 0" is not a solution)
  - Stable oscillation trap

Intuition: Think adversarially: create a "tricky" f that looks root-less
  - Obviously this is possible... just put the root far away
  - Therefore Newton-Raphson can't be foolproof
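A classic instance of the oscillation trap (my own example, not from the deck): f(x) = x^3 - 2x + 2 with starting guess x_0 = 0 bounces between 0 and 1 forever.

def f(x):  return x ** 3 - 2 * x + 2
def df(x): return 3 * x ** 2 - 2

x = 0.0
for _ in range(6):
    x = x - f(x) / df(x)
    print(x)       # alternates 1.0, 0.0, 1.0, 0.0, ... and never converges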

Final thoughts

Notes: There are subtleties I brushed under the rug:
  - The physics is much more complicated (why?)
