{ "cells": [ { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "**********************************************************************\n", "Newton's Method for Nonlinear Equations and Unconstrained Minimization\n", "**********************************************************************" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 1\n", "==========" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy\n", "\n", "def F(x):\n", " return numpy.asfarray([2 * (x[0] + x[1])**2 + (x[0] - x[1])**2 - 8,\n", " 5 * x[0]**2 + (x[1] - 3)**2 - 9])\n", "\n", "def J(x):\n", " return numpy.asfarray([[6 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]],\n", " [10 * x[0], 2 * x[1] - 6]])\n", "\n", "x_c = numpy.asfarray([2, 0])\n", "for i in range(2):\n", " s_k = numpy.dot(numpy.linalg.inv(J(x_c)), -F(x_c))\n", " x_k = x_c + s_k\n", " print('x_{0} = {1}'.format(i, x_c))\n", " x_c = x_k" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 2\n", "==========\n", "\n", "The Jacobian becomes ill-conditioned in the second iteration i.e. it becomes the\n", "zero matrix." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy\n", "\n", "def F(x):\n", " return numpy.exp(x) - 1\n", "\n", "def J(x):\n", " return numpy.asfarray([[numpy.exp(x[0]), 0],\n", " [0, numpy.exp(x[1])]])\n", "\n", "x_c = numpy.asfarray([-10, -10])\n", "for i in range(2):\n", " _ = J(x_c)\n", " s_k = numpy.dot(numpy.linalg.inv(_), -F(x_c))\n", " x_k = x_c + s_k\n", " print('Iteration {0}'.format(i))\n", " print('x_{0} = {1}'.format(i, x_c))\n", " print('s_{0} = {1}'.format(i, s_k))\n", " print('J_{0} =\\n{1}'.format(i, _))\n", " print('x_{0} = {1}'.format(i + 1, x_k))\n", " x_c = x_k" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 3\n", "==========\n", "\n", "Define :math:`f_1(x) = x, f_2(x) = x^2 + x,` and :math:`f_3(x) = e^x - 1`.\n", "\n", "(a)\n", "---\n", "\n", ".. math::\n", "\n", " f'_1(x) = 1 \\Rightarrow f'_1(x_* = 0) = 1\n", "\n", ".. math::\n", "\n", " f'_2(x) = 2x + 1 \\Rightarrow f'_2(x_* = 0) = 1\n", "\n", ".. math::\n", "\n", " f'_3(x) = e^x \\Rightarrow f'_3(x_* = 0) = 1\n", "\n", "(b)\n", "---\n", "\n", "The Lipschitz constant for each derivative at the root :math:`x_* = 0` is\n", "\n", ".. math::\n", "\n", " \\left| f'_1(x) - f'_1(0) \\right| &\\leq \\gamma_1 \\left| x - 0 \\right|\\\\\n", " \\left| \\frac{1 - 1}{x} \\right| &\\leq \\gamma_1\\\\\n", " 0 &\\leq \\gamma_1\n", "\n", ".. math::\n", "\n", " \\left| f'_2(x) - f'_2(0) \\right| &\\leq \\gamma_2 \\left| x - 0 \\right|\\\\\n", " \\left| \\frac{2x + 1 - 1}{x} \\right| &\\leq \\gamma_2\\\\\n", " 2 &\\leq \\gamma_2\n", "\n", ".. math::\n", "\n", " \\left| f'_3(x) - f'_3(0) \\right| &\\leq \\gamma_3 \\left| x - 0 \\right|\\\\\n", " \\left| \\frac{e^x - 1}{x} \\right| &\\leq \\gamma_3\\\\\n", " &\\leq e^x \\leq \\gamma_3\n", "\n", "The last inequality holds because the Bernoulli inequality gives\n", "\n", ".. math::\n", "\n", " 1 + x \\leq \\left( 1 + \\frac{x}{n} \\right)^n\n", " \\xrightarrow{n \\rightarrow \\infty} \\exp x.\n", "\n", "Note that :math:`e^x` is an upper bound when :math:`x > 0` and becomes a lower\n", "bound when :math:`x < 0`.\n", "\n", "(c)\n", "---\n", "\n", "The derivative of each function is bounded on the interval :math:`[-a, a]`, so\n", "each function is Lipschitz continuous. The bound for each derivative is\n", "\n", ".. math::\n", "\n", " \\left| f'_1(x) \\right| &= 1\\\\\n", " &\\geq \\rho_1 = 1\n", "\n", ".. math::\n", "\n", " \\left| f'_2(x) \\right| &= \\left| 2x + 1 \\right|\\\\\n", " &\\geq \\rho_2 = 0\n", "\n", ".. math::\n", "\n", " \\left| f'_3(x) \\right| &= \\left| e^x \\right|\\\\\n", " &\\geq \\rho_3 = e^{-a}\n", "\n", "The smallest region of convergence is defined to be\n", ":math:`\\eta = \\min \\left\\{ \\hat{\\eta}, 2 \\rho / \\gamma \\right\\}` where\n", ":math:`\\hat{\\eta}` is defined to be the radius of the largest open interval\n", "around :math:`x_*`.\n", "\n", "(d)\n", "---\n", "\n", ":math:`f_1` will converge to :math:`x_* = 0` in the interval\n", ":math:`(-\\infty, \\infty)` since it only has one root and\n", ":math:`\\lim_{\\gamma \\rightarrow 0^+} \\frac{2 \\rho_1}{\\gamma_1} = \\infty`.\n", "\n", ":math:`f_2` violates the assumption that :math:`\\rho_2 > 0` at :math:`x = -0.5`\n", "causing Newton's method to fail because the derivative is zero. Splitting the\n", "interval into :math:`(-\\infty, -0.5)` and :math:`(-0.5, \\infty)` sidesteps that\n", "issue. Applying Theorem 2.4.3 to each region shows that any point in those\n", "regions will converge.\n", "\n", ":math:`f_3` will converge to :math:`x_* = 0` in the interval\n", ":math:`(-\\infty, \\infty)` since it only has one root in theory and\n", ":math:`\\lim_{a \\rightarrow \\infty} \\frac{2 \\rho_3}{\\gamma_3} =\n", "\\lim_{a \\rightarrow \\infty} 2 e^{-2a} = 0`. However, the floating-point\n", "precision restricts the region to :math:`[-6.5, 709]`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy\n", "\n", "def Newtons_Method(f, f_prime, x_c):\n", " return x_c - f(x_c) / f_prime(x_c)\n", "\n", "f = [lambda x: x,\n", " lambda x: x**2 + x,\n", " lambda x: numpy.exp(x) - 1\n", " ]\n", "f_prime = [lambda x: 1,\n", " lambda x: 2 * x + 1,\n", " lambda x: numpy.exp(x)]\n", "x_c = [-7.5, -0.6, 7.5]\n", "for i in range(16):\n", " for j in range(len(x_c)):\n", " x_c[j] = Newtons_Method(f[j], f_prime[j], x_c[j])\n", " print('x_{0} = {1}'.format(i + 1, x_c))" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 4\n", "==========\n", "\n", ".. math::\n", "\n", " J(x) =\n", " \\begin{bmatrix}\n", " 1 & 0 & 0\\\\\n", " 0 & 2 x_2 + 1 & 0\\\\\n", " 0 & 0 & e^{x_3}\n", " \\end{bmatrix}\n", "\n", "(a)\n", "---\n", "\n", ".. math::\n", "\n", " J(x_*) =\n", " \\begin{bmatrix}\n", " 1 & 0 & 0\\\\\n", " 0 & 1 & 0\\\\\n", " 0 & 0 & 1\n", " \\end{bmatrix}\n", "\n", "(b)\n", "---\n", "\n", "The Lipschitz constant on :math:`J(x) \\in \\text{Lip}_\\gamma(N(x_*, a))` is\n", "\n", ".. math::\n", "\n", " \\left\\Vert J(x) - J(x_*) \\right\\Vert\n", " &\\leq \\gamma \\left\\Vert x - x_* \\right\\Vert\n", " & \\quad & \\text{(4.1.11)}\\\\\n", " \\left\\Vert\n", " \\begin{bmatrix}\n", " 0 & 0 & 0\\\\\n", " 0 & 2 x_2 & 0\\\\\n", " 0 & 0 & e^{x_3} - 1\n", " \\end{bmatrix}\n", " \\right\\Vert\n", " &< \\gamma a\n", " & \\quad & d(x_*, x) < a\\\\\n", " \\frac{\n", " \\max \\left\\{\n", " \\left\\vert 2 x_2 \\right\\vert,\n", " \\left\\vert e^{x_3} - 1 \\right\\vert\n", " \\right\\}\n", " }{a} &< \\gamma\n", " & \\quad & \\text{1-norm is an upper bound for the numerator}\n", "\n", "(c)\n", "---\n", "\n", "1-norm is the largest possible value for\n", ":math:`\\left\\Vert J(x_*)^{-1} \\right\\Vert`, so the region of convergence is\n", "defined as\n", ":math:`\\epsilon = \\min \\left\\{ a, \\frac{1}{2 \\beta \\gamma} \\right\\}` where\n", ":math:`0 < \\left\\Vert J(x_*)^{-1} \\right\\Vert \\leq 1 \\leq \\beta`. Hence\n", "\n", ".. math::\n", "\n", " \\epsilon =\n", " \\min \\left\\{\n", " a,\n", " \\frac{a}{\n", " \\max \\left\\{\n", " \\left\\vert 4 x_2 \\right\\vert, \\left\\vert 2e^{x_3} - 2 \\right\\vert\n", " \\right\\}\n", " }\n", " \\right\\}\n", "\n", "(d)\n", "---\n", "\n", "When :math:`(x_0)_3 = 0,\n", "\\epsilon = \\min \\left\\{ a, \\frac{a}{\\left\\vert 4 x_2 \\right\\vert} \\right\\}`.\n", "When :math:`(x_0)_2 = (x_0)_3 = 0, \\epsilon = a` because as\n", ":math:`(x_0)_2 \\rightarrow 0` or :math:`(x_0)_3 \\rightarrow 0`, the denominator\n", "grows infinitely large." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 5\n", "==========\n", "\n", "Proof by induction starts at the base case of :math:`x_0`, so the goal is to\n", "show :math:`x_1 = x_0 - J(x_0)^{-1} F(x_0)` is well-defined i.e. :math:`J(x_0)`\n", "is nonsingular.\n", "\n", "Since :math:`J(x_*)^{-1}` exists and\n", ":math:`\\left\\Vert J(x_*)^{-1} \\right\\Vert \\leq \\beta` where :math:`\\beta > 0` by\n", "assumption, :math:`J(x_*)` is nonsingular. Theorem 3.14 states that\n", ":math:`J(x_0)` is nonsingular when\n", ":math:`\\left\\Vert J(x_*)^{-1} \\left( J(x_0) - J(x_*) \\right) \\right\\Vert < 1`\n", "holds. Recall that :math:`J` is Holder continuous within\n", ":math:`N(x_*, r)` where :math:`r > 0` and :math:`\\alpha \\in (0, 1]` such that\n", "\n", ".. math::\n", "\n", " \\left\\Vert J(x_*)^{-1} \\left( J(x_0) - J(x_*) \\right) \\right\\Vert\n", " &\\leq \\left\\Vert\n", " J(x_*)^{-1} \\right\\Vert \\left\\Vert J(x_0) - J(x_*)\n", " \\right\\Vert\\\\\n", " &\\leq \\beta \\gamma \\left\\Vert x_0 - x_* \\right\\Vert^\\alpha\\\\\n", " &\\leq \\beta \\gamma r^\\alpha\n", "\n", "Picking :math:`\\epsilon = \\min \\left\\{ r,\n", "\\left( \\frac{\\alpha}{(1 + \\alpha) \\beta \\gamma} \\right)^{1 / \\alpha} \\right\\}`\n", "satisfies Theorem 3.14 in addition to revealing the elegant inequality\n", ":math:`\\beta \\gamma \\epsilon^\\alpha \\leq \\frac{\\alpha}{1 + \\alpha} < 1` that\n", "simplifies later derivations. Thus :math:`J(x_0)` is nonsingular and its\n", "induced matrix norm is bounded above by\n", "\n", ".. math::\n", "\n", " \\left\\Vert J(x_0) \\right\\Vert\n", " &\\leq \\frac{\n", " \\left\\Vert J(x_*)^{-1} \\right\\Vert\n", " }{\n", " 1 -\n", " \\left\\Vert J(x_*)^{-1} \\left( J(x_0) - J(x_*) \\right) \\right\\Vert\n", " }\n", " & \\quad & \\text{perturbation relation 3.1.20}\\\\\n", " &\\leq \\beta \\frac{1}{1 - \\beta \\gamma \\epsilon^\\alpha}\\\\\n", " &\\leq \\beta (1 + \\alpha).\n", "\n", "Observe that the sequence :math:`x_1, x_2, \\ldots` converges to :math:`x_*`\n", "because\n", "\n", ".. math::\n", "\n", " x_1 - x_* &= x_0 - J(x_0)^{-1} F(x_0) - x_*\\\\\n", " &= x_0 - J(x_0)^{-1} \\lbrack F(x_0) - F(x_*) \\rbrack - x_*\n", " & \\quad & F(x_*) = 0\\\\\n", " &= J(x_0)^{-1} \\lbrack F(x_*) - F(x_0) - J(x_0) (x_* - x_0) \\rbrack\\\\\n", " \\left\\Vert x_1 - x_* \\right\\Vert\n", " &= \\left\\Vert\n", " J(x_0)^{-1} \\lbrack F(x_*) - F(x_0) - J(x_0) (x_* - x_0) \\rbrack\n", " \\right\\Vert\\\\\n", " \\left\\Vert x_1 - x_* \\right\\Vert\n", " &\\leq \\left\\Vert J(x_0)^{-1} \\right\\Vert\n", " \\left\\Vert F(x_*) - F(x_0) - J(x_0) (x_* - x_0) \\right\\Vert\n", " & \\quad & \\text{(3.1.10)}\\\\\n", " &\\leq \\beta (1 + \\alpha)\n", " \\left\\Vert x_* - x_0 \\right\\Vert^{1 + \\alpha}\n", " \\frac{\\gamma}{1 + \\alpha}\n", " & \\quad & p = x_* - x_0\\\\\n", " &= \\beta \\gamma \\left\\Vert x_0 - x_* \\right\\Vert^{1 + \\alpha}\n", " & \\quad & \\text{(3.1.1b).}\n", "\n", "To understand why :math:`p = x_* - x_0`, see :ref:`Exercise 4.8 `." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _dennis1996numerical-ex-5.6:\n", "\n", "Exercise 6\n", "==========\n", "\n", "By definition of differentiability and continuity,\n", "\n", ".. math::\n", "\n", " f(t) = \\frac{\\gamma}{2} t^2 - \\frac{t}{\\beta} + \\frac{\\eta}{\\beta}\n", "\n", "is continuously differentiable :math:`\\forall t \\in \\mathbb{R}`,\n", ":math:`\\gamma, \\eta \\in \\mathbb{R}^+_0`, and :math:`\\beta \\in \\mathbb{R}^+`.\n", "Assuming :math:`t_0 = 0`, :math:`N(t_0, r)` is satisfied\n", ":math:`\\forall r \\in (0, \\infty)`.\n", "\n", "Since :math:`f(t): \\mathbb{R} \\rightarrow \\mathbb{R}`, the vector norm and\n", "induced operator norm reduce respectively to an absolute value and\n", ":math:`J(t) = f'(t) = \\gamma t - \\frac{1}{\\beta}`.\n", "\n", "Notice that :math:`J \\in \\text{Lip}_\\gamma(N(t_0, r))` because\n", ":math:`\\left\\vert f'(t) \\right\\vert <\n", "\\left\\vert \\gamma r \\right\\vert + \\left\\vert \\frac{1}{\\beta} \\right\\vert`.\n", "\n", "The initial conditions for the induced operator norms\n", "\n", ".. math::\n", "\n", " \\left\\Vert J(t_0)^{-1} \\right\\Vert =\n", " \\left\\vert \\left( \\gamma t_0 - \\frac{1}{\\beta} \\right)^{-1} \\right\\vert =\n", " \\beta\n", "\n", ".. math::\n", "\n", " \\left\\Vert J(t_0)^{-1} F(t_0) \\right\\Vert =\n", " \\left\\vert\n", " \\left( \\gamma t_0 - \\frac{1}{\\beta} \\right)^{-1}\n", " \\left(\n", " \\frac{\\gamma}{2} t_0^2 - \\frac{t_0}{\\beta} + \\frac{\\eta}{\\beta}\n", " \\right)\n", " \\right\\vert =\n", " \\eta\n", "\n", "are satisfied by definition of :math:`f(t)`.\n", "\n", "Define :math:`\\gamma_\\text{Rel} = \\beta \\gamma, \\alpha = \\gamma_\\text{Rel} \\eta`\n", "for ease of notation. Analytically solving for the roots of the univariate\n", "quadratic function :math:`f(t_*) = 0` gives\n", "\n", ".. math::\n", "\n", " t_*\n", " &= \\frac{\n", " \\beta^{-1} \\pm \\sqrt{\\beta^{-2} - 2 \\gamma \\eta \\beta^{-1}}\n", " }{\\gamma}\\\\\n", " &= \\frac{\\beta^{-1} \\pm \\sqrt{\\frac{1 - 2 \\alpha}{\\beta^2}}}{\\gamma}\\\\\n", " &= \\frac{1 \\pm \\sqrt{1 - 2 \\alpha}}{\\beta \\gamma}.\n", "\n", "Thus :math:`\\alpha \\leq \\frac{1}{2}` is a necessary condition for :math:`f(t)`\n", "to have real roots.\n", "\n", "The following results will be useful in proving :math:`\\{t_k\\}` converges.\n", "\n", "Proof of :math:`\\left\\vert f'(t_{k + 1}) \\right\\vert \\geq\n", "\\left\\vert \\frac{f'(t_k)}{2} \\right\\vert` via induction. The following shows\n", "that the base case holds:\n", "\n", ".. math::\n", "\n", " \\left\\vert f'(t_1) \\right\\vert\n", " &= \\left\\vert \\gamma t_1 - \\frac{1}{\\beta}\\right\\vert\\\\\n", " &= \\left\\vert\n", " \\gamma \\left( t_0 - J(t_0)^{-1} F(t_0) \\right) - \\frac{1}{\\beta}\n", " \\right\\vert\\\\\n", " &= \\left\\vert \\gamma \\eta - \\frac{1}{\\beta}\\right\\vert\\\\\n", " &= \\left\\vert \\frac{\\alpha - 1}{\\beta}\\right\\vert\\\\\n", " &= \\frac{1 - \\alpha}{\\beta}\\\\\n", " &\\geq \\left\\vert \\frac{f'(t_0)}{2} \\right\\vert\n", " & \\quad & \\frac{1}{2 \\beta} \\leq\n", " \\frac{1 - \\alpha}{\\beta} \\leq \\frac{1}{\\beta}\n", "\n", "Assume :math:`\\left\\vert f'(t_k) \\right\\vert \\geq\n", "\\left\\vert \\frac{f'(t_{k - 1})}{2} \\right\\vert` holds. The inductive step holds\n", "because\n", "\n", ".. math::\n", "\n", " \\left\\vert f'(t_{k + 1}) \\right\\vert\n", " &= \\left\\vert \\gamma t_{k + 1} - \\frac{1}{\\beta}\\right\\vert\\\\\n", " &= \\left\\vert\n", " \\gamma \\left( t_k - J(t_k)^{-1} F(t_k) \\right) - \\frac{1}{\\beta}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\gamma \\left(\n", " t_k -\n", " \\left( \\frac{\\beta \\gamma t_k - 1}{\\beta} \\right)^{-1}\n", " \\left(\n", " \\frac{\\beta \\gamma t_k^2 - 2 t_k + 2 \\eta}{2 \\beta}\n", " \\right)\n", " \\right) -\n", " \\frac{1}{\\beta}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\gamma t_k -\n", " \\frac{\n", " \\frac{\\beta \\gamma^2 t_k^2}{2} - \\gamma t_k + \\gamma \\eta\n", " }{\\beta \\gamma t_k - 1} -\n", " \\frac{1}{\\beta}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\gamma t_k -\n", " \\frac{\n", " \\frac{\\gamma_\\text{Rel}^2 t_k^2}{2} -\n", " \\gamma_\\text{Rel} t_k + \\alpha + \\gamma_\\text{Rel} t_k - 1\n", " }{\\beta \\gamma_\\text{Rel} t_k - \\beta}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\gamma t_k +\n", " \\frac{\n", " 1 - \\alpha - \\frac{\\gamma_\\text{Rel}^2 t_k^2}{2}\n", " }{\\beta \\gamma_\\text{Rel} t_k - \\beta}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\gamma t_k +\n", " \\frac{\n", " 1 - \\alpha -\n", " \\left( \\beta^2 \\gamma t_k - \\beta \\right) \\frac{\\gamma t_k}{2} -\n", " \\frac{\\beta \\gamma t_k}{2}\n", " }{\\beta^2 \\gamma t_k - \\beta}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\frac{\\gamma t_k}{2} -\n", " \\frac{\n", " 1 - \\alpha - \\frac{\\gamma_\\text{Rel} t_k}{2}\n", " }{\\beta \\left( 1 - \\gamma_\\text{Rel} t_k \\right)}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\frac{\\gamma t_k}{2} -\n", " \\frac{1}{2 \\beta}\n", " \\frac{\n", " 2 - 2 \\alpha - \\gamma_\\text{Rel} t_k\n", " }{1 - \\gamma_\\text{Rel} t_k}\n", " \\right\\vert\\\\\n", " &\\geq \\left\\vert \\frac{\\gamma t_k}{2} - \\frac{1}{2 \\beta} \\right\\vert\\\\\n", " &= \\left\\vert \\frac{f'(t_k)}{2} \\right\\vert.\n", "\n", "The last step can be proven via contradiction. Assume\n", ":math:`0 \\leq \\alpha \\leq \\frac{1}{2}` and\n", "\n", ".. math::\n", "\n", " \\left\\vert \\gamma_\\text{Rel} t_k - 1 \\right\\vert >\n", " \\left\\vert\n", " \\gamma_\\text{Rel} t_k -\n", " \\frac{2 - 2 \\alpha - \\gamma_\\text{Rel} t_k}{1 - \\gamma_\\text{Rel} t_k}\n", " \\right\\vert.\n", "\n", "Given :math:`x = \\gamma_\\text{Rel} t_k`,\n", "\n", ".. math::\n", "\n", " \\left\\vert x - 1 \\right\\vert\n", " &> \\left\\vert x - \\frac{2 - 2 \\alpha - x}{1 - x} \\right\\vert\\\\\n", " \\left\\vert x - 1 \\right\\vert \\left\\vert 1 - x \\right\\vert\n", " &> \\left\\vert x - x^2 - 2 + 2 \\alpha + x \\right\\vert\\\\\n", " \\left\\vert x - 1 \\right\\vert^4\n", " &> \\left\\vert x - x^2 - 2 + 2 \\alpha + x \\right\\vert^2\\\\\n", " (4 \\alpha - 2) x^2 + (4 - 8 \\alpha) x - 4 \\alpha^2 + 8 \\alpha - 3 &> 0\\\\\n", " (2 \\alpha - 1) (2x^2 - 4x + 3 - 2 \\alpha) &> 0.\n", "\n", "Solving for roots of the quadratic equation yields\n", "\n", ".. math::\n", "\n", " x =\n", " \\frac{-(-4) \\pm \\sqrt{(-4)^2 - 4 (2) (3 - 2 \\alpha)}}{2 (2)} =\n", " 1 \\pm \\sqrt{\\alpha - \\frac{1}{2}}\n", "\n", "which means :math:`\\alpha > \\frac{1}{2}` must hold to yield a real solution, but\n", "this contradicts the assumption that :math:`\\alpha \\leq \\frac{1}{2}`. Therefore\n", "\n", ".. math::\n", "\n", " \\left\\vert \\gamma_\\text{Rel} t_k - 1 \\right\\vert\n", " \\leq \\left\\vert\n", " \\gamma_\\text{Rel} t_k -\n", " \\frac{\n", " 2 - 2 \\alpha - \\gamma_\\text{Rel} t_k\n", " }{1 - \\gamma_\\text{Rel} t_k}\n", " \\right\\vert\n", " \\quad \\text{and} \\quad\n", " \\left\\vert f'(t_{k + 1}) \\right\\vert\n", " \\geq \\left\\vert 2^{-(k + 1)} \\beta^{-1} \\right\\vert.\n", "\n", "The upper bound :math:`\\left\\vert f'(t_{k + 1}) \\right\\vert \\leq\n", "\\left\\vert f'(t_k) \\right\\vert` can also be derived as\n", "\n", ".. math::\n", "\n", " 2 \\left\\vert x - 1 \\right\\vert\n", " &\\geq \\left\\vert x - \\frac{2 - 2 \\alpha - x}{1 - x} \\right\\vert\\\\\n", " 3x^4 - 12x^3 + (16 + 4 \\alpha)x^2 - (8 + 8 \\alpha)x +\n", " 8 \\alpha - 4 \\alpha^2 &\\geq 0\\\\\n", " (x^2 - 2x + 2 \\alpha) (3x^2 - 6x + 4 - 2 \\alpha) &\\geq 0.\n", "\n", "Solving for roots of the quartic equation yields\n", "\n", ".. math::\n", "\n", " x = \\frac{-(-2) \\pm \\sqrt{(-2)^2 - 4 (1) (2 \\alpha)}}{2 (1)} =\n", " 1 \\pm \\sqrt{1 - 2 \\alpha}\n", "\n", "and\n", "\n", ".. math::\n", "\n", " x = \\frac{-(-6) \\pm \\sqrt{(-6)^2 - 4 (3) (4 - 2 \\alpha)}}{2 (3)} =\n", " 1 \\pm \\sqrt{\\frac{2 \\alpha - 1}{3}}.\n", "\n", "Since :math:`0 \\leq \\alpha \\leq \\frac{1}{2}`, the second quadratic formula is\n", "inconsequential for real roots. Therefore the sequence :math:`\\{ t_k \\}` is\n", "well-defined with :math:`N(t_0, r)` because\n", "\n", ".. math::\n", "\n", " t_{k + 1} - t_* &= t_k - J(t_k)^{-1} F(t_k) - t_*\\\\\n", " &= J(t_k)^{-1} \\left[ F(t_*) - F(t_k) - J(t_k) (t_* - t_k) \\right]\\\\\n", " \\left\\vert t_{k + 1} - t_* \\right\\vert\n", " &\\leq 2^k \\beta \\frac{\\gamma}{2} \\left\\vert t_* - t_k \\right\\vert^2\n", " & \\quad & \\text{Lemma 4.1.12}\\\\\n", " &= 2^{k - 1} \\beta \\gamma \\left\\vert t_k - t_* \\right\\vert^2\n", " & \\quad & \\text{(3.1.1b)}\\\\\n", " &\\leq 2^{k - 1} r^2 \\frac{\\eta}{\\alpha}.\n", "\n", "This inequality can be further simplified assuming :math:`r \\geq r_0`\n", "\n", ".. math::\n", "\n", " \\left\\vert t_1 - t_* \\right\\vert &\\leq 2^{-1} r^2 \\frac{\\eta}{\\alpha}\\\\\n", " &\\leq 2^{-1} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)^2 \\frac{\\eta}{\\alpha}.\n", "\n", "Unrolling the recursion for :math:`t_k` yields\n", "\n", ".. math::\n", "\n", " \\left\\vert t_k - t_* \\right\\vert\n", " &\\leq 2^{-k} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)^{2^k}\n", " \\frac{\\eta}{\\alpha}\\\\\n", " &\\leq (2 \\alpha)^{2^k} \\frac{\\eta}{\\alpha}.\n", "\n", "The last step can be proven via induction. The base case when :math:`k = 0`\n", "gives :math:`1 - \\sqrt{1 - 2 \\alpha} \\leq 2 \\alpha`. Let us prove this via\n", "contradiction: assume :math:`0 \\leq \\alpha \\leq \\frac{1}{2}` and\n", ":math:`1 - \\sqrt{1 - 2 \\alpha} > 2 \\alpha`.\n", "\n", ".. math::\n", "\n", " 1 - \\sqrt{1 - 2 \\alpha} &> 2 \\alpha\\\\\n", " \\left( 1 - 2 \\alpha \\right)^2 &> \\sqrt{1 - 2 \\alpha}^2\\\\\n", " 4 \\alpha^2 - 4 \\alpha + 1 &> 1 - 2 \\alpha\\\\\n", " 4 \\alpha^2 - 2 \\alpha > 0\\\\\n", " \\alpha (2\\alpha - 1) > 0\n", "\n", "shows that :math:`\\alpha > \\frac{1}{2}` must hold to satisfy the inequality, but\n", "this contradicts the assumption that :math:`\\alpha \\leq \\frac{1}{2}`. Thus the\n", "base case is true.\n", "\n", "For the inductive step, assume\n", ":math:`2^{-k} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)^{2^k} \\leq\n", "\\left( 2 \\alpha \\right)^{2^k}` for :math:`0 \\leq \\alpha \\leq \\frac{1}{2}`.\n", "\n", ".. math::\n", "\n", " 2^{-(k + 1)} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)^{2^{k + 1}}\n", " &\\leq \\left( 2 \\alpha \\right)^{2^{k + 1}}\\\\\n", " -(k + 1) + 2^{k + 1} \\log_2 \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)\n", " &\\leq 2^{k + 1} \\log_2 \\left( 2 \\alpha \\right)\\\\\n", " \\frac{-(k + 1)}{2^{k + 1}} + \\log_2 \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)\n", " &\\leq \\log_2 \\left( 2 \\alpha \\right)\\\\\n", " 2^{\\frac{-(k + 1)}{2^{k + 1}}} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)\n", " &\\leq 2 \\alpha\\\\\n", " c \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)\n", " &\\leq 2 \\alpha\n", " & \\quad & c = 2^{\\frac{-(k + 1)}{2^{k + 1}}} > 0.\n", "\n", "By inspection, as :math:`k \\rightarrow \\infty`, :math:`c \\rightarrow 1`,\n", "which completes the inductive step." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _dennis1996numerical-ex-5.7:\n", "\n", "Exercise 7\n", "==========\n", "\n", ":math:`\\left\\Vert J(x_n)^{-1} \\right\\Vert \\leq \\left\\vert f'(t_n)^{-1} \\right\\vert`\n", "-----------------------------------------------------------------------------------\n", "\n", "When :math:`n = 0`, :ref:`Exercise 6 ` shows that\n", ":math:`\\left\\vert f'(t_0)^{-1} \\right\\vert = \\beta` and :math:`F` is presumed to\n", "satisfy the assumptions of the Kantorovich theorem\n", "\n", ".. math::\n", "\n", " \\left\\Vert J(x_0)^{-1} \\right\\Vert \\leq \\left\\vert f'(t_0)^{-1} \\right\\vert.\n", "\n", "For the inductive step :math:`n = k`, recall Theorem 3.14 states that\n", ":math:`J(x_k)` is nonsingular when\n", ":math:`\\left\\Vert J(x_0)^{-1} \\left( J(x_k) - J(x_0) \\right) \\right\\Vert < 1`\n", "holds.\n", "\n", ".. math::\n", "\n", " \\left\\Vert J(x_0)^{-1} \\left( J(x_k) - J(x_0) \\right) \\right\\Vert\n", " &\\leq \\left\\Vert\n", " J(x_0)^{-1} \\right\\Vert \\left\\Vert J(x_k) - J(x_0)\n", " \\right\\Vert\\\\\n", " &\\leq \\beta \\gamma r\n", " & \\quad & J \\in \\text{Lip}_\\gamma(N(x_0, r > 0))\n", "\n", "Picking :math:`r = \\frac{1 - \\sqrt{1 - 2 \\alpha}}{\\beta \\gamma}` where\n", ":math:`0 \\leq \\alpha \\leq \\frac{1}{2}` satisfies Theorem 3.14. Thus\n", ":math:`J(x_k)` is nonsingular and its induced matrix norm is bounded above by\n", "\n", ".. math::\n", "\n", " \\left\\Vert J(x_k)^{-1} \\right\\Vert\n", " &\\leq \\frac{\n", " \\left\\Vert J(x_0)^{-1} \\right\\Vert\n", " }{\n", " 1 -\n", " \\left\\Vert J(x_0)^{-1} \\left( J(x_k) - J(x_0) \\right) \\right\\Vert\n", " }\n", " & \\quad & \\text{(3.1.20)}\\\\\n", " &\\leq \\frac{\\left\\vert f'(t_0)^{-1} \\right\\vert}{ 1 - c}\n", " & \\quad & 0 \\leq c = 1 -\n", " \\left\\Vert\n", " J(x_0)^{-1} \\left( J(x_k) - J(x_0) \\right)\n", " \\right\\Vert < 1\\\\\n", " &\\leq \\frac{\\left\\vert f'(t_k)^{-1} \\right\\vert}{1 - c}\\\\\n", " (1 - c) \\left\\Vert J(x_k)^{-1} \\right\\Vert\n", " &\\leq \\left\\vert f'(t_k)^{-1} \\right\\vert.\n", "\n", "The second to last inequality is explained in\n", ":ref:`Exercise 6 `.\n", "\n", ":math:`\\left\\Vert x_{n + 1} - x_n \\right\\Vert \\leq \\left\\vert t_{n + 1} - t_n \\right\\vert`\n", "------------------------------------------------------------------------------------------\n", "\n", "When :math:`n = 0`, :ref:`Exercise 6 ` shows that\n", ":math:`\\left\\Vert J(t_0)^{-1} F(t_0) \\right\\Vert = \\eta` and :math:`F` is\n", "presumed to satisfy the assumptions of the Kantorovich theorem\n", "\n", ".. math::\n", "\n", " \\left\\Vert x_1 - x_0 \\right\\Vert \\leq \\left\\vert t_1 - t_0 \\right\\vert.\n", "\n", "The following relations will be useful for the induction.\n", "\n", ".. math::\n", "\n", " \\left\\Vert F(x_k) \\right\\Vert\n", " &= \\left\\Vert\n", " F(x_k) - F(x_{k - 1}) - J(x_{k - 1}) (x_k - x_{k - 1})\n", " \\right\\Vert\n", " & \\quad & \\text{(4.1.10)}\\\\\n", " &= \\left\\Vert F(x_{k - 1} + p) - F(x_{k - 1}) - J(x_{k - 1}) p \\right\\Vert\n", " & \\quad & p = x_k - x_{k - 1}\\\\\n", " &\\leq \\frac{\\gamma}{2} \\left\\Vert x_k - x_{k - 1} \\right\\Vert^2\n", " & \\quad & \\text{Lemma 4.1.12}\n", "\n", "Applying the same analysis to\n", ":math:`f(t_k) = \\frac{\\gamma}{2} (t_k - t_{k - 1})^2` gives\n", "\n", ".. math::\n", "\n", " \\left\\vert f(t_k) \\right\\vert\n", " &= \\left\\vert\n", " f(t_k) - f(t_{k - 1}) - f'(t_{k - 1}) (t_k - t_{k - 1})\n", " \\right\\vert\n", " & \\quad & \\text{(4.1.10)}\\\\\n", " &= \\left\\vert\n", " \\frac{\\gamma}{2} t_k^2 - \\frac{t_k}{\\beta} + \\frac{\\eta}{\\beta} -\n", " \\frac{\\gamma}{2} t_{k - 1}^2 + \\frac{t_{k - 1}}{\\beta} -\n", " \\frac{\\eta}{\\beta} -\n", " \\left( \\gamma t_{k - 1} - \\frac{1}{\\beta} \\right) (t_k - t_{k - 1})\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\frac{\\gamma}{2} t_k^2 - \\frac{t_k}{\\beta} -\n", " \\frac{\\gamma}{2} t_{k - 1}^2 + \\frac{t_{k - 1}}{\\beta} -\n", " \\gamma t_{k - 1} t_k + \\gamma t_{k - 1}^2 + \\frac{t_k}{\\beta} -\n", " \\frac{t_{k - 1}}{\\beta}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\frac{\\gamma}{2} t_k^2 - \\gamma t_k t_{k - 1} +\n", " \\frac{\\gamma}{2} t_{k - 1}^2\n", " \\right\\vert\\\\\n", " &= \\frac{\\gamma}{2} \\left( t_k - t_{k - 1} \\right)^2.\n", "\n", "The inductive step for :math:`n = k` is\n", "\n", ".. math::\n", "\n", " x_{k + 1} &= x_k - J(x_k)^{-1} F(x_k)\\\\\n", " \\left\\Vert x_{k + 1} - x_k \\right\\Vert\n", " &= \\left\\Vert -J(x_k)^{-1} F(x_k) \\right\\Vert\\\\\n", " &\\leq \\left\\vert f'(t_k)^{-1} \\right\\vert \\left\\Vert F(x_k) \\right\\Vert\\\\\n", " &= \\left\\vert t_{k + 1} - t_k \\right\\vert\n", " \\frac{\\left\\Vert F(x_k) \\right\\Vert}{\\left\\vert f(t_k) \\right\\vert}\n", " & \\quad & \\left\\vert t_{k + 1} - t_k \\right\\vert =\n", " \\left\\vert f'(t_k)^{-1} f(t_k)\\right\\vert\\\\\n", " &\\leq \\left\\vert t_{k + 1} - t_k \\right\\vert\n", " & \\quad & \\text{ratio is less than one by the inductive step.}" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 8\n", "==========\n", "\n", "As stated in the section after Theorem 5.3.1,\n", ":ref:`Exercise 6 ` and\n", ":ref:`Exercise 7 ` shows that\n", ":math:`\\left\\Vert x_* - x_k \\right\\Vert` is bounded above by\n", ":math:`\\left\\vert t_* - t_k \\right\\vert`; hence it is a convergent sequence.\n", "Every convergent sequence is a Cauchy sequence, so :math:`\\{ x_k \\}` is a Cauchy\n", "sequence that will converge to some :math:`x_*`. This technique of proof is\n", "known as majorization where :math:`\\{ t_k \\}` is said to majorize\n", ":math:`\\{ x_k \\}`.\n", "\n", "See :cite:`ortega1968newton` for a very concise and formal proof." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 9\n", "==========\n", "\n", "See :cite:`bryan1968approximate` for an elegant and concise proof that enables\n", "the evaluation of nonlinear integral equations using numerical quadrature." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _dennis1996numerical-ex-5.10:\n", "\n", "Exercise 10\n", "===========\n", "\n", "The contractive mapping theorem is also known as the Banach fixed-point theorem\n", "or the contraction mapping principle.\n", "\n", "Given :math:`G: D \\rightarrow D` where :math:`D` is a closed subset of\n", ":math:`\\mathbb{R}^n`, for any starting point :math:`x_0 \\in D`, the sequence\n", ":math:`\\{ x_k \\}` generated by :math:`x_{k + 1} = G(x_k)` remains in :math:`D`\n", "for :math:`k = 0, 1, \\ldots`.\n", "\n", "(a)\n", "---\n", "\n", "Observe that recursively applying\n", "\n", ".. math::\n", "\n", " \\left\\Vert x_{k + 1} - x_k \\right\\Vert =\n", " \\left\\Vert G(x_k) - G(x_{k - 1}) \\right\\Vert \\leq\n", " \\alpha \\left\\Vert x_k - x_{k - 1} \\right\\Vert\n", "\n", "yields\n", "\n", ".. math::\n", "\n", " \\left\\Vert x_{k + 1} - x_k \\right\\Vert \\leq\n", " \\alpha^k \\left\\Vert x_1 - x_0 \\right\\Vert\n", "\n", "where :math:`\\alpha \\in [0,1)`.\n", "\n", "For any :math:`j \\geq 0`,\n", "\n", ".. math::\n", "\n", " \\sum_{i = 0}^j \\left\\Vert x_{i + 1} - x_i \\right\\Vert\n", " &\\leq \\left\\Vert x_1 - x_0 \\right\\Vert \\sum_{i = 0}^j \\alpha^i\\\\\n", " &= \\left\\Vert x_1 - x_0 \\right\\Vert \\frac{1 - \\alpha^{j + 1}}{1 - \\alpha}\n", " & \\quad & \\text{geometric series formula}\\\\\n", " &\\leq \\frac{\\left\\Vert x_1 - x_0 \\right\\Vert}{1 - \\alpha}\n", " & \\quad & \\text{as } j \\rightarrow \\infty.\n", "\n", "(b)\n", "---\n", "\n", "Recall that a sequence :math:`\\{ x_i \\}` is called a Cauchy sequence if\n", ":math:`\\forall \\epsilon > 0 \\, \\exists N \\in \\mathbb{N}` such that\n", ":math:`\\forall \\, m, n \\geq N`, then\n", ":math:`\\left\\Vert x_m - x_n \\right\\Vert < \\epsilon`.\n", "\n", "Without loss of generality, let :math:`m \\leq n` with :math:`k = n - m \\geq 0`.\n", "\n", ".. math::\n", "\n", " \\left\\Vert x_m - x_n \\right\\Vert\n", " &= \\left\\Vert x_m - x_{m + k} \\right\\Vert\\\\\n", " &= \\left\\Vert\n", " (x_m - x_{m + 1}) + (x_{m + 1} - x_{m + 2}) + \\ldots +\n", " (x_{m + k - 1} - x_{m + k})\n", " \\right\\Vert\\\\\n", " &\\leq \\left\\Vert x_{m + 1} - x_m \\right\\Vert +\n", " \\left\\Vert x_{m + 2} - x_{m + 1} \\right\\Vert + \\ldots +\n", " \\left\\Vert x_{m + k} - x_{m + k - 1} \\right\\Vert\n", " & \\quad & \\text{triangle inequality (3.1.1b)}\\\\\n", " &\\leq \\left\\Vert x_1 - x_0 \\right\\Vert \\sum_{i = m}^{m + k - 1} \\alpha^i\n", " & \\quad & \\text{(a)}\\\\\n", " &= \\left\\Vert x_1 - x_0 \\right\\Vert\n", " \\frac{\\alpha^m - \\alpha^{m + k}}{1 - \\alpha}\n", " & \\quad & \\text{geometric series formula}\\\\\n", " &\\leq \\left\\Vert x_1 - x_0 \\right\\Vert \\frac{\\alpha^N}{1 - \\alpha}.\n", "\n", "Hence :math:`\\{ x_i \\}` is a Cauchy sequence as long as :math:`N` satisfies\n", "\n", ".. math::\n", "\n", " \\left\\Vert x_1 - x_0 \\right\\Vert \\frac{\\alpha^N}{1 - \\alpha}\n", " &< \\epsilon\\\\\n", " \\alpha^N &< \\epsilon \\frac{1 - \\alpha}{\\left\\Vert x_1 - x_0 \\right\\Vert}\\\\\n", " N &> \\log_{\\alpha}\\left(\n", " \\epsilon \\frac{1 - \\alpha}{\\left\\Vert x_1 - x_0 \\right\\Vert}\n", " \\right).\n", "\n", "The following observations will be useful in proving :math:`G` has a fixpoint\n", ":math:`x_*` where a fixed point (a.k.a. fixpoint, invariant point) of a function\n", "is defined as :math:`G(x_*) = x_*`.\n", "\n", "A function is said to be a contraction map if it satisfies the assumptions of\n", "the Contractive Mapping Theorem. Notice that the inequality (5.3.2) is a\n", "restricted definition of Lipschitz continuity, which means a contraction map is\n", "continuous on :math:`D`.\n", "\n", "As shown in (a), :math:`\\{ x_k \\}` converges to some :math:`x_*` because\n", ":math:`\\lim_{k \\rightarrow \\infty} \\alpha^k = 0`. Therefore,\n", "\n", ".. math::\n", "\n", " \\lim_{k \\rightarrow \\infty} x_{k + 1}\n", " &= \\lim_{k \\rightarrow \\infty} G(x_k)\\\\\n", " &= G(\\lim_{k \\rightarrow \\infty} x_k)\n", " & \\quad & \\text{G is continuous}\\\\\n", " x_* &= G(x_*).\n", "\n", "(c)\n", "---\n", "\n", "Proof by contradiction is the easily method to show that :math:`x_* \\in D` is\n", "unique.\n", "\n", "Assume that :math:`\\exists \\, y_* \\in D` such that :math:`y_* \\neq x_*`.\n", "\n", ".. math::\n", "\n", " \\left\\Vert G(x_*) - G(y_*) \\right\\Vert\n", " &\\leq \\alpha \\left\\Vert x_* - y_* \\right\\Vert\n", " & \\quad & \\text{(5.3.2)}\\\\\n", " \\left\\Vert x_* - y_* \\right\\Vert\n", " &\\leq \\alpha \\left\\Vert x_* - y_* \\right\\Vert\n", " & \\quad & \\text{(b)}\\\\\n", " (1 - \\alpha) \\left\\Vert x_* - y_* \\right\\Vert &\\leq 0.\n", "\n", "Since :math:`\\alpha \\in [0, 1)`, :math:`\\left\\Vert x_* - y_* \\right\\Vert = 0`,\n", "but this contradicts the assumption :math:`y_* \\neq x_*`. Thus a contraction\n", "map has a unique fixed point.\n", "\n", "By definition\n", "\n", ".. math::\n", "\n", " \\left\\Vert x_{k + 1} - x_* \\right\\Vert =\n", " \\left\\Vert G(x_k) - G(x_*) \\right\\Vert \\leq\n", " \\alpha \\left\\Vert x_k - x_* \\right\\Vert,\n", "\n", ":math:`\\{ x_k \\}` converges q-linearly to :math:`x_*`.\n", "\n", "By applying the results of (b) with :math:`m = k` as\n", ":math:`n \\rightarrow \\infty`, the error bound can be derived as\n", "\n", ".. math::\n", "\n", " \\lim_{n \\rightarrow \\infty} \\left\\Vert x_k - x_n \\right\\Vert\n", " &\\leq \\lim_{n \\rightarrow \\infty} \\left\\Vert x_1 - x_0 \\right\\Vert\n", " \\frac{\\alpha^k - \\alpha^{n}}{1 - \\alpha}\\\\\n", " \\left\\Vert x_k - x_* \\right\\Vert\n", " &\\leq \\frac{\\eta \\alpha^k}{1 - \\alpha}\n", " & \\quad & \\left\\Vert x_1 - x_0 \\right\\Vert \\leq \\eta.\n", "\n", "The last inequality holds because the norm's continuity enables moving the\n", "limit inside.\n", "\n", "(d)\n", "---\n", "\n", "As shown in :cite:`petersd466fp`, the other version of the Contractive Mapping\n", "Theorem no longer assumes a self-map.\n", "\n", "Let :math:`G \\colon D \\rightarrow \\mathbb{R}^n` where :math:`D` is a closed\n", "subset of :math:`\\mathbb{R}^n`. If for some norm\n", ":math:`\\left\\Vert \\cdot \\right\\Vert`, there exists :math:`\\alpha \\in [0, 1)`\n", "such that\n", "\n", ".. math::\n", "\n", " \\left\\Vert G(x) - G(y) \\right\\Vert \\leq \\alpha \\left\\Vert x - y \\right\\Vert\n", " \\qquad \\forall \\, x, y \\in D,\n", "\n", "then the rest of the Contractive Mapping Theorem still holds. The goal is to\n", "show the sequence :math:`\\{ x_k \\}` generated by :math:`x_{k + 1} = G(x_k)`\n", "remains in :math:`D` for :math:`k = 0, 1, \\ldots` with :math:`x_0 \\in D`.\n", "\n", "Since :math:`D` is a closed subset and the results of (c) defines a closed\n", "neighborhood of radius :math:`\\frac{\\eta \\alpha^k}{1 - \\alpha}` around\n", ":math:`x_*`, as long as each subsequent element in the sequence are within that\n", "iteration's specified neighborhood, the original assumptions of the Contractive\n", "Mapping Theorem are satisfied and the sequence will converge to :math:`x_*`." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 11\n", "===========\n", "\n", "Define :math:`x_{i + 1} = G(x_i) = x_i - \\frac{f(x_i)}{a}` where\n", ":math:`f(x) = x^2 - 1` and :math:`a > 1`. By inspection, the roots of :math:`f`\n", "are at :math:`x = \\pm 1` and :math:`G \\colon \\mathbb{R} \\rightarrow \\mathbb{R}`,\n", "but :math:`x_i` will not converge to the roots\n", ":math:`\\forall x_0 \\in \\mathbb{R}`. The goal is to find\n", ":math:`D \\in \\mathbb{R}` such that :math:`x_0 \\in D` will converge to\n", ":math:`x_* = 1` and\n", "\n", ".. math::\n", "\n", " \\left\\Vert G(x_0) - G(x_*) \\right\\Vert\n", " &= \\left\\vert x_0 - \\frac{f(x_0)}{a} - x_*\\right\\vert\\\\\n", " &= \\left\\vert x_0 - x_* - \\frac{x_0^2 - 1}{a} \\right\\vert\\\\\n", " &= \\left\\vert x_0 - x_* \\right\\vert\n", " \\left\\vert 1 - \\frac{x_0 + x_*}{a} \\right\\vert\n", " & \\quad & x_* = 1\\\\\n", " &< \\left\\vert x_0 - x_* \\right\\vert.\n", "\n", "In order to establish the last inequality, :math:`x_0` needs to satisfy\n", "\n", ".. math::\n", "\n", " \\left\\vert 1 - \\frac{x_0 + x_*}{a} \\right\\vert &< 1\\\\\n", " 1 - \\frac{2 (x_0 + x_*)}{a} + \\frac{(x_0 + x_*)^2}{a^2} &< 1\\\\\n", " 0 &< -(x_0 + x_*)^2 + 2a (x_0 + x_*)\\\\\n", " 0 &< -x_0^2 + (2a - 2x_*) x_0 + 2ax_* - x_*^2.\n", "\n", "Solving the quadratic equation for the :math:`x_0` yields\n", "\n", ".. math::\n", "\n", " x_0 =\n", " \\frac{\n", " -(2a - 2x_*) \\pm \\sqrt{(2a - 2x_*)^2 - 4 (-1) (2ax_* - x_*^2)}\n", " }{2 (-1)} =\n", " a - x_* \\pm (-a).\n", "\n", "Hence the inequality will be satisfied i.e. :math:`\\exists \\, \\alpha \\in [0, 1)`\n", "such that :math:`\\left\\Vert G(x_0) - G(x_*) \\right\\Vert\n", "\\leq \\alpha \\left\\vert x_0 - x_* \\right\\vert` when\n", ":math:`x_0 \\in (-x_*, 2a - x_*)`.\n", "\n", "To show :math:`G \\colon D \\rightarrow D` holds by principle of induction, assume\n", "\n", ".. math::\n", "\n", " \\left\\Vert x_i - x_* \\right\\Vert =\n", " \\left\\Vert G(x_{i - 1}) - G(x_*) \\right\\Vert \\leq\n", " \\alpha \\left\\vert x_{i - 1} - x_* \\right\\vert.\n", "\n", "and\n", "\n", ".. math::\n", "\n", " \\left\\Vert G(x_i) - G(x_*) \\right\\Vert\n", " &= \\left\\vert x_i - \\frac{f(x_i)}{a} - x_* \\right\\vert\\\\\n", " &= \\left\\vert x_i - x_* - \\frac{(x_i + x_*) (x_i - x_*)}{a} \\right\\vert\n", " & \\quad & x_* = 1\\\\\n", " &= \\left\\vert x_i - x_* \\right\\vert\n", " \\left\\vert 1 - \\frac{(x_i + x_*)}{a} \\right\\vert\\\\\n", " &< \\left\\vert x_i - x_* \\right\\vert.\n", "\n", "Consequently,\n", "\n", ".. math::\n", "\n", " \\left\\vert 1 - \\frac{(x_i + x_*)}{a} \\right\\vert &< 1\\\\\n", " 1 - \\frac{2 (x_i + x_*)}{a} + \\frac{(x_i + x_*)^2}{a^2} &< 1\\\\\n", " 0 &< -(x_i + x_*)^2 + 2a (x_i + x_*)\\\\\n", " 0 &< -x_i^2 + (2a - 2x_*) x_i + 2ax_* - x_*^2\\\\\n", " 0 &< -\\left( x_{i - 1} - \\frac{x_{i - 1}^2 - 1}{a} \\right)^2 +\n", " (2a - 2x_*) \\left( x_{i - 1} - \\frac{x_{i - 1}^2 - 1}{a} \\right) +\n", " 2ax_* - x_*^2\\\\\n", " 0 &< -\\frac{\\left( -ax_{i - 1} - ax_* + x_{i - 1}^2 - 1 \\right)\n", " \\left( 2a^2 - ax_{i - 1} - ax_* + x_{i - 1}^2 - 1 \\right)}{a^2}\\\\\n", " 0 &< \\frac{ \\left( x_{i - 1} + 1 \\right) \\left( a - x_{i - 1} + 1 \\right)\n", " \\left( 2a^2 - ax_{i - 1} - a + x_{i - 1}^2 - 1 \\right)}{a^2}\n", " & \\quad & x_* = 1.\n", "\n", "Since :math:`a > 1`, :math:`\\left\\Vert G(x_i) - G(x_*) \\right\\Vert\n", "\\leq \\alpha \\left\\vert x_i - x_* \\right\\vert` when\n", ":math:`x_{i - 1} \\in (-1, a + 1)`.\n", "\n", "Thus defining :math:`\\delta = \\min\\left\\{ 2, a + 1, 2a - 1 \\right\\}` and\n", "applying the results of :ref:`Exercise 10d ` proves\n", "that the sequence :math:`\\{ x_i \\}` converges q-linearly to :math:`x_* = 1` with\n", "\n", ".. math::\n", "\n", " \\lim_{i \\rightarrow \\infty}\n", " \\left\\vert \\frac{x_{i + 1} - x_*}{x_i - x_*} \\right\\vert\n", " &= \\lim_{i \\rightarrow \\infty}\n", " \\left\\vert\n", " \\frac{x_i - \\frac{x_i^2 - 1}{a} - x_*}{x_i - x_*}\n", " \\right\\vert\\\\\n", " &= \\lim_{i \\rightarrow \\infty}\n", " \\left\\vert 1 - \\frac{x_i^2 - 1}{a (x_i - x_*)} \\right\\vert\\\\\n", " &= \\lim_{i \\rightarrow \\infty} \\left\\vert 1 - \\frac{x_i + 1}{a} \\right\\vert\n", " & \\quad & x_* = 1\\\\\n", " &= \\left\\vert 1 - \\lim_{i \\rightarrow \\infty} \\frac{x_i + 1}{a} \\right\\vert\n", " & \\quad & \\text{absolute value is continuous}\\\\\n", " &= \\left\\vert 1 - \\frac{2}{a} \\right\\vert\\\\\n", " &> 0 & \\quad & \\text{unless } a = 2." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12\n", "===========\n", "\n", "If :math:`\\lim_{k \\rightarrow \\infty} h_k = 0`, then the convergence is\n", "q-superlinear.\n", "\n", "The proof is similar to the one in the text up to applying Lemma 4.2.1.\n", "\n", ".. math::\n", "\n", " \\left\\Vert x_k - x_* \\right\\Vert\n", " &\\leq \\left\\Vert A^{-1}_k \\right\\Vert\n", " \\left(\n", " \\left\\Vert F(x_*) - F(x_k) - J(x_k) (x_* - x_k) \\right\\Vert +\n", " \\left\\Vert A_k - J(x_k) \\right\\Vert \\left\\Vert x_* - x_k \\right\\Vert\n", " \\right)\\\\\n", " &\\leq 2 \\beta\n", " \\left(\n", " \\frac{\\gamma}{2} \\left\\Vert x_* - x_k \\right\\Vert^2 +\n", " \\frac{\\gamma \\left\\vert h_k \\right\\vert}{2}\n", " \\left\\Vert x_* - x_k \\right\\Vert\n", " \\right)\n", " & \\quad & \\text{Lemma 4.2.1}\\\\\n", " &= c_k \\left\\Vert x_* - x_k \\right\\Vert\n", " & \\quad & c_k = \\beta \\gamma \\left( \\left\\Vert x_* - x_k \\right\\Vert +\n", " \\left\\vert h_k \\right\\vert \\right).\n", "\n", "Since :math:`\\lim_{k \\rightarrow \\infty} x_k = x_*` and by assumption\n", ":math:`\\lim_{k \\rightarrow \\infty} h_k = 0`,\n", ":math:`\\lim_{k \\rightarrow \\infty} c_k = 0` i.e.\n", ":math:`\\{ x_k \\}` converges q-superlinearly.\n", "\n", "Notice that the expressions (5.4.2) and (5.4.3) are equivalent via Lemma 4.1.16:\n", "\n", ".. math::\n", "\n", " \\left\\vert h_k \\right\\vert\n", " &\\leq c_2 \\left\\Vert F(x_k) \\right\\Vert\n", " & \\quad & \\text{(5.4.3)}\\\\\n", " &= c_2 \\left\\Vert F(x_k) - F(x_*) \\right\\Vert\n", " & \\quad & F(x_*) = 0\\\\\n", " &\\leq c_2 \\beta \\left\\Vert x_k - x_* \\right\\Vert\n", " & \\quad & 4.1.16\\\\\n", " &= c_1 \\left\\Vert x_k - x_* \\right\\Vert\n", " & \\quad & c_1 = c_2 \\beta.\n", "\n", "To show q-quadratic convergence, the proof consists of applying the assumption\n", "that either (5.4.2) or (5.4.3) holds." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 13\n", "===========\n", "\n", "Given :math:`x_k = \\begin{pmatrix} 10^7 & 10^{-7} \\end{pmatrix}^\\top`,\n", "the analytical solution for :math:`J(x_k)` is\n", "\n", ".. math::\n", "\n", " J(x_k) = \\nabla F(x_k)^\\top =\n", " \\begin{bmatrix}\n", " 2 x_1 & 0\\\\\n", " 0 & 2 x_2\n", " \\end{bmatrix}.\n", "\n", "Approximating :math:`J(x_k)` with (5.4.1) gives\n", "\n", ".. math::\n", "\n", " A(x_k) =\n", " \\frac{1}{h_k}\n", " \\begin{bmatrix}\n", " \\left( x_1 + h_k \\right)^2 - \\left( x_1 \\right)^2 &\n", " \\left( x_1 \\right)^2 - \\left( x_1 \\right)^2\\\\\n", " \\left( x_2 \\right)^2 - \\left( x_2 \\right)^2 &\n", " \\left( x_2 + h_k \\right)^2 - \\left( x_2 \\right)^2\n", " \\end{bmatrix} =\n", " \\begin{bmatrix}\n", " 2 x_1 + h_k & 0\\\\\n", " 0 & 2 x_2 + h_k\n", " \\end{bmatrix}.\n", "\n", "When :math:`h_k = 1`, the approximation\n", "\n", ".. math::\n", "\n", " A(x_k) = \\begin{bmatrix} 2 x_1 + 1 & 0\\\\ 0 & 2 x_2 + 1 \\end{bmatrix}\n", "\n", "has a relative error of\n", "\n", ".. math::\n", "\n", " \\frac{\n", " \\left\\vert 2 x_1 - \\left( 2 x_1 + 1 \\right) \\right\\vert\n", " }{\\left\\vert 2 x_1 \\right\\vert} =\n", " \\frac{1}{2} 10^{-7}\n", " \\quad \\text{and} \\quad\n", " \\frac{\n", " \\left\\vert 2 x_2 - \\left( 2 x_2 + 1 \\right) \\right\\vert\n", " }{\\left\\vert 2 x_2 \\right\\vert} =\n", " \\frac{1}{2} 10^7.\n", "\n", "When :math:`h_k = 10^{-14}`, the approximation\n", "\n", ".. math::\n", "\n", " A(x_k) =\n", " \\begin{bmatrix} 2 x_1 + 10^{-14} & 0\\\\ 0 & 2 x_2 + 10^{-14} \\end{bmatrix}\n", "\n", "has a relative error of\n", "\n", ".. math::\n", "\n", " \\frac{\n", " \\left\\vert 2 x_1 - \\left( 2 x_1 + 10^{-14} \\right) \\right\\vert\n", " }{\\left\\vert 2 x_1 \\right\\vert} =\n", " \\frac{1}{2} 10^{-21}\n", " \\quad \\text{and} \\quad\n", " \\frac{\n", " \\left\\vert 2 x_2 - \\left( 2 x_2 + 10^{-14} \\right) \\right\\vert\n", " }{\\left\\vert 2 x_2 \\right\\vert} =\n", " \\frac{1}{2} 10^{-7}.\n", "\n", "Recall that a computer with :math:`14` base-:math:`10` digits has a machine\n", "epsilon :math:`\\eta \\approx 10^{-14}`. The first approximation incorrectly\n", "computed :math:`J(x_k)_{22}` while the second approximation will experience\n", "underflow during the evaluation of :math:`A(x_k)_{22}`. Setting\n", "\n", ".. math::\n", "\n", " h_k =\n", " \\text{sign}(x_k) \\sqrt{\\eta} \\max\\{ \\vert x_k \\vert, \\text{typ}x_k \\}\n", "\n", "is a good compromise." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 14\n", "===========\n", "\n", "Let :math:`p \\in \\mathbb{R}, p \\neq 0`. The step size proposed in (5.4.10) is\n", "\n", ".. math::\n", "\n", " h_j = \\sqrt{\\eta} \\max\\{ |x_j|, \\text{typ}x_j \\} \\text{sign}(x_j)\n", "\n", "(a)\n", "---\n", "\n", "Let :math:`\\hat{F} = f_1(x) = x^p` and\n", ":math:`\\gamma = f_1''(x) = p (p - 1) x^p`. The optimal step size given in\n", "(5.4.13) is\n", "\n", ".. math::\n", "\n", " h_* =\n", " \\left( \\frac{4 (\\eta + \\epsilon)}{\\gamma} \\hat{F} \\right)^{1/2} =\n", " \\left( \\frac{4 (\\eta + \\epsilon)}{p^2 - p} \\right)^{1/2} =\n", " \\frac{2}{\\sqrt{p^2 - p}} \\sqrt{\\eta + \\epsilon}.\n", "\n", "(b)\n", "---\n", "\n", "Let :math:`\\hat{F} = f_2(x) = e^{px}` and\n", ":math:`\\gamma = f_2''(x) = p^2 e^{px}`. The optimal step size given in\n", "(5.4.13) is\n", "\n", ".. math::\n", "\n", " h_* =\n", " \\left( \\frac{4 (\\eta + \\epsilon)}{\\gamma} \\hat{F} \\right)^{1/2} =\n", " \\left( \\frac{4 (\\eta + \\epsilon)}{p^2} \\right)^{1/2} =\n", " \\frac{2}{p} \\sqrt{\\eta + \\epsilon}.\n", "\n", "(c)\n", "---\n", "\n", "Both step sizes are scale independent, which is not ideal.\n", "\n", "(d)\n", "---\n", "\n", "The forward finite-difference approximation to :math:`f_2'(x) = p e^{px}` is\n", "\n", ".. math::\n", "\n", " \\frac{f_2(x + h) - f_2(x)}{h} = e^{px} \\frac{e^{ph} - 1}{h}\n", "\n", "with a relative error of\n", "\n", ".. math::\n", "\n", " \\frac{\n", " \\left\\vert\n", " p e^{px} - \\left( e^{px} \\frac{e^{ph} - 1}{h} \\right)\n", " \\right\\vert\n", " }{\n", " \\left\\vert p e^{px} \\right\\vert\n", " } =\n", " \\left\\vert 1 - \\frac{e^{ph} - 1}{ph} \\right\\vert.\n", "\n", "Good step sizes will tend to make :math:`e^{ph} \\rightarrow 1`, so the quantity\n", "will either be :math:`0` or a very small constant compared to :math:`e^{px}`.\n", "This means when :math:`x` is a very large positive number, either overflow\n", "occurs or the approximation will be off as described by the relative error." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 15\n", "===========\n", "\n", "Lemma 4.3.1 asserts that a minimizer of\n", "\n", ".. math::\n", "\n", " f(x) = f(x_1, x_2) = \\frac{1}{2} (x_1^2 - x_2)^2 + \\frac{1}{2} (1 - x_1)^2\n", "\n", "needs to satisfy\n", "\n", ".. math::\n", "\n", " \\nabla f(x) =\n", " \\begin{bmatrix}\n", " 2 x_1^3 - 2 x_1 x_2 + x_1 - 1\\\\\n", " x_2 - x_1^2\n", " \\end{bmatrix} = 0.\n", "\n", ":math:`x = [1, 1]` is one such minimizer. By definition 4.1.4, the Hessian is\n", "\n", ".. math::\n", "\n", " \\nabla^2 f(x) =\n", " \\begin{bmatrix} 6 x_1^2 - 2 x_2 + 1 & -2 x_1\\\\ -2 x_1 & 1 \\end{bmatrix} = 0.\n", "\n", "Based on the results of :math:`f(x_0)` and :math:`f(x_1)`, it does seem like a\n", "good step." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy\n", "\n", "def f(v):\n", " x = v.flat\n", " return 0.5 * (x[0]**2 - x[1])**2 + 0.5 * (1 - x[0])**2\n", "\n", "def fp(v):\n", " x = v.flat\n", " return numpy.asmatrix([2 * x[0]**3 - 2 * x[0] * x[1] + x[0] - 1,\n", " x[1] - x[0]**2]).T\n", "\n", "def fpp(v):\n", " x = v.flat\n", " return numpy.asmatrix([[6 * x[0]**2 - 2 * x[1] + 1, -2 * x[0]],\n", " [-2 * x[0], 1]])\n", "\n", "x_k = numpy.asmatrix([2, 2]).T\n", "for i in range(2):\n", " print('x_{0}: {1}'.format(i, f(x_k)))\n", " _ = x_k - numpy.linalg.inv(fpp(x_k)) * fp(x_k)\n", " x_k = _" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 16\n", "===========\n", "\n", "Given :math:`f(x) = \\sin(x)`, :math:`f'(x) = \\cos(x)` and\n", ":math:`f''(x) = -\\sin(x)`.\n", "\n", "Applying Newton's method yields\n", "\n", ".. math::\n", "\n", " x_1 = x_0 - \\frac{\\cos(x_0)}{-sin(x_0)} = x_0 + \\frac{1}{\\tan(x_0)}.\n", "\n", "Define :math:`0 < \\epsilon \\ll 1`. When :math:`x_0 = -\\epsilon`,\n", "\n", ".. math::\n", "\n", " x_1 &= -\\epsilon + \\frac{1}{\\tan(-\\epsilon)}\\\\\n", " &\\cong -\\epsilon - \\frac{1}{\\epsilon}\n", " & \\quad & \\text{small-angle approximation of tangent}\\\\\n", " &= -\\frac{\\epsilon^2 + 1}{\\epsilon}\\\\\n", " &\\cong -\\frac{1}{\\epsilon}.\n", "\n", "When :math:`x_0 = \\epsilon`,\n", "\n", ".. math::\n", "\n", " x_1 &= \\epsilon - \\frac{1}{\\tan(\\epsilon)}\\\\\n", " &\\cong \\epsilon - \\frac{1}{\\epsilon}\n", " & \\quad & \\text{small-angle approximation of tangent}\\\\\n", " &= \\frac{\\epsilon^2 - 1}{\\epsilon}\\\\\n", " &\\cong -\\frac{1}{\\epsilon}." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 17\n", "===========\n", "\n", "(a)\n", "---\n", "\n", "The gradient and Hessian of :math:`f_1(x) = -x_1^2 - x_2^2` are respectively\n", "\n", ".. math::\n", "\n", " \\nabla f_1(x) = \\begin{bmatrix} -2 x_1\\\\ -2 x_2 \\end{bmatrix}\n", " \\quad \\text{and} \\quad\n", " \\nabla^2 f_1(x) = \\begin{bmatrix} -2 & 0\\\\ 0 & -2 \\end{bmatrix}.\n", "\n", "Since :math:`\\nabla^2 f_1(x)` has an eigenvalue :math:`\\lambda = -2` of\n", "multiplicity :math:`2`, it is not positive definite. This means\n", "\n", ".. math::\n", "\n", " H_k = \\nabla^2 f_1(x) + \\mu_k I = (\\mu_k - 2) I\n", " \\quad \\Rightarrow \\quad\n", " H_k^{-1} = \\frac{1}{\\mu_k - 2} I\n", "\n", "and\n", "\n", ".. math::\n", "\n", " x_{k + 1} = x_k - H_k^{-1}(x_k) \\nabla f_1(x_k) =\n", " \\begin{bmatrix}\n", " x_{k_1} + x_{k_1} \\left( \\frac{\\mu_k}{2} - 1 \\right)^{-1}\\\\\n", " x_{k_2} + x_{k_2} \\left( \\frac{\\mu_k}{2} - 1 \\right)^{-1}\n", " \\end{bmatrix}.\n", "\n", "From the assumption of algorithm A5.5.1, the denominator is positive. Thus\n", "the algorithm will grow towards to :math:`\\pm \\infty` depending on\n", ":math:`\\text{sgn}(x_0)` unless :math:`x_0 = x_*`.\n", "\n", "(b)\n", "---\n", "\n", "The gradient and Hessian of :math:`f_2(x) = x_1^2 - x_2^2` are respectively\n", "\n", ".. math::\n", "\n", " \\nabla f_2(x) = \\begin{bmatrix} 2 x_1\\\\ -2 x_2 \\end{bmatrix}\n", " \\quad \\text{and} \\quad\n", " \\nabla^2 f_2(x) = \\begin{bmatrix} 2 & 0\\\\ 0 & -2 \\end{bmatrix}.\n", "\n", "Since the eigenvalues of :math:`\\nabla^2 f_2(x)` are :math:`\\lambda_0 = 2` and\n", ":math:`\\lambda_1 = -2`, it is not positive definite. This means\n", "\n", ".. math::\n", "\n", " H_k = \\nabla^2 f_2(x) + \\mu_k I\n", " \\quad \\Rightarrow \\quad\n", " H_k^{-1} =\n", " \\begin{bmatrix}\n", " \\frac{1}{\\mu_k + 2} & 0\\\\\n", " 0 & \\frac{1}{\\mu_k -2}\n", " \\end{bmatrix}\n", "\n", "and\n", "\n", ".. math::\n", "\n", " x_{k + 1} = x_k - H_k^{-1}(x_k) \\nabla f_1(x_k) =\n", " \\begin{bmatrix}\n", " x_{k_1} - x_{k_1} \\left( \\frac{\\mu_k}{2} + 1 \\right)^{-1}\\\\\n", " x_{k_2} + x_{k_2} \\left( \\frac{\\mu_k}{2} - 1 \\right)^{-1}\n", " \\end{bmatrix}\n", "\n", "From the assumption of algorithm A5.5.1, the denominator is positive. Thus\n", "the algorithm will grow towards to :math:`\\pm \\infty` depending on\n", ":math:`\\text{sgn}(x_0)` unless :math:`x_{0_2} = 0`." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 18\n", "===========" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy\n", "\n", "def f(x):\n", " return 0.5 * (x[0]**2 - x[1])**2 + 0.5 * (1 - x[0])**2\n", "\n", "def fp(v):\n", " x = v.flat\n", " return numpy.asmatrix([2 * x[0]**3 - 2 * x[0] * x[1] + x[0] - 1,\n", " x[1] - x[0]**2]).T\n", "\n", "def fpp(v):\n", " x = v.flat\n", " return numpy.asmatrix([[6 * x[0]**2 - 2 * x[1] + 1, -2 * x[0]],\n", " [-2 * x[0], 1]])\n", "\n", "def afpp(v, root_eta):\n", " x_c = v.flat\n", " h = root_eta * x_c\n", " e = [numpy.asfarray([1, 0]), numpy.asfarray([0, 1])]\n", "\n", " A = numpy.asmatrix([[0.0, 0.0], [0.0, 0.0]])\n", " for i in range(2):\n", " for j in range(2):\n", " A[i,j] = f(x_c + h[i] * e[i] + h[j] * e[j])\n", " A[i,j] -= f(x_c + h[i] * e[i])\n", " A[i,j] -= f(x_c + h[j] * e[j])\n", " A[i,j] += f(x_c)\n", " A[i,j] /= h[i] * h[j]\n", " return A\n", "\n", "x_k = numpy.asmatrix([2.0, 2.0]).T\n", "\n", "solutions = [[], [], []]\n", "\n", "for s in range(3):\n", " x_k = numpy.asmatrix([2.0, 2.0]).T\n", " for i in range(8):\n", " if 0 == s:\n", " A = fpp(x_k)\n", " elif 1 == s:\n", " A = afpp(x_k, numpy.finfo(float).eps**(1/2))\n", " else:\n", " A = afpp(x_k, numpy.finfo(float).eps**(1/3))\n", " _ = x_k - numpy.linalg.inv(A) * fp(x_k)\n", " x_k = _\n", " solutions[s].append(f(x_k).flat[0])\n", "\n", "row_layout = '{0:<25} {1:<25} {2:<25}'\n", "print(row_layout.format('Analytic', 'macheps^(1/2)', 'macheps^(1/3)'))\n", "for i in range(len(solutions[0])):\n", " print(row_layout.format(solutions[0][i], solutions[1][i], solutions[2][i]))" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 19\n", "===========\n", "\n", "Recall that the Hessian of :math:`f \\colon \\mathbb{R}^n \\rightarrow \\mathbb{R}`\n", "is equal to the Jacobian of :math:`\\nabla f`.\n", "\n", ".. math::\n", "\n", " x_+ - x_* &= x_c - A_c^{-1} g_c - x_*\\\\\n", " &= A_c^{-1} [ A_c (x_c - x_*) - g_c ]\\\\\n", " &= A_c^{-1}\n", " \\left[\n", " \\left( A_c - \\nabla^2 f(x_c) \\right) (x_c - x_*) - g_c +\n", " \\nabla^2 f(x_c) (x_c - x_*)\n", " \\right]\\\\\n", " &= A_c^{-1}\n", " \\left[\n", " \\left( A_c - \\nabla^2 f(x_c) \\right) (x_c - x_*) -\n", " (g_c - \\nabla f(x_c)) + \\nabla f(x_*) - \\nabla f(x_c) -\n", " \\nabla^2 f(x_c) (x_* - x_c)\n", " \\right]\\\\\n", " \\left\\Vert x_+ - x_* \\right\\Vert\n", " &= \\left\\Vert A_c^{-1}\n", " \\left[\n", " \\left( A_c - \\nabla^2 f(x_c) \\right) (x_c - x_*) -\n", " (g_c - \\nabla f(x_c)) + \\nabla f(x_*) - \\nabla f(x_c) -\n", " \\nabla^2 f(x_c) (x_* - x_c)\n", " \\right] \\right\\Vert\\\\\n", " \\left\\Vert x_+ - x_* \\right\\Vert\n", " &\\leq \\left\\Vert A_c^{-1} \\right\\Vert\n", " \\left[\n", " \\left\\Vert A_c - \\nabla^2 f(x_c) \\right\\Vert\n", " \\left\\Vert x_c - x_* \\right\\Vert +\n", " \\left\\Vert -(g_c - \\nabla f(x_c)) \\right\\Vert +\n", " \\left\\Vert\n", " \\nabla f(x_*) - \\nabla f(x_c) - \\nabla^2 f(x_c) (x_* - x_c)\n", " \\right\\Vert\n", " \\right]\\\\\n", " \\left\\Vert x_+ - x_* \\right\\Vert\n", " &\\leq \\left\\Vert A_c^{-1} \\right\\Vert\n", " \\left[\n", " \\left\\Vert g_c - \\nabla f(x_c) \\right\\Vert +\n", " \\left\\Vert A_c - \\nabla^2 f(x_c) \\right\\Vert\n", " \\left\\Vert e_c \\right\\Vert +\n", " \\frac{\\gamma}{2} \\left\\Vert e_c^2 \\right\\Vert\n", " \\right]\n", " & \\quad & \\text{Lemma 4.1.12.}" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 20\n", "===========\n", "\n", "A5.6.3, A5.6.4, and A5.6.2 uses :math:`n + 1`, :math:`2n`, and\n", ":math:`1 + 2n + n^2 / 2` function evaluations respectively. The overlap between\n", "each of these algorithms is at most :math:`n` function evaluations. Therefore,\n", "\n", ".. math::\n", "\n", " \\text{A5.6.3} + \\text{A5.6.2} =\n", " (n + 1) + (1 + 2n + n^2 / 2) - n = 2 + 2n + n^2 / 2\n", "\n", "and\n", "\n", ".. math::\n", "\n", " \\text{A5.6.4} + \\text{A5.6.2} =\n", " (2n) + (1 + 2n + n^2 / 2) - n = 1 + 3n + n^2 / 2.\n", "\n", "The combined algorithm may yield answers with less accuracy and possibly lose\n", "cache locality due to more complex code." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. rubric:: References\n", "\n", ".. bibliography:: chapter-05.bib" ] } ], "metadata": { "anaconda-cloud": {}, "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }