{ "cells": [ { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "*************************************************\n", "Secant Methods for Systems of Nonlinear Equations\n", "*************************************************" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 1\n", "==========\n", "\n", "Given :math:`s_i, y_i \\in \\mathbb{R}^n` for :math:`i = 1, \\ldots, n` where\n", ":math:`s_i` are linearly independent, :math:`A \\in \\mathbb{R}^{n \\times n}` can\n", "be uniquely determined using the secant equations\n", "\n", ".. math::\n", "\n", " \\begin{bmatrix}\n", " A_{1,1} & \\cdots & A_{1, n}\\\\\n", " \\vdots & \\ddots & \\vdots\\\\\n", " A_{n,1} & \\cdots & A_{n, n}\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " s_1 & \\cdots & s_n\n", " \\end{bmatrix}\n", " &= \\begin{bmatrix} y_1 & \\cdots & y_n \\end{bmatrix}\\\\\n", " A S &= Y\\\\\n", " A &= Y S^{-1}.\n", "\n", "This is a bad way to form the model because it only works when :math:`S` is\n", "invertible i.e. the columns of :math:`S` are linearly independent." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 2\n", "==========\n", "\n", "Let :math:`A = \\begin{bmatrix} 4 & 1\\\\ -1 & 1 \\end{bmatrix}`,\n", ":math:`s = \\begin{bmatrix} 1\\\\ 0 \\end{bmatrix}`, and\n", ":math:`y = \\begin{bmatrix} 5\\\\ -2 \\end{bmatrix}`.\n", "\n", "One way to solve this is to invoke Lemma 8.1.1 since its assumptions are\n", "satisfied. The following derivations does not use that lemma.\n", "\n", "Given :math:`M \\in Q(y, s)`, its form must be\n", ":math:`\\begin{bmatrix} 5 & a\\\\ -2 & b\\end{bmatrix}` where\n", ":math:`a, b \\in \\mathbb{R}`. The minimization problem becomes\n", "\n", ".. math::\n", "\n", " \\min_{M \\in Q(y, s)} \\left\\Vert M - A \\right\\Vert =\n", " \\min_{a, b} \\left\\Vert\n", " \\begin{bmatrix} 1 & a - 1\\\\ -1 & b - 1 \\end{bmatrix}\n", " \\right\\Vert.\n", "\n", "(a)\n", "---\n", "\n", "Minimizing over the :math:`l_2` operator norm (3.1.8b) gives\n", "\n", ".. math::\n", "\n", " \\left\\Vert M - A \\right\\Vert_2\n", " &= \\max_{\\sqrt{\\lambda}}\n", " \\begin{bmatrix} 2 & a - b\\\\ a - b & (a - 1)^2 + (b - 1)^2 \\end{bmatrix}\\\\\n", " &= \\max_{\\lambda} \\left\\{\n", " \\frac{a^2}{2} - a + \\frac{b^2}{2} - b + 2 \\pm \\frac{1}{2}\n", " \\sqrt{\\left( a^2 + b^2 \\right) \\left( a^2 - 4a + b^2 - 4b + 8 \\right)}\n", " \\right\\}^{1/2}\\\\\n", " &= \\left(\n", " \\frac{a^2}{2} - a + \\frac{b^2}{2} - b + 2 + \\frac{1}{2}\n", " \\sqrt{\\left( a^2 + b^2 \\right) \\left( a^2 - 4a + b^2 - 4b + 8 \\right)}\n", " \\right)^{1/2}\\\\\n", " &= \\frac{1}{\\sqrt{2}} \\left(\n", " (a - 1)^2 + (b - 1)^2 + 2 +\n", " \\sqrt{a^2 + b^2} \\sqrt{(a - 2)^2 + (b - 2)^2}\n", " \\right)^{1/2}\\\\\n", " &= \\frac{1}{\\sqrt{2}} \\left(\n", " 2 (\\beta - 1)^2 + 2 + 2 \\sqrt{\\beta^2 \\left( \\beta - 2 \\right)^2}\n", " \\right)^{1/2} \\qquad\n", " & \\quad & a = b = \\beta\\\\\n", " &= \\left(\n", " (\\beta - 1)^2 + 1 + \\sqrt{\\beta^2 \\left( \\beta - 2 \\right)^2}\n", " \\right)^{1/2}.\n", "\n", "Plotting the last equation shows that the function is at a global minimum\n", ":math:`\\sqrt{2}` when :math:`\\beta \\in [0, 2]`.\n", "\n", "Replacing :math:`M` with\n", ":math:`A_+ = A + \\begin{bmatrix} 1 & \\alpha\\\\ -1 & \\alpha \\end{bmatrix}` yields\n", "\n", ".. math::\n", "\n", " \\left\\Vert A_+ - A \\right\\Vert_2\n", " &= \\max_{\\sqrt{\\lambda}}\n", " \\begin{bmatrix} 2 & 0\\\\ 0 & 2 \\alpha^2 \\end{bmatrix}\\\\\n", " &= \\max_{\\lambda} \\left\\{\n", " \\alpha^2 + 1 \\pm \\sqrt{(\\alpha - 1)^2 (\\alpha + 1)^2}\n", " \\right\\}^{1/2}\\\\\n", " &= \\left(\n", " \\alpha^2 + 1 + \\sqrt{(\\alpha - 1)^2 (\\alpha + 1)^2}\n", " \\right)^{1/2}\\\\\n", " &= \\left(\n", " (\\alpha - 1) (\\alpha + 1) + 2 + \\sqrt{(\\alpha - 1)^2 (\\alpha + 1)^2}\n", " \\right)^{1/2}.\n", "\n", "Plotting the last equation shows that the function is at a global minimum\n", ":math:`\\sqrt{2}` when :math:`\\alpha \\in [-1, 1]`.\n", "\n", "(b)\n", "---\n", "\n", "Minimizing over the Frobenius matrix norm (3.1.6) yields\n", "\n", ".. math::\n", "\n", " \\left\\Vert M - A \\right\\Vert_F =\n", " \\sqrt{1^2 + (a - 1)^2 + (b - 1)^2 + (-1)^2}\n", "\n", "and\n", "\n", ".. math::\n", "\n", " \\left\\Vert A_+ - A \\right\\Vert_F = \\sqrt{1^2 + 2 \\alpha^2 + (-1)^2}\n", "\n", "with both achieving a global minimum :math:`\\sqrt{2}` when :math:`a = b = 1` and\n", ":math:`\\alpha = 0` respectively.\n", "\n", "(c)\n", "---\n", "\n", "If :math:`F(x)` is linear in :math:`x_2`, then the Frobenius matrix norm should\n", "be used because there is a single minimum which serves to isolate the changes\n", "in :math:`x_2`." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 3\n", "==========\n", "\n", "Even though Broyden's method converges to a root, the relative error with\n", "respect to :math:`\\frac{\\partial f_2}{\\partial x_1}` is huge; the others are\n", "within :math:`[0.0, 0.5]` with :math:`0.0` applicable only to the linear\n", "function :math:`f_1`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy\n", "\n", "def broydens_update(F, A_k, x_k):\n", " s_k = numpy.linalg.solve(A_k, -F(x_k))\n", " x_next = x_k + s_k\n", " y_k = F(x_next) - F(x_k)\n", " A_next = A_k + (y_k - A_k * s_k) * s_k.T / (s_k.T * s_k)[0]\n", " return A_next, x_next\n", "\n", "def F(x):\n", " x = x.flat\n", " return numpy.asmatrix([[x[0] + x[1] - 3],\n", " [x[0]**2 + x[1]**2 - 9]])\n", "\n", "def J(x):\n", " x = x.flat\n", " return numpy.asmatrix([[1, 1],\n", " [2 * x[0], 2 * x[1]]])\n", "\n", "macheps = numpy.sqrt(numpy.finfo(numpy.float64).eps)\n", "x_k = numpy.asmatrix([[2], [7]])\n", "A_k = J(x_k)\n", "for i in range(16):\n", " _ = broydens_update(F, A_k, x_k)\n", " norm = numpy.linalg.norm(_[1] - x_k)\n", " print('L2-Norm(x_{} - x_{}): {}'.format(i + 1, i, norm))\n", " if norm <= macheps:\n", " A_k, x_k = _\n", " break\n", " A_k, x_k = _\n", "print('\\nx_k: {}\\n'.format(numpy.asarray(x_k).squeeze()))\n", "print('A_k:\\n{}\\n'.format(A_k))\n", "print('J(x_k):\\n{}'.format(J(x_k)))" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 4\n", "==========\n", "\n", "Let :math:`F \\colon \\mathbb{R}^n \\rightarrow \\mathbb{R}^n` for which\n", ":math:`f_1, \\ldots, f_m` are linear with :math:`m < n`.\n", "\n", "Let :math:`A_0 = J(x_0)` where :math:`x_0 \\in \\mathbb{R}^n`. As illustrated in\n", "(2.2.1), the extension of the local affine approximation to higher dimensions\n", "(5.1.1) will be exact for :math:`f_1, \\ldots, f_m` such that\n", "\n", ".. math::\n", "\n", " f_1(x_1) = \\cdots = f_m(x_1) = 0\n", " \\quad \\text{where} \\quad\n", " x_1 = x_0 + s_0\n", " \\quad \\text{with} \\quad\n", " s_0 = -A_0^{-1} F(x_0) = -J_0^{-1}(x_0) F(x_0).\n", "\n", "The next step in Broyden's method computes\n", "\n", ".. math::\n", "\n", " A_1 &= A_0 + \\frac{y_0 - A_0 s_0}{s_0^\\top s_0} s_0^\\top\\\\\n", " &= J(x_0) + \\frac{F(x_1)}{s_0^\\top s_0} s_0^\\top\n", " & \\quad & y_0 = F(x_1) - F(x_0) \\text{ and } A_0 s_0 = -F(x_0).\n", "\n", "By inspection, the first :math:`m` rows of the outer product\n", ":math:`F(x_1) s_0^\\top` contains only zeros such that\n", ":math:`(A_1)_i = (A_0)_i = J(x_0)_i` for :math:`i = 1, \\ldots, m`.\n", "\n", "An equivalent view of Newton's method is\n", "\n", ".. math::\n", "\n", " (M_c)_i(x_c + s) = f_i(x_c) + \\nabla f_i(x_c)^\\top s\n", "\n", "where :math:`M_c` denotes the current model. From the intuition of the method\n", "in lower dimensions, it is clear that\n", "\n", ".. math::\n", "\n", " f_1(x_k) = \\cdots = f_m(x_k) = 0 \\text{ for } k \\geq 1\n", "\n", "and consequently\n", "\n", ".. math::\n", "\n", " (A_k)_i = (A_0)_i \\text{ for } k \\geq 0 \\text{ and } i = 1, \\ldots, m.\n", "\n", "For the finite difference case, Theorem 5.4.1 states that the\n", ":math:`i^\\text{th}` component has an error of :math:`\\mathcal{O}(h_i)` when\n", "using forward differencing (5.6.5) and :math:`\\mathcal{O}(h_i^2)` when using\n", "central differencing (5.6.6)." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _dennis1996numerical-ex-8.5:\n", "\n", "Exercise 5\n", "==========\n", "\n", "Let :math:`F(x) = Jx + b` where :math:`J \\in \\mathbb{R}^{n \\times n}`,\n", ":math:`b \\in \\mathbb{R}^n`, and :math:`x \\in \\mathbb{R}^n`.\n", "\n", "The proof is by contradiction. Assume :math:`x_+, x_c \\in \\mathbb{R}^n` and\n", ":math:`x_+ \\neq x_c` with\n", "\n", ".. math::\n", "\n", " s_c &= x_+ - x_c,\\\\\\\\\n", " y_c &= F(x_+) - F(x_c),\\\\\\\\\n", " J \\not\\in Q(y_c, s_c)\n", " &= \\left\\{ B \\in \\mathbb{R}^{n \\times n} \\mid Bs = y \\right\\}.\n", "\n", "Consequently, the following must hold\n", "\n", ".. math::\n", "\n", " y_c &= F(x_+) - F(x_c)\\\\\n", " &= J x_+ + b - J x_c - b\\\\\n", " &= J (x_+ - x_c)\\\\\n", " &= J s_c,\n", "\n", "but this contradicts the initial assumptions. Thus :math:`J \\in Q(y_c, s_c)`." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 6\n", "==========\n", "\n", "Let the assumptions of Lemma 8.2.1 be satisfied and define\n", "\n", ".. math::\n", "\n", " J_+ &\\triangleq J(x_+),\\\\\\\\\n", " F_+ &\\triangleq F(x_+),\\\\\\\\\n", " F_c &\\triangleq F(x_c),\\\\\\\\\n", " s_c &= x_+ - x_c,\\\\\\\\\n", " y_c &= F_+ - F_c.\n", "\n", ".. math::\n", "\n", " A_+ &= A_c + \\frac{\\left( y_c - A_c s_c \\right) s_c^\\top}{s_c^\\top s_c}\n", " & \\quad & \\text{(8.1.5)}\\\\\n", " A_+ - J_+\n", " &= A_c - J_+ + \\frac{\\left( y_c - A_c s_c \\right) s_c^\\top}{s_c^\\top s_c}\\\\\n", " &= \\left( A_c - J_+ \\right)\n", " \\left[ I - \\frac{s_c s_c^\\top}{s_c^\\top s_c} \\right] +\n", " \\frac{\\left( y_c - J_+ s_c \\right) s_c^\\top}{s_c^\\top s_c}\n", " & \\quad & \\text{(8.2.4)}\\\\\n", " \\left\\Vert A_+ - J_+ \\right\\Vert\n", " &= \\left\\Vert\n", " \\left( A_c - J_+ \\right)\n", " \\left[ I - \\frac{s_c s_c^\\top}{s_c^\\top s_c} \\right] +\n", " \\frac{\\left( y_c - J_+ s_c \\right) s_c^\\top}{s_c^\\top s_c}\n", " \\right\\Vert\\\\\n", " &\\leq \\left\\Vert\n", " \\left( A_c - J_+ \\right)\n", " \\left[ I - \\frac{s_c s_c^\\top}{s_c^\\top s_c} \\right]\n", " \\right\\Vert +\n", " \\left\\Vert\n", " \\frac{\\left( y_c - J_+ s_c \\right) s_c^\\top}{s_c^\\top s_c}\n", " \\right\\Vert\n", " & \\quad & \\text{(3.1.1c)}\\\\\n", " &\\leq \\left\\Vert A_c - J_+ \\right\\Vert\n", " \\left\\Vert I - \\frac{s_c s_c^\\top}{s_c^\\top s_c} \\right\\Vert_2 +\n", " \\left\\Vert\n", " \\frac{\\left( y_c - J_+ s_c \\right) s_c^\\top}{s_c^\\top s_c}\n", " \\right\\Vert\n", " & \\quad & \\text{(3.1.10) and (3.1.15)}\\\\\n", " &\\leq \\left\\Vert A_c - J_c + J_c - J_+ \\right\\Vert +\n", " \\left\\Vert y_c - J_+ s_c \\right\\Vert_2\n", " \\left\\Vert \\frac{s_c}{s_c^\\top s_c} \\right\\Vert_2\n", " & \\quad & \\text{(3.1.17)}\\\\\n", " &\\leq \\left\\Vert A_c - J_c \\right\\Vert + \\left\\Vert J_c - J_+ \\right\\Vert +\n", " \\left\\Vert F_+ - F_c - J_+ s_c \\right\\Vert_2\n", " \\left\\Vert s_c \\right\\Vert_2^{-1}\n", " & \\quad & \\text{(3.1.1c) and (3.1.2c)}\\\\\n", " &\\leq \\left\\Vert A_c - J_c \\right\\Vert +\n", " \\gamma \\left\\Vert x_c - x_+ \\right\\Vert +\n", " \\frac{\\gamma}{2} \\left(\n", " \\left\\Vert x_+ - x_+ \\right\\Vert_2 +\n", " \\left\\Vert x_c - x_+ \\right\\Vert_2\n", " \\right)\n", " \\left\\Vert s_c \\right\\Vert_2\n", " \\left\\Vert s_c \\right\\Vert_2^{-1}\n", " & \\quad & J(x) \\in \\text{Lip}_\\gamma(D) \\text{ and Lemma 4.1.15}\\\\\n", " &\\leq \\left\\Vert A_c - J_c \\right\\Vert +\n", " \\frac{3 \\gamma}{2} \\left\\Vert x_+ - x_c \\right\\Vert_2\n", " & \\quad & \\text{(3.1.1c), (3.1.2c), and (3.1.6).}\n", "\n", "For a proof of (3.1.1c), see :ref:`Exercise 3.4 `.\n", ":ref:`Exercise 7 ` may be helpful towards\n", "understanding the application of (3.1.17)." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _dennis1996numerical-ex-8.7:\n", "\n", "Exercise 7\n", "==========\n", "\n", "Suppose :math:`s \\in \\mathbb{R}^n \\setminus \\{ 0 \\}` and\n", ":math:`I \\in \\mathbb{R}^{n \\times n}` is the identity matrix.\n", "\n", "In order to simplify the derivations, rewrite the original problem as\n", "\n", ".. math::\n", "\n", " I - \\frac{s s^\\top}{s^\\top s} =\n", " I - \\frac{s s^\\top}{\\left\\Vert s \\right\\Vert_2^2} =\n", " I - u u^\\top\n", "\n", "where :math:`u = \\frac{s}{\\left\\Vert s \\right\\Vert_2}` and :math:`u^\\top u = 1`.\n", "\n", "By definition, the resulting (projection) matrix and its summands are\n", "idempotent\n", "\n", ".. math::\n", "\n", " \\left( I - u u^\\top \\right) \\left( I - u u^\\top \\right)\n", " &= I - 2u u^\\top + u u^\\top u u^\\top\\\\\n", " &= I - 2u u^\\top + u u^\\top\\\\\n", " &= I - u u^\\top\n", "\n", "and symmetric\n", "\n", ".. math::\n", "\n", " \\left( I - u u^\\top \\right)^\\top = I - u u^\\top.\n", "\n", "The rank of the matrix is\n", "\n", ".. math::\n", "\n", " \\text{rank}(I - u u^\\top)\n", " &= \\text{tr}(I - u u^\\top)\n", " & \\quad & \\text{the trace of an idempotent matrix is its rank}\\\\\n", " &= \\text{tr}(I) - \\text{tr}(u u ^\\top)\n", " & \\quad & \\text{trace is a linear mapping}\\\\\n", " &= n - \\sum_{i = 1}^n u_i^2\\\\\n", " &= n - u^\\top u\\\\\n", " &= n - 1.\n", "\n", "Let :math:`A = I - u u^\\top`. Recall that the relation between an eigenvalue\n", ":math:`\\lambda` and a non-zero eigenvector :math:`v` for :math:`A` is\n", ":math:`Av = \\lambda v`. Since :math:`A` is idempotent, the following must hold:\n", "\n", ".. math::\n", "\n", " \\lambda v = Av = A^2 v = A (Av) &= \\lambda^2 v\\\\\n", " \\boldsymbol{0} &= \\lambda (\\lambda - 1) v\n", "\n", "i.e. the eigenvalues of :math:`A` must be either :math:`0` or :math:`1`.\n", "\n", "By Theorem 3.5.6 and\n", "\n", ".. math::\n", "\n", " \\text{rank}(A) = n - 1 = \\text{tr}(A) = \\sum_{i = 1}^n \\lambda_i\n", "\n", "where :math:`\\lambda_i` are the eigenvalues of :math:`A`,\n", "\n", ".. math::\n", "\n", " \\left\\Vert I - \\frac{s s^\\top}{s^\\top s} \\right\\Vert_2 =\n", " \\max_{1 \\leq i \\leq n} \\left\\vert \\lambda_i \\right\\vert = 1." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 8\n", "==========\n", "\n", "Let :math:`D \\subseteq \\mathbb{R}^n` be an open convex set,\n", ":math:`F \\colon \\mathbb{R}^n \\rightarrow \\mathbb{R}^n`,\n", ":math:`J(x) \\in \\text{Lip}_\\gamma(D)`, :math:`x_* \\in D`, and :math:`J(x_*)` be\n", "nonsingular.\n", "\n", "Let :math:`\\{ A_k \\in \\mathbb{R}^{n \\times n} \\}` be a sequence of nonsingular\n", "matrices, and suppose for some :math:`x_0 \\in D` that the sequence of points\n", "generated by :math:`x_{k + 1} = x_k - A_k^{-1} F(x_k)` remains in :math:`D`\n", "and satisfies :math:`\\lim_{k \\rightarrow \\infty} x_k = x_*` for any :math:`k`.\n", "\n", "Define :math:`J(x_k) = J_k`, :math:`e_k = x_k - x_*`, :math:`F(x_k) = F_k`, and\n", ":math:`s_k \\triangleq x_{k + 1} - x_k`.\n", "\n", "A useful fact is\n", "\n", ".. math::\n", "\n", " \\lim_{k \\rightarrow \\infty} \\left\\Vert s_k \\right\\Vert\n", " &= \\lim_{k \\rightarrow \\infty} \\left\\Vert x_{k + 1} - x_k \\right\\Vert\\\\\n", " &= \\lim_{k \\rightarrow \\infty} \\left\\Vert x_* - x_* \\right\\Vert\\\\\n", " &= \\left\\Vert \\lim_{k \\rightarrow \\infty} x_* - x_* \\right\\Vert\n", " & \\quad & \\left\\Vert \\cdot \\right\\Vert \\text{ is continuous}\\\\\n", " &= 0.\n", "\n", "(a)\n", "---\n", "\n", "If :math:`\\lim_{k \\rightarrow \\infty}\n", "\\frac{\\left\\Vert (A_k - J_k) s_k \\right\\Vert}{\\left\\Vert s_k \\right\\Vert} = 0`\n", "where :math:`s_k \\triangleq x_{k + 1} - x_k`, then :math:`\\{ x_k \\}` converges\n", ":math:`q`-superlinearly to :math:`x_*` in some norm\n", ":math:`\\left\\Vert \\cdot \\right\\Vert` and :math:`F_* = 0`.\n", "\n", ".. math::\n", "\n", " x_{k + 1} &= x_k - A_k^{-1} F_k\n", " & \\quad & \\text{(8.2.15)}\\\\\n", " 0 &= A_k s_k + F_k\\\\\n", " 0 &= \\left( A_k - J_k \\right) s_k + F_k + J_k s_k\\\\\n", " -F_{k + 1} &= \\left( A_k - J_k \\right) s_k - F_{k + 1} + F_k + J_k s_k\\\\\n", " \\left\\Vert -F_{k + 1} \\right\\Vert\n", " &= \\left\\Vert\n", " \\left( A_k - J_k \\right) s_k - F_{k + 1} + F_k + J_k s_k\n", " \\right\\Vert\\\\\n", " \\left\\Vert F_{k + 1} \\right\\Vert\n", " &\\leq \\left\\Vert \\left( A_k - J_k \\right) s_k \\right\\Vert +\n", " \\left\\Vert F_{k + 1} - F_k - J_k s_k \\right\\Vert\n", " & \\quad & \\text{(3.1.1c) and (3.1.10)}\\\\\n", " \\frac{\\left\\Vert F_{k + 1} \\right\\Vert}{\\left\\Vert s_k \\right\\Vert}\n", " &\\leq \\frac{\n", " \\left\\Vert \\left( A_k - J_k \\right) s_k \\right\\Vert\n", " }{\n", " \\left\\Vert s_k \\right\\Vert\n", " } +\n", " \\frac{\\gamma}{2}\\left\\Vert s_k \\right\\Vert\n", " & \\quad & \\text{Lemma 4.1.15}\\\\\n", " \\lim_{k \\rightarrow \\infty}\n", " \\frac{\\left\\Vert F_{k + 1} \\right\\Vert}{\\left\\Vert s_k \\right\\Vert}\n", " &\\leq 0\n", " & \\quad & \\text{see initial assumptions.}\n", "\n", "Thus :math:`F_* = \\lim_{k \\rightarrow \\infty} F_k = 0`. The sequence\n", ":math:`\\{ x_k \\}` converges :math:`q`-superlinearly because\n", "\n", ".. math::\n", "\n", " 0 &= \\lim_{k \\rightarrow \\infty}\n", " \\frac{\\left\\Vert F_{k + 1} \\right\\Vert}{\\left\\Vert s_k \\right\\Vert}\\\\\n", " &= \\lim_{k \\rightarrow \\infty}\n", " \\frac{\n", " \\left\\Vert F_{k + 1} - F_* \\right\\Vert\n", " }{\n", " \\left\\Vert x_{k + 1} - x_k \\right\\Vert\n", " }\\\\\n", " &\\geq \\lim_{k \\rightarrow \\infty} \\alpha\n", " \\frac{\n", " \\left\\Vert e_{k + 1} \\right\\Vert\n", " }{\n", " \\left\\Vert x_{k + 1} - x_* + x_* - x_k \\right\\Vert\n", " }\n", " & \\quad & \\text{Lemma 4.1.16}\\\\\n", " &\\geq \\lim_{k \\rightarrow \\infty} \\alpha\n", " \\frac{\n", " \\left\\Vert e_{k + 1} \\right\\Vert\n", " }{\n", " \\left\\Vert e_{k + 1} \\right\\Vert + \\left\\Vert e_k \\right\\Vert\n", " }\n", " & \\quad & \\text{(3.1.1c)}\\\\\n", " &= \\lim_{k \\rightarrow \\infty} \\alpha \\frac{r_k}{1 + r_k}\n", " & \\quad & \\text{(2.3.2)}\n", "\n", "where :math:`r_k \\triangleq\n", "\\left\\Vert e_{k + 1} \\right\\Vert / \\left\\Vert e_k \\right\\Vert`.\n", "\n", "(b)\n", "---\n", "\n", "If :math:`\\{ x_k \\}` converges :math:`q`-superlinearly to :math:`x_*` in some\n", "norm :math:`\\left\\Vert \\cdot \\right\\Vert` and :math:`F(x_*) = 0`, then\n", ":math:`\\lim_{k \\rightarrow \\infty}\n", "\\frac{\\left\\Vert (A_k - J_k) s_k \\right\\Vert}{\\left\\Vert s_k \\right\\Vert} = 0`\n", "where :math:`s_k \\triangleq x_{k + 1} - x_k`.\n", "\n", ".. math::\n", "\n", " 0 &= \\lim_{k \\rightarrow \\infty}\n", " \\frac{\\left\\Vert e_{k + 1} \\right\\Vert}{\\left\\Vert e_k \\right\\Vert}\n", " & \\quad & \\text{(2.3.2)}\\\\\n", " &\\geq \\lim_{k \\rightarrow \\infty}\n", " \\frac{\n", " \\left\\Vert F_{k + 1} \\right\\Vert\n", " }{\n", " \\beta \\left\\Vert e_k \\right\\Vert\n", " }\n", " & \\quad & \\text{Lemma 4.1.16 states that }\n", " \\left\\Vert F_{k + 1} - F_* \\right\\Vert \\leq\n", " \\beta \\left\\Vert e_{k + 1} \\right\\Vert\\\\\n", " &= \\lim_{k \\rightarrow \\infty} \\beta^{-1}\n", " \\frac{\\left\\Vert F_{k + 1} \\right\\Vert}{\\left\\Vert s_k \\right\\Vert}\n", " \\frac{\\left\\Vert s_k \\right\\Vert}{\\left\\Vert e_k \\right\\Vert}.\n", "\n", "Applying Lemma 8.2.3 shows that :math:`\\lim_{k \\rightarrow \\infty}\n", "\\frac{\\left\\Vert F_{k + 1} \\right\\Vert}{\\left\\Vert s_k \\right\\Vert} = 0`. Thus\n", "\n", ".. math::\n", "\n", " x_{k + 1} &= x_k - A_k^{-1} F_k\n", " & \\quad & \\text{(8.2.15)}\\\\\n", " 0 &= A_k s_k + F_k\\\\\n", " 0 &= \\left( A_k - J_k \\right) s_k + F_k + J_k s_k\\\\\n", " -\\left( A_k - J_k \\right) s_k &= F_{k + 1} - F_{k + 1} + F_k + J_k s_k\\\\\n", " \\left\\Vert -\\left( A_k - J_k \\right) s_k \\right\\Vert\n", " &= \\left\\Vert F_{k + 1} - F_{k + 1} + F_k + J_k s_k \\right\\Vert\\\\\n", " \\left\\Vert \\left( A_k - J_k \\right) s_k \\right\\Vert\n", " &\\leq \\left\\Vert F_{k + 1} \\right\\Vert +\n", " \\left\\Vert F_{k + 1} - F_k - J_k s_k \\right\\Vert\n", " & \\quad & \\text{(3.1.1c) and (3.1.10)}\\\\\n", " \\frac{\n", " \\left\\Vert \\left( A_k - J_k \\right) s_k \\right\\Vert\n", " }{\n", " \\left\\Vert s_k \\right\\Vert\n", " } &\\leq \\frac{\\left\\Vert F_{k + 1} \\right\\Vert}{\\left\\Vert s_k \\right\\Vert} +\n", " \\frac{\\gamma}{2} \\left\\Vert s_k \\right\\Vert\n", " & \\quad & \\text{Lemma 4.1.15}\\\\\n", " \\lim_{k \\rightarrow \\infty}\n", " \\frac{\n", " \\left\\Vert \\left( A_k - J_k \\right) s_k \\right\\Vert\n", " }{\n", " \\left\\Vert s_k \\right\\Vert\n", " }\n", " &\\leq 0\n", " & \\quad & \\text{see initial assumptions.}" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _dennis1996numerical-ex-8.9:\n", "\n", "Exercise 9\n", "==========\n", "\n", "The purpose of Example 8.2.6 is to illustrate that Broyden's method can get\n", "relatively close to the true Jacobian.\n", "\n", "This exercise is more along the lines of Lemma 8.2.7 where the relative error\n", "is much greater: in this case it is between :math:`39\\%` and :math:`620\\%`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def F(x):\n", " x = x.flat\n", " return numpy.asmatrix([[x[0]**2 + x[1]**2 - 2],\n", " [numpy.exp(x[0] - 1) + x[1]**3 - 2]])\n", "\n", "def J(x):\n", " x = x.flat\n", " return numpy.asmatrix([[2 * x[0], 2 * x[1]],\n", " [numpy.exp(x[0] - 1), 3 * x[1]**2]])\n", "\n", "macheps = numpy.sqrt(numpy.finfo(numpy.float64).eps)\n", "x_k = numpy.asmatrix([[2], [3]])\n", "A_k = J(x_k)\n", "for i in range(16):\n", " _ = broydens_update(F, A_k, x_k)\n", " norm = numpy.linalg.norm(_[1] - x_k)\n", " print('L2-Norm(x_{} - x_{}): {}'.format(i + 1, i, norm))\n", " if norm <= macheps:\n", " A_k, x_k = _\n", " break\n", " A_k, x_k = _\n", "\n", "print('\\nx_k: {}\\n'.format(numpy.asarray(x_k).squeeze()))\n", "print('A_k:\\n{}\\n'.format(A_k))\n", "print('J(x_k):\\n{}'.format(J(x_k)))\n", "\n", "for index, (a, j) in enumerate(zip(A_k.flat, J(x_k).flat)):\n", " _ = 100 * abs(a - j) / abs(j)\n", " print('\\nRelative Error at {}: {:0.2f}%'.format(index, _))" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _dennis1996numerical-ex-8.10:\n", "\n", "Exercise 10\n", "===========\n", "\n", "The following extension of (2.2.1) to higher dimensions (5.1.1) will be useful\n", "in the proof that follows.\n", "\n", "Suppose :math:`F(x) = Ax + b` is an affine transformation where\n", ":math:`A \\in \\mathbb{R}^{n \\times n}` and :math:`F'(x) = A = J(x)` is\n", "nonsingular. For all :math:`k \\geq 1`, :math:`F(x_k) = 0` because\n", "\n", ".. math::\n", "\n", " x_{k + 1} &= x_k - s_k = x_k - J^{-1}(x_k) F(x_k)\\\\\n", " x_1 &= x_0 - A^{-1} (A x_0 + b) = -A^{-1} b\\\\\n", " x_2 &= x_1 - A^{-1} \\left[ A (-A^{-1} b) + b \\right] = -A^{-1} b.\n", "\n", "Let :math:`F(x) = \\begin{bmatrix} J_1 x + b_1\\\\ F_2(x) \\end{bmatrix}` where\n", ":math:`J_1 \\in \\mathbb{R}^{m \\times n}`, :math:`b_1 \\in \\mathbb{R}^m`, and\n", ":math:`F_2 \\colon \\mathbb{R}^n \\rightarrow \\mathbb{R}^{n - m}` is nonlinear.\n", "\n", "(a)\n", "---\n", "\n", "In Broyden's method, the initial :math:`A_0` can be calculated analytically as\n", "\n", ".. math::\n", "\n", " A_0 = J(x) &= \\frac{F(x)}{\\partial x^\\top}\\\\\n", " &= \\begin{bmatrix} J_1\\\\ F'_2(x) \\end{bmatrix}\n", " & \\quad & F'_2(x) = \\frac{\\partial F_2(x)}{\\partial x^\\top}\\\\\n", " &= \\begin{bmatrix} A_{01}\\\\ A_{02} \\end{bmatrix}\n", " & \\quad & A_{01} = J_1 \\text{ and } A_{02} = F'_2(x).\n", "\n", "For the finite difference case, Theorem 5.4.1 states that :math:`i\\text{th}`\n", "component has an error of :math:`\\mathcal{O}(h_i)` when using forward difference\n", "(5.6.5) and :math:`\\mathcal{O}(h_i^2)` when using central difference (5.6.6).\n", "\n", ".. math::\n", "\n", " A_1 &= A_0 + \\frac{y_0 - A_0 s_0}{s_0^\\top s_0} s_0^\\top\\\\\n", " &= J(x_0) + \\frac{F(x_1)}{s_0^\\top s_0} s_0^\\top\n", " & \\quad y_k = F(x_{k + 1}) - F(x_k)\\\\\n", " &= \\begin{bmatrix} J_1\\\\ F'_2(x_0) \\end{bmatrix} +\n", " \\left\\Vert s_0 \\right\\Vert_2^{-2}\n", " \\begin{bmatrix} J_1 x_1 + b_1\\\\ F_2(x_1) \\end{bmatrix} s_0^\\top\n", "\n", "By inspection, the first :math:`m` rows of the outer product\n", ":math:`F(x_1) s_0^\\top` contains only zeros. Thus by induction\n", ":math:`A_{k1} = J_1 \\forall k \\geq 0`.\n", "\n", "(b)\n", "---\n", "\n", "Generalizing the update sequence\n", "\n", ".. math::\n", "\n", " A_{k + 1} &= A_k + \\frac{y_k - A_k s_k}{s_k^\\top s_k} s_k^\\top\\\\\n", " &= \\begin{bmatrix} J_1\\\\ A_{k2} \\end{bmatrix} +\n", " \\left\\Vert s_k \\right\\Vert_2^{-2}\n", " \\begin{bmatrix} J_1 x_k + b_1\\\\ F_2(x_k) \\end{bmatrix} s_k^\\top\n", "\n", "gives\n", "\n", ".. math::\n", "\n", " A_{(k + 1)2} = A_{k2} + \\left\\Vert s_k \\right\\Vert_2^{-2} F_2(x_k) s_k^\\top.\n", "\n", "Consequently,\n", "\n", ".. math::\n", "\n", " \\left( A_{k2} - A_{12} \\right) J_1^\\top\n", " &= \\left(\n", " \\sum_{i = 1}^{k - 1}\n", " \\left\\Vert s_i \\right\\Vert_2^{-2} F_2(x_i) s_i^\\top\n", " \\right) J_1^\\top\\\\\n", " &= \\sum_{i = 1}^{k - 1}\n", " \\left\\Vert s_i \\right\\Vert_2^{-2} F_2(x_i)\n", " \\left( J_1 s_i \\right)^\\top\\\\\n", " &= \\sum_{i = 1}^{k - 1}\n", " \\left\\Vert s_i \\right\\Vert_2^{-2} F_2(x_i)\n", " \\left[ J_1 \\left( x_{i + 1} - x_i \\right) \\right]^\\top\\\\\n", " &= 0 & \\quad & J_1 x_k + b_1 = 0 \\, \\forall \\, k \\geq 1\n", "\n", "implies the sequence :math:`\\{ A_{k2} \\}` will converge to the correct value\n", ":math:`F'_2(x_*)` when the nonlinear solution intersects with the affine\n", "solution :math:`J_1`." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 11\n", "===========\n", "\n", "The relative error for :math:`x_0 = (0.5, 0.5)^\\top` is between\n", ":math:`2.8\\%` and :math:`43.6\\%`, which is not as terrible as the starting\n", "point for :ref:`Exercise 9 `." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def F(x):\n", " x = x.flat\n", " return numpy.asmatrix([[x[0]**2 + x[1]**2 - 2],\n", " [numpy.exp(x[0] - 1) + x[1]**3 - 2]])\n", "\n", "def J(x):\n", " x = x.flat\n", " return numpy.asmatrix([[2 * x[0], 2 * x[1]],\n", " [numpy.exp(x[0] - 1), 3 * x[1]**2]])\n", "\n", "macheps = numpy.sqrt(numpy.finfo(numpy.float64).eps)\n", "x_k = numpy.asmatrix([[0.5], [0.5]])\n", "A_k = J(x_k)\n", "for i in range(16):\n", " _ = broydens_update(F, A_k, x_k)\n", " norm = numpy.linalg.norm(_[1] - x_k)\n", " print('L2-Norm(x_{} - x_{}): {}'.format(i + 1, i, norm))\n", " if norm <= macheps:\n", " A_k, x_k = _\n", " break\n", " A_k, x_k = _\n", "\n", "print('\\nx_k: {}\\n'.format(numpy.asarray(x_k).squeeze()))\n", "print('A_k:\\n{}\\n'.format(A_k))\n", "print('J(x_k):\\n{}'.format(J(x_k)))\n", "\n", "for index, (a, j) in enumerate(zip(A_k.flat, J(x_k).flat)):\n", " _ = 100 * abs(a - j) / abs(j)\n", " print('\\nRelative Error at {}: {:0.2f}%'.format(index, _))" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12\n", "===========\n", "\n", "Let :math:`\\hat{x} = D_x x` and :math:`\\hat{F}(\\hat{x}) = D_F F(x)`. The\n", "following are useful for the proof.\n", "\n", ".. math::\n", "\n", " \\hat{A}_k \\hat{s}_k = -\\hat{F}(\\hat{x}_k) = -D_F F(x) = D_F A_k s_k\n", "\n", ".. math::\n", "\n", " \\hat{s}_k = \\hat{x}_{k + 1} - \\hat{x}_k = D_x x_{k + 1} - D_x x_k = D_x s_k\n", "\n", ".. math::\n", "\n", " \\hat{y}_k =\n", " \\hat{F}(\\hat{x}_{k + 1}) - \\hat{F}(\\hat{x}_k) =\n", " D_F F(x_{k + 1}) - D_F F(x_k) =\n", " D_F y_k\n", "\n", "Suppose :math:`A_k \\simeq J_k`.\n", "\n", ".. math::\n", "\n", " \\hat{A}_{k + 1}\n", " &= \\hat{A}_k +\n", " \\frac{\n", " \\left( \\hat{y}_k \\hat{A}_k \\hat{s}_k \\right) \\hat{s}_k^\\top\n", " }{\n", " \\hat{s}_k^\\top \\hat{s}_k\n", " }\\\\\\\\\n", " D_F J_{k + 1} D_x^{-1}\n", " &= D_F J_k D_x^{-1} +\n", " \\frac{\n", " \\left( D_F y_k - D_F A_k s_k \\right) s_k^\\top D_x\n", " }{\n", " s_k^\\top D_x^2 s_k\n", " }\\\\\\\\\n", " A_{k + 1}\n", " &= A_k +\n", " \\frac{\\left( y_k - A_k s_k \\right) s_k^\\top D_x^2}{s_k^\\top D_x^2 s_k}\n", "\n", "where the second equality is based on the foregoing results and\n", ":ref:`Exercise 7.4 `." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _dennis1996numerical-ex-8.13:\n", "\n", "Exercise 13\n", "===========\n", "\n", "Suppose :math:`u, v \\in \\mathbb{R}^n` and :math:`A \\in \\mathbb{R}^{n \\times n}`\n", "is nonsingular.\n", "\n", "Recall that the matrix determinant lemma states that\n", "\n", ".. math::\n", "\n", " \\DeclareMathOperator{\\det}{det}\n", " \\det\\left(\n", " \\begin{bmatrix} I & 0\\\\ v^\\top & 1 \\end{bmatrix}\n", " \\begin{bmatrix} I + uv^\\top & u\\\\ 0 & 1 \\end{bmatrix}\n", " \\begin{bmatrix} I & 0\\\\ -v^\\top & 1 \\end{bmatrix}\n", " \\right)\n", " &= \\det\\left(\n", " \\begin{bmatrix} I & u\\\\ 0 & 1 + v^\\top u \\end{bmatrix}\n", " \\right)\\\\\n", " \\det\\left( I + uv^\\top \\right)\n", " &= \\det\\left( 1 + v^\\top u \\right).\n", "\n", "(a)\n", "---\n", "\n", ".. math::\n", "\n", " \\det\\left( A + uv^\\top \\right)\n", " &= \\det\\left( A \\left[I + A^{-1} uv^\\top \\right] \\right)\\\\\n", " &= \\det(A) \\det\\left(I + \\left( A^{-1} u \\right) v^\\top \\right)\\\\\n", " &= \\det(A) \\det\\left(1 + v^\\top A^{-1} u \\right)\n", " & \\quad & \\text{matrix determinant lemma}\n", "\n", "(b)\n", "---\n", "\n", "Observe that\n", ":math:`\\left( I + uv^\\top \\right)^{-1} = I - \\frac{u v^\\top}{\\sigma}` where\n", ":math:`\\sigma \\triangleq 1 + v^\\top u` because\n", "\n", ".. math::\n", "\n", " \\left( I + uv^\\top \\right) \\left( I + uv^\\top \\right)^{-1}\n", " &= \\left( I + uv^\\top \\right) \\left( I - \\frac{1}{\\sigma} u v^\\top \\right)\\\\\n", " &= I - \\frac{uv^\\top}{\\sigma} + uv^\\top - \\frac{uv^\\top uv^\\top}{\\sigma}\\\\\n", " &= I + uv^\\top - \\frac{\\left( 1 + v^\\top u \\right) uv^\\top}{\\sigma}\\\\\n", " &= I.\n", "\n", "Applying the foregoing result gives\n", "\n", ".. math::\n", "\n", " \\left[ A + uv^\\top \\right]^{-1}\n", " &= \\left[ A \\left(I + A^{-1} uv^\\top \\right) \\right]^{-1}\\\\\n", " &= \\left(I + A^{-1} uv^\\top \\right)^{-1} A^{-1}\\\\\n", " &= \\left(I + \\frac{A^{-1} uv^\\top}{1 + v^\\top A^{-1} u} \\right) A^{-1}\\\\\n", " &= A^{-1} + \\frac{1}{\\sigma} A^{-1} uv^\\top A^{-1}." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 14\n", "===========\n", "\n", "See the previous exercises for a basic implementation of Broyden's method.\n", "\n", "Problems that are not smooth and have varying magnitudes in each dimension could\n", "cause the secant approximation to fail to converge." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 15\n", "===========\n", "\n", "(a)\n", "---\n", "\n", "Suppose :math:`B \\in Q(y, s)` such that :math:`B s = y` where\n", ":math:`s \\neq 0` and :math:`B` is nonsingular.\n", "\n", ".. math::\n", "\n", " \\left\\Vert A_+^{-1} - A^{-1} \\right\\Vert\n", " &= \\left\\Vert\n", " \\frac{\\left( s - A^{-1} y \\right) y^\\top}{y^\\top y}\n", " \\right\\Vert\n", " & \\quad & \\text{(8.4.2)}\\\\\n", " &= \\left\\Vert\n", " \\frac{\\left( B^{-1} - A^{-1} \\right) y y^\\top}{y^\\top y}\n", " \\right\\Vert\\\\\n", " &\\leq \\left\\Vert B^{-1} - A^{-1} \\right\\Vert\n", " \\left\\Vert \\frac{y y^\\top}{y^\\top y} \\right\\Vert\n", " & \\quad & \\text{(8.1.6)}\\\\\n", " &= \\left\\Vert B^{-1} - A^{-1} \\right\\Vert\n", " & \\quad & \\text{(8.1.7)}\n", "\n", "(b)\n", "---\n", "\n", "Suppose :math:`A` is nonsingular and :math:`y^\\top A s \\neq 0`.\n", "\n", ".. math::\n", "\n", " \\left( A_+^{-1} \\right)^{-1}\n", " &= \\left(\n", " A^{-1} + \\frac{\\left( s - A^{-1} y \\right) y^\\top}{y^\\top y}\n", " \\right)^{-1}\n", " & \\quad & \\text{(8.4.2)}\\\\\n", " &= A -\n", " \\left(\n", " 1 + \\frac{y^\\top A \\left( s - A^{-1} y \\right)}{y^\\top y}\n", " \\right)^{-1}\n", " A \\frac{\\left( s - A^{-1} y \\right) y^\\top}{y^\\top y} A\n", " & \\quad & \\text{Lemma 8.3.1}\\\\\n", " &= A -\n", " \\left( \\frac{y^\\top As}{y^\\top y} \\right)^{-1}\n", " \\frac{\\left( As - y \\right) y^\\top A}{y^\\top y}\\\\\n", " A_+ &= A + \\frac{\\left( y - As \\right) y^\\top A}{y^\\top As}" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 16\n", "===========\n", "\n", "The bad Broyden update is strongly influenced by the scaling of a norm's\n", "residuals. The storage requirement is twice that of the good update. It\n", "does not exploit sparsity. See :cite:`griewank2012broyden` for an excellent\n", "exposition on this matter." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 17\n", "===========\n", "\n", "Suppose :math:`A_k \\in \\mathbb{R}^{n \\times n}`,\n", ":math:`s_i, y_i \\in \\mathbb{R}^n` for :math:`i = k - 1, k` where\n", ":math:`s_{k - 1}` and :math:`s_k` linearly independent, and\n", ":math:`A_k s_{k - 1} = y_{k - 1}`.\n", "\n", "(a)\n", "---\n", "\n", ".. math::\n", "\n", " A_{k + 1} s_{k - 1}\n", " &= \\left(\n", " A_k + \\frac{\\left( y_k - A_k s_k \\right) v_k^\\top}{v_k^\\top s_k}\n", " \\right) s_{k - 1}\\\\\n", " &= A_k s_{k - 1} +\n", " \\frac{\\left( y_k - A_k s_k \\right) v_k^\\top s_{k - 1}}{v_k^\\top s_k}\\\\\n", " &= y_{k - 1} +\n", " \\frac{\\left( y_k - A_k s_k \\right) v_k^\\top s_{k - 1}}{v_k^\\top s_k}\n", "\n", ":math:`v_k^\\top s_{k - 1} = 0` when :math:`v_k` is defined as the result of\n", "the Gram-Schmidt orthogonalization on the linearly independent set\n", ":math:`\\left\\{ s_{k - 1}, s_k \\right\\}`:\n", "\n", ".. math::\n", "\n", " v_k = s_k - \\frac{s_k^\\top s_{k - 1}}{s_{k - 1}^\\top s_{k - 1}} s_{k - 1}.\n", "\n", "(b)\n", "---\n", "\n", "Suppose\n", "\n", ".. math::\n", "\n", " v_k = s_k - \\frac{s_k^\\top s_{k - 1}}{s_{k - 1}^\\top s_{k - 1}} s_{k - 1}\n", "\n", "and :math:`B \\in Q(y_k, s_k) \\cap Q(y_{k - 1}, s_{k - 1})` such that\n", ":math:`B s_k = y_k` and :math:`B s_{k - 1} = y_{k - 1}`.\n", "\n", ".. math::\n", "\n", " \\left\\Vert A_{k + 1} - A_k \\right\\Vert\n", " &= \\left\\Vert\n", " \\frac{\\left( y_k - A_k s_k \\right) v_k^\\top}{v_k^\\top s_k}\n", " \\right\\Vert\n", " & \\quad & \\text{(8.4.5)}\\\\\n", " &= \\left\\Vert\n", " \\frac{\\left( B - A_k \\right) s_k v_k^\\top}{v_k^\\top s_k}\n", " \\right\\Vert\\\\\n", " &= \\left\\Vert\n", " \\frac{\\left( B - A_k \\right) v_k v_k^\\top}{v_k^\\top s_k}\n", " \\right\\Vert\n", " & \\quad & \\text{(b.1)}\\\\\n", " &\\leq \\left\\Vert B - A_k \\right\\Vert\n", " \\left\\Vert \\frac{v_k v_k^\\top}{v_k^\\top s_k} \\right\\Vert_2\n", " & \\quad & \\text{(3.1.10), (3.1.15), (3.1.17)}\\\\\n", " &= \\left\\Vert B - A_k \\right\\Vert\n", " & \\quad & \\text{(b.2)}\n", "\n", "Since the Frobenius norm is strictly convex and :math:`Q(y, s)` is an affine\n", "subset of :math:`\\mathbb{R}^{n \\times n}` (hence convex), (8.4.5) gives a\n", "unique solution.\n", "\n", "(b.1)\n", "-----\n", "\n", ".. math::\n", "\n", " \\left( B - A_k \\right) s_k\n", " &= \\left( B - A_k \\right)\n", " \\left(\n", " v_k + \\frac{s_k^\\top s_{k - 1}}{s_{k - 1}^\\top s_{k - 1}} s_{k - 1}\n", " \\right)\\\\\n", " &= \\left( B - A_k \\right) v_k +\n", " \\frac{s_k^\\top s_{k - 1}}{s_{k - 1}^\\top s_{k - 1}}\n", " \\left( B s_{k - 1} - A_k s_{k - 1} \\right)\\\\\n", " &= \\left( B - A_k \\right) v_k +\n", " \\frac{s_k^\\top s_{k - 1}}{s_{k - 1}^\\top s_{k - 1}}\n", " \\left( y_{k - 1} - y_{k - 1} \\right)\\\\\n", " &= \\left( B - A_k \\right) v_k\n", "\n", "(b.2)\n", "-----\n", "\n", ".. math::\n", "\n", " v_k^\\top s_k\n", " &= v_k^\\top \\left(\n", " v_k + \\frac{s_k^\\top s_{k - 1}}{s_{k - 1}^\\top s_{k - 1}} s_{k - 1}\n", " \\right)\\\\\n", " &= v_k^\\top v_k +\n", " \\frac{s_k^\\top s_{k - 1}}{s_{k - 1}^\\top s_{k - 1}} v_k^\\top s_{k - 1}\\\\\n", " &= \\left\\Vert v_k \\right\\Vert_2^2\n", " & \\quad & \\text{(a) shows that } v_k^\\top s_{k - 1} = 0\n", "\n", "Define :math:`u_k = \\frac{v_k}{\\left\\Vert v_k \\right\\Vert_2}`. Applying\n", "(3.1.17) yields\n", "\n", ".. math::\n", "\n", " \\left\\Vert \\frac{v_k v_k^\\top}{v_k^\\top s_k} \\right\\Vert_2 =\n", " \\left\\Vert u_k u_k^\\top \\right\\Vert_2 =\n", " \\left\\Vert u_k \\right\\Vert_2^2 = 1." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 18\n", "===========\n", "\n", "Suppose for some :math:`m < n`, :math:`A_k \\in \\mathbb{R}^{n \\times n}`,\n", ":math:`s_j, y_j \\in \\mathbb{R}^n` for :math:`j = k - m, \\ldots, k`,\n", ":math:`s_{k - m}, \\ldots, s_k` are linearly independent, and\n", ":math:`A_k s_i = y_i` for :math:`i = k - m, \\ldots, k - 1`.\n", "\n", "(a)\n", "---\n", "\n", ".. math::\n", "\n", " A_{k + 1} s_i\n", " &= \\left(\n", " A_k + \\frac{\\left( y_k - A_k s_k \\right) v_k^\\top}{v_k^\\top s_k}\n", " \\right) s_i\\\\\n", " &= A_k s_i +\n", " \\frac{\\left( y_k - A_k s_k \\right) v_k^\\top}{v_k^\\top s_k} s_i\\\\\n", " &= y_i + \\frac{\\left( y_k - A_k s_k \\right)}{v_k^\\top s_k} v_k^\\top s_i\n", "\n", ":math:`v_k^\\top s_i = 0` when :math:`v_k` is defined as the result of the\n", "Gram-Schmidt orthogonalization on the linearly independent set\n", ":math:`\\left\\{ s_{k - m}, \\ldots, s_k \\right\\}`.\n", "\n", ".. math::\n", "\n", " v_k =\n", " s_k - \\sum_{i = k - m}^{k - 1} \\text{proj}_{v_i}(s_k) =\n", " s_k - \\sum_{i = k - m}^{k - 1} \\frac{s_k^\\top v_i}{v_i^\\top v_i} v_i\n", "\n", "because\n", "\n", ".. math::\n", "\n", " v_k^\\top s_i\n", " &= v_k^\\top \\left(\n", " v_i + \\sum_{z = k - m}^{k - 1} \\text{proj}_{v_z}(s_i)\n", " \\right)\\\\\n", " &= v_k^\\top v_i +\n", " \\sum_{z = k - m}^{k - 1} \\frac{s_i^\\top v_z}{v_z^\\top v_z} v_k^\\top v_z\\\\\n", " &= 0.\n", "\n", "(b)\n", "---\n", "\n", "Define :math:`v_k` as in (a) and :math:`B \\in \\bigcap_{j} Q(y_j, s_j)` such that\n", ":math:`B s_j = y_j`.\n", "\n", ".. math::\n", "\n", " \\left\\Vert A_{k + 1} - A_k \\right\\Vert\n", " &= \\left\\Vert\n", " \\frac{\\left( y_k - A_k s_k \\right) v_k^\\top}{v_k^\\top s_k}\n", " \\right\\Vert\n", " & \\quad & \\text{(8.4.5)}\\\\\n", " &= \\left\\Vert\n", " \\frac{\\left( B - A_k \\right) s_k v_k^\\top}{v_k^\\top s_k}\n", " \\right\\Vert\\\\\n", " &= \\left\\Vert\n", " \\frac{\\left( B - A_k \\right) v_k v_k^\\top}{v_k^\\top s_k}\n", " \\right\\Vert\n", " & \\quad & \\text{(b.1)}\\\\\n", " &\\leq \\left\\Vert B - A_k \\right\\Vert\n", " \\left\\Vert \\frac{v_k v_k^\\top}{v_k^\\top s_k} \\right\\Vert_2\n", " & \\quad & \\text{(3.1.10), (3.1.15), (3.1.17)}\\\\\n", " &= \\left\\Vert B - A_k \\right\\Vert\n", " & \\quad & \\text{(b.2)}\n", "\n", "Since the Frobenius norm is strictly convex and :math:`Q(y, s)` is an affine\n", "subset of :math:`\\mathbb{R}^{n \\times n}` (hence convex), (8.4.5) gives a\n", "unique solution.\n", "\n", "(b.1)\n", "-----\n", "\n", ".. math::\n", "\n", " \\left( B - A_k \\right) s_k\n", " &= \\left( B - A_k \\right)\n", " \\left(\n", " v_k + \\sum_{i = k - m}^{k - 1} \\text{proj}_{v_i}(s_k)\n", " \\right)\\\\\n", " &= \\left( B - A_k \\right) v_k +\n", " \\sum_{i = k - m}^{k - 1}\n", " \\frac{s_k^\\top v_i}{v_i^\\top v_i}\n", " \\left( B v_i - A_k v_i \\right)\n", " & \\quad & \\text{(b.1.1)}\\\\\n", " &= \\left( B - A_k \\right) v_k\n", "\n", "(b.1.1)\n", "-------\n", "\n", "Proof via induction:\n", "\n", "For the base case :math:`v_{k - m} = s_{k - m}`,\n", "\n", ".. math::\n", "\n", " \\left( B - A_k \\right) v_{k - m} =\n", " B s_{k - m} - A_k s_{k - m} =\n", " y_{k - m} - y_{k - m} = 0.\n", "\n", "For the induction step, assume :math:`\\left( B - A_k \\right) v_z = 0` for\n", ":math:`z = k - m, \\ldots, k - 2`.\n", "\n", ".. math::\n", "\n", " \\left( B - A_k \\right) v_{k - 1}\n", " &= \\left( B - A_k \\right)\n", " \\left(\n", " s_{k - 1} - \\sum_{z = k - m}^{(k - 1) - 1} \\text{proj}_{v_z}(s_{k - 1})\n", " \\right)\\\\\n", " &= B s_{k - 1} - A_k s_{k - 1} -\n", " \\sum_{z = k - m}^{k - 2}\n", " \\frac{s_{k - 1}^\\top v_z}{v_z^\\top v_z}\n", " \\left( B - A_k \\right) v_z\\\\\n", " &= y_{k - 1} - y_{k - 1}\\\\\n", " &= 0\n", "\n", "(b.2)\n", "-----\n", "\n", ".. math::\n", "\n", " v_k^\\top s_k\n", " &= v_k^\\top \\left(\n", " v_k + \\sum_{i = k - m}^{k - 1} \\text{proj}_{v_i}(s_k)\n", " \\right)\\\\\n", " &= v_k^\\top v_k +\n", " \\sum_{i = k - m}^{k - 1} \\frac{s_k^\\top v_i}{v_i^\\top v_i} v_k^\\top v_i\\\\\n", " &= \\left\\Vert v_k \\right\\Vert_2^2\n", " & \\quad & \\text{(a) shows that } v_k^\\top v_i = 0\n", "\n", "Define :math:`u_k = \\frac{v_k}{\\left\\Vert v_k \\right\\Vert_2}`. Applying\n", "(3.1.17) yields\n", "\n", ".. math::\n", "\n", " \\left\\Vert \\frac{v_k v_k^\\top}{v_k^\\top s_k} \\right\\Vert_2 =\n", " \\left\\Vert u_k u_k^\\top \\right\\Vert_2 =\n", " \\left\\Vert u_k \\right\\Vert_2^2 = 1." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. rubric:: References\n", "\n", ".. bibliography:: chapter-08.bib" ] } ], "metadata": { "anaconda-cloud": {}, "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }