{
 "cells": [
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "**********************************************************************\n",
    "Newton's Method for Nonlinear Equations and Unconstrained Minimization\n",
    "**********************************************************************"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 1\n",
    "=========="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy\n",
    "\n",
    "def F(x):\n",
    "    return numpy.asfarray([2 * (x[0] + x[1])**2 + (x[0] - x[1])**2 - 8,\n",
    "                           5 * x[0]**2 + (x[1] - 3)**2 - 9])\n",
    "\n",
    "def J(x):\n",
    "    return numpy.asfarray([[6 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]],\n",
    "                           [10 * x[0], 2 * x[1] - 6]])\n",
    "\n",
    "x_c = numpy.asfarray([2, 0])\n",
    "for i in range(2):\n",
    "    s_k = numpy.dot(numpy.linalg.inv(J(x_c)), -F(x_c))\n",
    "    x_k = x_c + s_k\n",
    "    print('x_{0} = {1}'.format(i, x_c))\n",
    "    x_c = x_k"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 2\n",
    "==========\n",
    "\n",
    "The Jacobian becomes ill-conditioned in the second iteration i.e. it becomes the\n",
    "zero matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy\n",
    "\n",
    "def F(x):\n",
    "    return numpy.exp(x) - 1\n",
    "\n",
    "def J(x):\n",
    "    return numpy.asfarray([[numpy.exp(x[0]), 0],\n",
    "                           [0, numpy.exp(x[1])]])\n",
    "\n",
    "x_c = numpy.asfarray([-10, -10])\n",
    "for i in range(2):\n",
    "    _ = J(x_c)\n",
    "    s_k = numpy.dot(numpy.linalg.inv(_), -F(x_c))\n",
    "    x_k = x_c + s_k\n",
    "    print('Iteration {0}'.format(i))\n",
    "    print('x_{0} = {1}'.format(i, x_c))\n",
    "    print('s_{0} = {1}'.format(i, s_k))\n",
    "    print('J_{0} =\\n{1}'.format(i, _))\n",
    "    print('x_{0} = {1}'.format(i + 1, x_k))\n",
    "    x_c = x_k"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 3\n",
    "==========\n",
    "\n",
    "Define :math:`f_1(x) = x, f_2(x) = x^2 + x,` and :math:`f_3(x) = e^x - 1`.\n",
    "\n",
    "(a)\n",
    "---\n",
    "\n",
    ".. math::\n",
    "\n",
    "   f'_1(x) = 1 \\Rightarrow f'_1(x_* = 0) = 1\n",
    "\n",
    ".. math::\n",
    "\n",
    "   f'_2(x) = 2x + 1 \\Rightarrow f'_2(x_* = 0) = 1\n",
    "\n",
    ".. math::\n",
    "\n",
    "   f'_3(x) = e^x \\Rightarrow f'_3(x_* = 0) = 1\n",
    "\n",
    "(b)\n",
    "---\n",
    "\n",
    "The Lipschitz constant for each derivative at the root :math:`x_* = 0` is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left| f'_1(x) - f'_1(0) \\right| &\\leq \\gamma_1 \\left| x - 0 \\right|\\\\\n",
    "   \\left| \\frac{1 - 1}{x} \\right| &\\leq \\gamma_1\\\\\n",
    "   0 &\\leq \\gamma_1\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left| f'_2(x) - f'_2(0) \\right| &\\leq \\gamma_2 \\left| x - 0 \\right|\\\\\n",
    "   \\left| \\frac{2x + 1 - 1}{x} \\right| &\\leq \\gamma_2\\\\\n",
    "   2 &\\leq \\gamma_2\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left| f'_3(x) - f'_3(0) \\right| &\\leq \\gamma_3 \\left| x - 0 \\right|\\\\\n",
    "   \\left| \\frac{e^x - 1}{x} \\right| &\\leq \\gamma_3\\\\\n",
    "    &\\leq e^x \\leq \\gamma_3\n",
    "\n",
    "The last inequality holds because the Bernoulli inequality gives\n",
    "\n",
    ".. math::\n",
    "\n",
    "   1 + x \\leq \\left( 1 + \\frac{x}{n} \\right)^n\n",
    "   \\xrightarrow{n \\rightarrow \\infty} \\exp x.\n",
    "\n",
    "Note that :math:`e^x` is an upper bound when :math:`x > 0` and becomes a lower\n",
    "bound when :math:`x < 0`.\n",
    "\n",
    "(c)\n",
    "---\n",
    "\n",
    "The derivative of each function is bounded on the interval :math:`[-a, a]`, so\n",
    "each function is Lipschitz continuous.  The bound for each derivative is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left| f'_1(x) \\right| &= 1\\\\\n",
    "    &\\geq \\rho_1 = 1\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left| f'_2(x) \\right| &= \\left| 2x + 1 \\right|\\\\\n",
    "    &\\geq \\rho_2 = 0\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left| f'_3(x) \\right| &= \\left| e^x \\right|\\\\\n",
    "    &\\geq \\rho_3 = e^{-a}\n",
    "\n",
    "The smallest region of convergence is defined to be\n",
    ":math:`\\eta = \\min \\left\\{ \\hat{\\eta}, 2 \\rho / \\gamma \\right\\}` where\n",
    ":math:`\\hat{\\eta}` is defined to be the radius of the largest open interval\n",
    "around :math:`x_*`.\n",
    "\n",
    "(d)\n",
    "---\n",
    "\n",
    ":math:`f_1` will converge to :math:`x_* = 0` in the interval\n",
    ":math:`(-\\infty, \\infty)` since it only has one root and\n",
    ":math:`\\lim_{\\gamma \\rightarrow 0^+} \\frac{2 \\rho_1}{\\gamma_1} = \\infty`.\n",
    "\n",
    ":math:`f_2` violates the assumption that :math:`\\rho_2 > 0` at :math:`x = -0.5`\n",
    "causing Newton's method to fail because the derivative is zero.  Splitting the\n",
    "interval into :math:`(-\\infty, -0.5)` and :math:`(-0.5, \\infty)` sidesteps that\n",
    "issue.  Applying Theorem 2.4.3 to each region shows that any point in those\n",
    "regions will converge.\n",
    "\n",
    ":math:`f_3` will converge to :math:`x_* = 0` in the interval\n",
    ":math:`(-\\infty, \\infty)` since it only has one root in theory and\n",
    ":math:`\\lim_{a \\rightarrow \\infty} \\frac{2 \\rho_3}{\\gamma_3} =\n",
    "\\lim_{a \\rightarrow \\infty} 2 e^{-2a} = 0`.  However, the floating-point\n",
    "precision restricts the region to :math:`[-6.5, 709]`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy\n",
    "\n",
    "def Newtons_Method(f, f_prime, x_c):\n",
    "    return x_c - f(x_c) / f_prime(x_c)\n",
    "\n",
    "f = [lambda x: x,\n",
    "     lambda x: x**2 + x,\n",
    "     lambda x: numpy.exp(x) - 1\n",
    "    ]\n",
    "f_prime = [lambda x: 1,\n",
    "           lambda x: 2 * x + 1,\n",
    "           lambda x: numpy.exp(x)]\n",
    "x_c = [-7.5, -0.6, 7.5]\n",
    "for i in range(16):\n",
    "    for j in range(len(x_c)):\n",
    "        x_c[j] = Newtons_Method(f[j], f_prime[j], x_c[j])\n",
    "    print('x_{0} = {1}'.format(i + 1, x_c))"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 4\n",
    "==========\n",
    "\n",
    ".. math::\n",
    "\n",
    "   J(x) =\n",
    "   \\begin{bmatrix}\n",
    "     1 & 0 & 0\\\\\n",
    "     0 & 2 x_2 + 1 & 0\\\\\n",
    "     0 & 0 & e^{x_3}\n",
    "   \\end{bmatrix}\n",
    "\n",
    "(a)\n",
    "---\n",
    "\n",
    ".. math::\n",
    "\n",
    "   J(x_*) =\n",
    "   \\begin{bmatrix}\n",
    "     1 & 0 & 0\\\\\n",
    "     0 & 1 & 0\\\\\n",
    "     0 & 0 & 1\n",
    "   \\end{bmatrix}\n",
    "\n",
    "(b)\n",
    "---\n",
    "\n",
    "The Lipschitz constant on :math:`J(x) \\in \\text{Lip}_\\gamma(N(x_*, a))` is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert J(x) - J(x_*) \\right\\Vert\n",
    "    &\\leq \\gamma \\left\\Vert x - x_* \\right\\Vert\n",
    "          & \\quad & \\text{(4.1.11)}\\\\\n",
    "   \\left\\Vert\n",
    "     \\begin{bmatrix}\n",
    "       0 & 0 & 0\\\\\n",
    "       0 & 2 x_2 & 0\\\\\n",
    "       0 & 0 & e^{x_3} - 1\n",
    "     \\end{bmatrix}\n",
    "   \\right\\Vert\n",
    "    &< \\gamma a\n",
    "       & \\quad & d(x_*, x) < a\\\\\n",
    "   \\frac{\n",
    "     \\max \\left\\{\n",
    "       \\left\\vert 2 x_2 \\right\\vert,\n",
    "       \\left\\vert e^{x_3} - 1 \\right\\vert\n",
    "     \\right\\}\n",
    "   }{a} &< \\gamma\n",
    "           & \\quad & \\text{1-norm is an upper bound for the numerator}\n",
    "\n",
    "(c)\n",
    "---\n",
    "\n",
    "1-norm is the largest possible value for\n",
    ":math:`\\left\\Vert J(x_*)^{-1} \\right\\Vert`, so the region of convergence is\n",
    "defined as\n",
    ":math:`\\epsilon = \\min \\left\\{ a, \\frac{1}{2 \\beta \\gamma} \\right\\}` where\n",
    ":math:`0 < \\left\\Vert J(x_*)^{-1} \\right\\Vert \\leq 1 \\leq \\beta`.  Hence\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\epsilon =\n",
    "   \\min \\left\\{\n",
    "     a,\n",
    "     \\frac{a}{\n",
    "       \\max \\left\\{\n",
    "         \\left\\vert 4 x_2 \\right\\vert, \\left\\vert 2e^{x_3} - 2 \\right\\vert\n",
    "       \\right\\}\n",
    "     }\n",
    "   \\right\\}\n",
    "\n",
    "(d)\n",
    "---\n",
    "\n",
    "When :math:`(x_0)_3 = 0,\n",
    "\\epsilon = \\min \\left\\{ a, \\frac{a}{\\left\\vert 4 x_2 \\right\\vert} \\right\\}`.\n",
    "When :math:`(x_0)_2 = (x_0)_3 = 0, \\epsilon = a` because as\n",
    ":math:`(x_0)_2 \\rightarrow 0` or :math:`(x_0)_3 \\rightarrow 0`, the denominator\n",
    "grows infinitely large."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 5\n",
    "==========\n",
    "\n",
    "Proof by induction starts at the base case of :math:`x_0`, so the goal is to\n",
    "show :math:`x_1 = x_0 - J(x_0)^{-1} F(x_0)` is well-defined i.e. :math:`J(x_0)`\n",
    "is nonsingular.\n",
    "\n",
    "Since :math:`J(x_*)^{-1}` exists and\n",
    ":math:`\\left\\Vert J(x_*)^{-1} \\right\\Vert \\leq \\beta` where :math:`\\beta > 0` by\n",
    "assumption, :math:`J(x_*)` is nonsingular.  Theorem 3.14 states that\n",
    ":math:`J(x_0)` is nonsingular when\n",
    ":math:`\\left\\Vert J(x_*)^{-1} \\left( J(x_0) - J(x_*) \\right) \\right\\Vert < 1`\n",
    "holds.  Recall that :math:`J` is Holder continuous within\n",
    ":math:`N(x_*, r)` where :math:`r > 0` and :math:`\\alpha \\in (0, 1]` such that\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert J(x_*)^{-1} \\left( J(x_0) - J(x_*) \\right) \\right\\Vert\n",
    "    &\\leq \\left\\Vert\n",
    "            J(x_*)^{-1} \\right\\Vert \\left\\Vert J(x_0) - J(x_*)\n",
    "          \\right\\Vert\\\\\n",
    "    &\\leq \\beta \\gamma \\left\\Vert x_0 - x_* \\right\\Vert^\\alpha\\\\\n",
    "    &\\leq \\beta \\gamma r^\\alpha\n",
    "\n",
    "Picking :math:`\\epsilon = \\min \\left\\{ r,\n",
    "\\left( \\frac{\\alpha}{(1 + \\alpha) \\beta \\gamma} \\right)^{1 / \\alpha} \\right\\}`\n",
    "satisfies Theorem 3.14 in addition to revealing the elegant inequality\n",
    ":math:`\\beta \\gamma \\epsilon^\\alpha \\leq \\frac{\\alpha}{1 + \\alpha} < 1` that\n",
    "simplifies later derivations.  Thus :math:`J(x_0)` is nonsingular and its\n",
    "induced matrix norm is bounded above by\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert J(x_0) \\right\\Vert\n",
    "    &\\leq \\frac{\n",
    "            \\left\\Vert J(x_*)^{-1} \\right\\Vert\n",
    "          }{\n",
    "            1 -\n",
    "            \\left\\Vert J(x_*)^{-1} \\left( J(x_0) - J(x_*) \\right) \\right\\Vert\n",
    "          }\n",
    "          & \\quad & \\text{perturbation relation 3.1.20}\\\\\n",
    "    &\\leq \\beta \\frac{1}{1 - \\beta \\gamma \\epsilon^\\alpha}\\\\\n",
    "    &\\leq \\beta (1 + \\alpha).\n",
    "\n",
    "Observe that the sequence :math:`x_1, x_2, \\ldots` converges to :math:`x_*`\n",
    "because\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_1 - x_* &= x_0 - J(x_0)^{-1} F(x_0) - x_*\\\\\n",
    "    &= x_0 - J(x_0)^{-1} \\lbrack F(x_0) - F(x_*) \\rbrack - x_*\n",
    "       & \\quad & F(x_*) = 0\\\\\n",
    "    &= J(x_0)^{-1} \\lbrack F(x_*) - F(x_0) - J(x_0) (x_* - x_0) \\rbrack\\\\\n",
    "   \\left\\Vert x_1 - x_* \\right\\Vert\n",
    "    &= \\left\\Vert\n",
    "         J(x_0)^{-1} \\lbrack F(x_*) - F(x_0) - J(x_0) (x_* - x_0) \\rbrack\n",
    "       \\right\\Vert\\\\\n",
    "   \\left\\Vert x_1 - x_* \\right\\Vert\n",
    "    &\\leq \\left\\Vert J(x_0)^{-1} \\right\\Vert\n",
    "          \\left\\Vert F(x_*) - F(x_0) - J(x_0) (x_* - x_0) \\right\\Vert\n",
    "          & \\quad & \\text{(3.1.10)}\\\\\n",
    "    &\\leq \\beta (1 + \\alpha)\n",
    "          \\left\\Vert x_* - x_0 \\right\\Vert^{1 + \\alpha}\n",
    "          \\frac{\\gamma}{1 + \\alpha}\n",
    "          & \\quad & p = x_* - x_0\\\\\n",
    "    &= \\beta \\gamma \\left\\Vert x_0 - x_* \\right\\Vert^{1 + \\alpha}\n",
    "       & \\quad & \\text{(3.1.1b).}\n",
    "\n",
    "To understand why :math:`p = x_* - x_0`, see :ref:`Exercise 4.8 <dennis1996numerical-ex-4.8>`."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. _dennis1996numerical-ex-5.6:\n",
    "\n",
    "Exercise 6\n",
    "==========\n",
    "\n",
    "By definition of differentiability and continuity,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   f(t) = \\frac{\\gamma}{2} t^2 - \\frac{t}{\\beta} + \\frac{\\eta}{\\beta}\n",
    "\n",
    "is continuously differentiable :math:`\\forall t \\in \\mathbb{R}`,\n",
    ":math:`\\gamma, \\eta \\in \\mathbb{R}^+_0`, and :math:`\\beta \\in \\mathbb{R}^+`.\n",
    "Assuming :math:`t_0 = 0`, :math:`N(t_0, r)` is satisfied\n",
    ":math:`\\forall r \\in (0, \\infty)`.\n",
    "\n",
    "Since :math:`f(t): \\mathbb{R} \\rightarrow \\mathbb{R}`, the vector norm and\n",
    "induced operator norm reduce respectively to an absolute value and\n",
    ":math:`J(t) = f'(t) = \\gamma t - \\frac{1}{\\beta}`.\n",
    "\n",
    "Notice that :math:`J \\in \\text{Lip}_\\gamma(N(t_0, r))` because\n",
    ":math:`\\left\\vert f'(t) \\right\\vert <\n",
    "\\left\\vert \\gamma r \\right\\vert + \\left\\vert \\frac{1}{\\beta} \\right\\vert`.\n",
    "\n",
    "The initial conditions for the induced operator norms\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert J(t_0)^{-1} \\right\\Vert =\n",
    "   \\left\\vert \\left( \\gamma t_0 - \\frac{1}{\\beta} \\right)^{-1} \\right\\vert =\n",
    "   \\beta\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert J(t_0)^{-1} F(t_0) \\right\\Vert =\n",
    "   \\left\\vert\n",
    "     \\left( \\gamma t_0 - \\frac{1}{\\beta} \\right)^{-1}\n",
    "     \\left(\n",
    "       \\frac{\\gamma}{2} t_0^2 - \\frac{t_0}{\\beta} + \\frac{\\eta}{\\beta}\n",
    "     \\right)\n",
    "   \\right\\vert =\n",
    "   \\eta\n",
    "\n",
    "are satisfied by definition of :math:`f(t)`.\n",
    "\n",
    "Define :math:`\\gamma_\\text{Rel} = \\beta \\gamma, \\alpha = \\gamma_\\text{Rel} \\eta`\n",
    "for ease of notation.  Analytically solving for the roots of the univariate\n",
    "quadratic function :math:`f(t_*) = 0` gives\n",
    "\n",
    ".. math::\n",
    "\n",
    "   t_*\n",
    "    &= \\frac{\n",
    "         \\beta^{-1} \\pm \\sqrt{\\beta^{-2} - 2 \\gamma \\eta \\beta^{-1}}\n",
    "       }{\\gamma}\\\\\n",
    "    &= \\frac{\\beta^{-1} \\pm \\sqrt{\\frac{1 - 2 \\alpha}{\\beta^2}}}{\\gamma}\\\\\n",
    "    &= \\frac{1 \\pm \\sqrt{1 - 2 \\alpha}}{\\beta \\gamma}.\n",
    "\n",
    "Thus :math:`\\alpha \\leq \\frac{1}{2}` is a necessary condition for :math:`f(t)`\n",
    "to have real roots.\n",
    "\n",
    "The following results will be useful in proving :math:`\\{t_k\\}` converges.\n",
    "\n",
    "Proof of :math:`\\left\\vert f'(t_{k + 1}) \\right\\vert \\geq\n",
    "\\left\\vert \\frac{f'(t_k)}{2} \\right\\vert` via induction.  The following shows\n",
    "that the base case holds:\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert f'(t_1) \\right\\vert\n",
    "    &= \\left\\vert \\gamma t_1 - \\frac{1}{\\beta}\\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\gamma \\left( t_0 - J(t_0)^{-1} F(t_0) \\right) - \\frac{1}{\\beta}\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert \\gamma \\eta - \\frac{1}{\\beta}\\right\\vert\\\\\n",
    "    &= \\left\\vert \\frac{\\alpha - 1}{\\beta}\\right\\vert\\\\\n",
    "    &= \\frac{1 - \\alpha}{\\beta}\\\\\n",
    "    &\\geq \\left\\vert \\frac{f'(t_0)}{2} \\right\\vert\n",
    "          & \\quad & \\frac{1}{2 \\beta} \\leq\n",
    "                    \\frac{1 - \\alpha}{\\beta} \\leq \\frac{1}{\\beta}\n",
    "\n",
    "Assume :math:`\\left\\vert f'(t_k) \\right\\vert \\geq\n",
    "\\left\\vert \\frac{f'(t_{k - 1})}{2} \\right\\vert` holds.  The inductive step holds\n",
    "because\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert f'(t_{k + 1}) \\right\\vert\n",
    "    &= \\left\\vert \\gamma t_{k + 1} - \\frac{1}{\\beta}\\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\gamma \\left( t_k - J(t_k)^{-1} F(t_k) \\right) - \\frac{1}{\\beta}\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\gamma \\left(\n",
    "           t_k -\n",
    "           \\left( \\frac{\\beta \\gamma t_k - 1}{\\beta} \\right)^{-1}\n",
    "               \\left(\n",
    "                 \\frac{\\beta \\gamma t_k^2 - 2 t_k + 2 \\eta}{2 \\beta}\n",
    "               \\right)\n",
    "         \\right) -\n",
    "         \\frac{1}{\\beta}\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\gamma t_k -\n",
    "         \\frac{\n",
    "           \\frac{\\beta \\gamma^2 t_k^2}{2} - \\gamma t_k + \\gamma \\eta\n",
    "         }{\\beta \\gamma t_k - 1} -\n",
    "         \\frac{1}{\\beta}\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\gamma t_k -\n",
    "         \\frac{\n",
    "           \\frac{\\gamma_\\text{Rel}^2 t_k^2}{2} -\n",
    "           \\gamma_\\text{Rel} t_k + \\alpha + \\gamma_\\text{Rel} t_k - 1\n",
    "         }{\\beta \\gamma_\\text{Rel} t_k - \\beta}\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\gamma t_k +\n",
    "         \\frac{\n",
    "           1 - \\alpha - \\frac{\\gamma_\\text{Rel}^2 t_k^2}{2}\n",
    "         }{\\beta \\gamma_\\text{Rel} t_k - \\beta}\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\gamma t_k +\n",
    "         \\frac{\n",
    "           1 - \\alpha -\n",
    "           \\left( \\beta^2 \\gamma t_k - \\beta \\right) \\frac{\\gamma t_k}{2} -\n",
    "           \\frac{\\beta \\gamma t_k}{2}\n",
    "         }{\\beta^2 \\gamma t_k - \\beta}\n",
    "    \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\frac{\\gamma t_k}{2} -\n",
    "         \\frac{\n",
    "           1 - \\alpha - \\frac{\\gamma_\\text{Rel} t_k}{2}\n",
    "         }{\\beta \\left( 1 - \\gamma_\\text{Rel} t_k \\right)}\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\frac{\\gamma t_k}{2} -\n",
    "         \\frac{1}{2 \\beta}\n",
    "             \\frac{\n",
    "               2 - 2 \\alpha - \\gamma_\\text{Rel} t_k\n",
    "             }{1 - \\gamma_\\text{Rel} t_k}\n",
    "       \\right\\vert\\\\\n",
    "    &\\geq \\left\\vert \\frac{\\gamma t_k}{2} - \\frac{1}{2 \\beta} \\right\\vert\\\\\n",
    "    &= \\left\\vert \\frac{f'(t_k)}{2} \\right\\vert.\n",
    "\n",
    "The last step can be proven via contradiction.  Assume\n",
    ":math:`0 \\leq \\alpha \\leq \\frac{1}{2}` and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert \\gamma_\\text{Rel} t_k - 1 \\right\\vert >\n",
    "   \\left\\vert\n",
    "     \\gamma_\\text{Rel} t_k -\n",
    "     \\frac{2 - 2 \\alpha - \\gamma_\\text{Rel} t_k}{1 - \\gamma_\\text{Rel} t_k}\n",
    "   \\right\\vert.\n",
    "\n",
    "Given :math:`x = \\gamma_\\text{Rel} t_k`,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert x - 1 \\right\\vert\n",
    "    &> \\left\\vert x - \\frac{2 - 2 \\alpha - x}{1 - x} \\right\\vert\\\\\n",
    "   \\left\\vert x - 1 \\right\\vert \\left\\vert 1 - x \\right\\vert\n",
    "    &> \\left\\vert x - x^2 - 2 + 2 \\alpha + x \\right\\vert\\\\\n",
    "   \\left\\vert x - 1 \\right\\vert^4\n",
    "    &> \\left\\vert x - x^2 - 2 + 2 \\alpha + x \\right\\vert^2\\\\\n",
    "   (4 \\alpha - 2) x^2 + (4 - 8 \\alpha) x - 4 \\alpha^2 + 8 \\alpha - 3 &> 0\\\\\n",
    "   (2 \\alpha - 1) (2x^2 - 4x + 3 - 2 \\alpha) &> 0.\n",
    "\n",
    "Solving for roots of the quadratic equation yields\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x =\n",
    "   \\frac{-(-4) \\pm \\sqrt{(-4)^2 - 4 (2) (3 - 2 \\alpha)}}{2 (2)} =\n",
    "   1 \\pm \\sqrt{\\alpha - \\frac{1}{2}}\n",
    "\n",
    "which means :math:`\\alpha > \\frac{1}{2}` must hold to yield a real solution, but\n",
    "this contradicts the assumption that :math:`\\alpha \\leq \\frac{1}{2}`.  Therefore\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert \\gamma_\\text{Rel} t_k - 1 \\right\\vert\n",
    "    \\leq \\left\\vert\n",
    "           \\gamma_\\text{Rel} t_k -\n",
    "           \\frac{\n",
    "             2 - 2 \\alpha - \\gamma_\\text{Rel} t_k\n",
    "           }{1 - \\gamma_\\text{Rel} t_k}\n",
    "         \\right\\vert\n",
    "   \\quad \\text{and} \\quad\n",
    "   \\left\\vert f'(t_{k + 1}) \\right\\vert\n",
    "    \\geq \\left\\vert 2^{-(k + 1)} \\beta^{-1} \\right\\vert.\n",
    "\n",
    "The upper bound :math:`\\left\\vert f'(t_{k + 1}) \\right\\vert \\leq\n",
    "\\left\\vert f'(t_k) \\right\\vert` can also be derived as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   2 \\left\\vert x - 1 \\right\\vert\n",
    "    &\\geq \\left\\vert x - \\frac{2 - 2 \\alpha - x}{1 - x} \\right\\vert\\\\\n",
    "   3x^4 - 12x^3 + (16 + 4 \\alpha)x^2 - (8 + 8 \\alpha)x +\n",
    "       8 \\alpha - 4 \\alpha^2 &\\geq 0\\\\\n",
    "   (x^2 - 2x + 2 \\alpha) (3x^2 - 6x + 4 - 2 \\alpha) &\\geq 0.\n",
    "\n",
    "Solving for roots of the quartic equation yields\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x = \\frac{-(-2) \\pm \\sqrt{(-2)^2 - 4 (1) (2 \\alpha)}}{2 (1)} =\n",
    "   1 \\pm \\sqrt{1 - 2 \\alpha}\n",
    "\n",
    "and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x = \\frac{-(-6) \\pm \\sqrt{(-6)^2 - 4 (3) (4 - 2 \\alpha)}}{2 (3)} =\n",
    "   1 \\pm \\sqrt{\\frac{2 \\alpha - 1}{3}}.\n",
    "\n",
    "Since :math:`0 \\leq \\alpha \\leq \\frac{1}{2}`, the second quadratic formula is\n",
    "inconsequential for real roots.  Therefore the sequence :math:`\\{ t_k \\}` is\n",
    "well-defined with :math:`N(t_0, r)` because\n",
    "\n",
    ".. math::\n",
    "\n",
    "   t_{k + 1} - t_* &= t_k - J(t_k)^{-1} F(t_k) - t_*\\\\\n",
    "    &= J(t_k)^{-1} \\left[ F(t_*) - F(t_k) - J(t_k) (t_* - t_k) \\right]\\\\\n",
    "   \\left\\vert t_{k + 1} - t_* \\right\\vert\n",
    "    &\\leq 2^k \\beta \\frac{\\gamma}{2} \\left\\vert t_* - t_k \\right\\vert^2\n",
    "          & \\quad & \\text{Lemma 4.1.12}\\\\\n",
    "    &= 2^{k - 1} \\beta \\gamma \\left\\vert t_k - t_* \\right\\vert^2\n",
    "       & \\quad & \\text{(3.1.1b)}\\\\\n",
    "    &\\leq 2^{k - 1} r^2 \\frac{\\eta}{\\alpha}.\n",
    "\n",
    "This inequality can be further simplified assuming :math:`r \\geq r_0`\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert t_1 - t_* \\right\\vert &\\leq 2^{-1} r^2 \\frac{\\eta}{\\alpha}\\\\\n",
    "    &\\leq 2^{-1} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)^2 \\frac{\\eta}{\\alpha}.\n",
    "\n",
    "Unrolling the recursion for :math:`t_k` yields\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert t_k - t_* \\right\\vert\n",
    "    &\\leq 2^{-k} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)^{2^k}\n",
    "          \\frac{\\eta}{\\alpha}\\\\\n",
    "    &\\leq (2 \\alpha)^{2^k} \\frac{\\eta}{\\alpha}.\n",
    "\n",
    "The last step can be proven via induction.  The base case when :math:`k = 0`\n",
    "gives :math:`1 - \\sqrt{1 - 2 \\alpha} \\leq 2 \\alpha`.  Let us prove this via\n",
    "contradiction: assume :math:`0 \\leq \\alpha \\leq \\frac{1}{2}` and\n",
    ":math:`1 - \\sqrt{1 - 2 \\alpha} > 2 \\alpha`.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   1 - \\sqrt{1 - 2 \\alpha} &> 2 \\alpha\\\\\n",
    "   \\left( 1 - 2 \\alpha \\right)^2 &> \\sqrt{1 - 2 \\alpha}^2\\\\\n",
    "   4 \\alpha^2 - 4 \\alpha + 1 &> 1 - 2 \\alpha\\\\\n",
    "   4 \\alpha^2 - 2 \\alpha > 0\\\\\n",
    "   \\alpha (2\\alpha - 1) > 0\n",
    "\n",
    "shows that :math:`\\alpha > \\frac{1}{2}` must hold to satisfy the inequality, but\n",
    "this contradicts the assumption that :math:`\\alpha \\leq \\frac{1}{2}`.  Thus the\n",
    "base case is true.\n",
    "\n",
    "For the inductive step, assume\n",
    ":math:`2^{-k} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)^{2^k} \\leq\n",
    "\\left( 2 \\alpha \\right)^{2^k}` for :math:`0 \\leq \\alpha \\leq \\frac{1}{2}`.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   2^{-(k + 1)} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)^{2^{k + 1}}\n",
    "    &\\leq \\left( 2 \\alpha \\right)^{2^{k + 1}}\\\\\n",
    "   -(k + 1) + 2^{k + 1} \\log_2 \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)\n",
    "    &\\leq 2^{k + 1} \\log_2 \\left( 2 \\alpha \\right)\\\\\n",
    "   \\frac{-(k + 1)}{2^{k + 1}} + \\log_2 \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)\n",
    "    &\\leq \\log_2 \\left( 2 \\alpha \\right)\\\\\n",
    "   2^{\\frac{-(k + 1)}{2^{k + 1}}} \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)\n",
    "    &\\leq 2 \\alpha\\\\\n",
    "   c \\left( 1 - \\sqrt{1 - 2 \\alpha} \\right)\n",
    "    &\\leq 2 \\alpha\n",
    "          & \\quad & c = 2^{\\frac{-(k + 1)}{2^{k + 1}}} > 0.\n",
    "\n",
    "By inspection, as :math:`k \\rightarrow \\infty`, :math:`c \\rightarrow 1`,\n",
    "which completes the inductive step."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. _dennis1996numerical-ex-5.7:\n",
    "\n",
    "Exercise 7\n",
    "==========\n",
    "\n",
    ":math:`\\left\\Vert J(x_n)^{-1} \\right\\Vert \\leq \\left\\vert f'(t_n)^{-1} \\right\\vert`\n",
    "-----------------------------------------------------------------------------------\n",
    "\n",
    "When :math:`n = 0`, :ref:`Exercise 6 <dennis1996numerical-ex-5.6>` shows that\n",
    ":math:`\\left\\vert f'(t_0)^{-1} \\right\\vert = \\beta` and :math:`F` is presumed to\n",
    "satisfy the assumptions of the Kantorovich theorem\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert J(x_0)^{-1} \\right\\Vert \\leq \\left\\vert f'(t_0)^{-1} \\right\\vert.\n",
    "\n",
    "For the inductive step :math:`n = k`, recall Theorem 3.14 states that\n",
    ":math:`J(x_k)` is nonsingular when\n",
    ":math:`\\left\\Vert J(x_0)^{-1} \\left( J(x_k) - J(x_0) \\right) \\right\\Vert < 1`\n",
    "holds.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert J(x_0)^{-1} \\left( J(x_k) - J(x_0) \\right) \\right\\Vert\n",
    "    &\\leq \\left\\Vert\n",
    "            J(x_0)^{-1} \\right\\Vert \\left\\Vert J(x_k) - J(x_0)\n",
    "          \\right\\Vert\\\\\n",
    "    &\\leq \\beta \\gamma r\n",
    "          & \\quad & J \\in \\text{Lip}_\\gamma(N(x_0, r > 0))\n",
    "\n",
    "Picking :math:`r = \\frac{1 - \\sqrt{1 - 2 \\alpha}}{\\beta \\gamma}` where\n",
    ":math:`0 \\leq \\alpha \\leq \\frac{1}{2}` satisfies Theorem 3.14.  Thus\n",
    ":math:`J(x_k)` is nonsingular and its induced matrix norm is bounded above by\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert J(x_k)^{-1} \\right\\Vert\n",
    "    &\\leq \\frac{\n",
    "            \\left\\Vert J(x_0)^{-1} \\right\\Vert\n",
    "          }{\n",
    "            1 -\n",
    "            \\left\\Vert J(x_0)^{-1} \\left( J(x_k) - J(x_0) \\right) \\right\\Vert\n",
    "          }\n",
    "          & \\quad & \\text{(3.1.20)}\\\\\n",
    "    &\\leq \\frac{\\left\\vert f'(t_0)^{-1} \\right\\vert}{ 1 - c}\n",
    "          & \\quad & 0 \\leq c = 1 -\n",
    "          \\left\\Vert\n",
    "            J(x_0)^{-1} \\left( J(x_k) - J(x_0) \\right)\n",
    "          \\right\\Vert < 1\\\\\n",
    "    &\\leq \\frac{\\left\\vert f'(t_k)^{-1} \\right\\vert}{1 - c}\\\\\n",
    "   (1 - c) \\left\\Vert J(x_k)^{-1} \\right\\Vert\n",
    "    &\\leq \\left\\vert f'(t_k)^{-1} \\right\\vert.\n",
    "\n",
    "The second to last inequality is explained in\n",
    ":ref:`Exercise 6 <dennis1996numerical-ex-5.6>`.\n",
    "\n",
    ":math:`\\left\\Vert x_{n + 1} - x_n \\right\\Vert \\leq \\left\\vert t_{n + 1} - t_n \\right\\vert`\n",
    "------------------------------------------------------------------------------------------\n",
    "\n",
    "When :math:`n = 0`, :ref:`Exercise 6 <dennis1996numerical-ex-5.6>` shows that\n",
    ":math:`\\left\\Vert J(t_0)^{-1} F(t_0) \\right\\Vert = \\eta` and :math:`F` is\n",
    "presumed to satisfy the assumptions of the Kantorovich theorem\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert x_1 - x_0 \\right\\Vert \\leq \\left\\vert t_1 - t_0 \\right\\vert.\n",
    "\n",
    "The following relations will be useful for the induction.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert F(x_k) \\right\\Vert\n",
    "    &= \\left\\Vert\n",
    "         F(x_k) - F(x_{k - 1}) - J(x_{k - 1}) (x_k - x_{k -  1})\n",
    "       \\right\\Vert\n",
    "       & \\quad & \\text{(4.1.10)}\\\\\n",
    "    &= \\left\\Vert F(x_{k - 1} + p) - F(x_{k - 1}) - J(x_{k - 1}) p \\right\\Vert\n",
    "       & \\quad & p = x_k - x_{k - 1}\\\\\n",
    "    &\\leq \\frac{\\gamma}{2} \\left\\Vert x_k - x_{k - 1} \\right\\Vert^2\n",
    "          & \\quad & \\text{Lemma 4.1.12}\n",
    "\n",
    "Applying the same analysis to\n",
    ":math:`f(t_k) = \\frac{\\gamma}{2} (t_k - t_{k - 1})^2` gives\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert f(t_k) \\right\\vert\n",
    "    &= \\left\\vert\n",
    "         f(t_k) - f(t_{k - 1}) - f'(t_{k - 1}) (t_k - t_{k - 1})\n",
    "       \\right\\vert\n",
    "       & \\quad & \\text{(4.1.10)}\\\\\n",
    "    &= \\left\\vert\n",
    "         \\frac{\\gamma}{2} t_k^2 - \\frac{t_k}{\\beta} + \\frac{\\eta}{\\beta} -\n",
    "         \\frac{\\gamma}{2} t_{k - 1}^2 + \\frac{t_{k - 1}}{\\beta} -\n",
    "         \\frac{\\eta}{\\beta} -\n",
    "         \\left( \\gamma t_{k - 1} - \\frac{1}{\\beta} \\right) (t_k - t_{k - 1})\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\frac{\\gamma}{2} t_k^2 - \\frac{t_k}{\\beta} -\n",
    "         \\frac{\\gamma}{2} t_{k - 1}^2 + \\frac{t_{k - 1}}{\\beta} -\n",
    "         \\gamma t_{k - 1} t_k + \\gamma t_{k - 1}^2 + \\frac{t_k}{\\beta} -\n",
    "         \\frac{t_{k - 1}}{\\beta}\n",
    "       \\right\\vert\\\\\n",
    "    &= \\left\\vert\n",
    "         \\frac{\\gamma}{2} t_k^2 - \\gamma t_k t_{k - 1} +\n",
    "         \\frac{\\gamma}{2} t_{k - 1}^2\n",
    "       \\right\\vert\\\\\n",
    "    &= \\frac{\\gamma}{2} \\left( t_k - t_{k - 1} \\right)^2.\n",
    "\n",
    "The inductive step for :math:`n = k` is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_{k + 1} &= x_k - J(x_k)^{-1} F(x_k)\\\\\n",
    "   \\left\\Vert x_{k + 1} - x_k \\right\\Vert\n",
    "    &= \\left\\Vert -J(x_k)^{-1} F(x_k) \\right\\Vert\\\\\n",
    "    &\\leq \\left\\vert f'(t_k)^{-1} \\right\\vert \\left\\Vert F(x_k) \\right\\Vert\\\\\n",
    "    &= \\left\\vert t_{k + 1} - t_k \\right\\vert\n",
    "       \\frac{\\left\\Vert F(x_k) \\right\\Vert}{\\left\\vert f(t_k) \\right\\vert}\n",
    "       & \\quad & \\left\\vert t_{k + 1} - t_k \\right\\vert =\n",
    "                 \\left\\vert f'(t_k)^{-1} f(t_k)\\right\\vert\\\\\n",
    "    &\\leq \\left\\vert t_{k + 1} - t_k \\right\\vert\n",
    "          & \\quad & \\text{ratio is less than one by the inductive step.}"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 8\n",
    "==========\n",
    "\n",
    "As stated in the section after Theorem 5.3.1,\n",
    ":ref:`Exercise 6 <dennis1996numerical-ex-5.6>` and\n",
    ":ref:`Exercise 7 <dennis1996numerical-ex-5.7>` shows that\n",
    ":math:`\\left\\Vert x_* - x_k \\right\\Vert` is bounded above by\n",
    ":math:`\\left\\vert t_* - t_k \\right\\vert`; hence it is a convergent sequence.\n",
    "Every convergent sequence is a Cauchy sequence, so :math:`\\{ x_k \\}` is a Cauchy\n",
    "sequence that will converge to some :math:`x_*`.  This technique of proof is\n",
    "known as majorization where :math:`\\{ t_k \\}` is said to majorize\n",
    ":math:`\\{ x_k \\}`.\n",
    "\n",
    "See :cite:`ortega1968newton` for a very concise and formal proof."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 9\n",
    "==========\n",
    "\n",
    "See :cite:`bryan1968approximate` for an elegant and concise proof that enables\n",
    "the evaluation of nonlinear integral equations using numerical quadrature."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. _dennis1996numerical-ex-5.10:\n",
    "\n",
    "Exercise 10\n",
    "===========\n",
    "\n",
    "The contractive mapping theorem is also known as the Banach fixed-point theorem\n",
    "or the contraction mapping principle.\n",
    "\n",
    "Given :math:`G: D \\rightarrow D` where :math:`D` is a closed subset of\n",
    ":math:`\\mathbb{R}^n`, for any starting point :math:`x_0 \\in D`, the sequence\n",
    ":math:`\\{ x_k \\}` generated by :math:`x_{k + 1} = G(x_k)` remains in :math:`D`\n",
    "for :math:`k = 0, 1, \\ldots`.\n",
    "\n",
    "(a)\n",
    "---\n",
    "\n",
    "Observe that recursively applying\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert x_{k + 1} - x_k \\right\\Vert =\n",
    "   \\left\\Vert G(x_k) - G(x_{k - 1}) \\right\\Vert \\leq\n",
    "   \\alpha \\left\\Vert x_k - x_{k - 1} \\right\\Vert\n",
    "\n",
    "yields\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert x_{k + 1} - x_k \\right\\Vert \\leq\n",
    "   \\alpha^k \\left\\Vert x_1 - x_0 \\right\\Vert\n",
    "\n",
    "where :math:`\\alpha \\in [0,1)`.\n",
    "\n",
    "For any :math:`j \\geq 0`,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\sum_{i = 0}^j \\left\\Vert x_{i + 1} - x_i \\right\\Vert\n",
    "    &\\leq \\left\\Vert x_1 - x_0 \\right\\Vert \\sum_{i = 0}^j \\alpha^i\\\\\n",
    "    &= \\left\\Vert x_1 - x_0 \\right\\Vert \\frac{1 - \\alpha^{j + 1}}{1 - \\alpha}\n",
    "       & \\quad & \\text{geometric series formula}\\\\\n",
    "    &\\leq \\frac{\\left\\Vert x_1 - x_0 \\right\\Vert}{1 - \\alpha}\n",
    "          & \\quad & \\text{as } j \\rightarrow \\infty.\n",
    "\n",
    "(b)\n",
    "---\n",
    "\n",
    "Recall that a sequence :math:`\\{ x_i \\}` is called a Cauchy sequence if\n",
    ":math:`\\forall \\epsilon > 0 \\, \\exists N \\in \\mathbb{N}` such that\n",
    ":math:`\\forall \\, m, n \\geq N`, then\n",
    ":math:`\\left\\Vert x_m - x_n \\right\\Vert < \\epsilon`.\n",
    "\n",
    "Without loss of generality, let :math:`m \\leq n` with :math:`k = n - m \\geq 0`.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert x_m - x_n \\right\\Vert\n",
    "    &= \\left\\Vert x_m - x_{m + k} \\right\\Vert\\\\\n",
    "    &= \\left\\Vert\n",
    "         (x_m - x_{m + 1}) + (x_{m + 1} - x_{m + 2}) + \\ldots +\n",
    "         (x_{m + k - 1} - x_{m + k})\n",
    "       \\right\\Vert\\\\\n",
    "    &\\leq \\left\\Vert x_{m + 1} - x_m \\right\\Vert +\n",
    "          \\left\\Vert x_{m + 2} - x_{m + 1} \\right\\Vert + \\ldots +\n",
    "          \\left\\Vert x_{m + k} - x_{m + k - 1} \\right\\Vert\n",
    "          & \\quad & \\text{triangle inequality (3.1.1b)}\\\\\n",
    "    &\\leq \\left\\Vert x_1 - x_0 \\right\\Vert \\sum_{i = m}^{m + k - 1} \\alpha^i\n",
    "          & \\quad & \\text{(a)}\\\\\n",
    "    &= \\left\\Vert x_1 - x_0 \\right\\Vert\n",
    "       \\frac{\\alpha^m - \\alpha^{m + k}}{1 - \\alpha}\n",
    "       & \\quad & \\text{geometric series formula}\\\\\n",
    "    &\\leq \\left\\Vert x_1 - x_0 \\right\\Vert \\frac{\\alpha^N}{1 - \\alpha}.\n",
    "\n",
    "Hence :math:`\\{ x_i \\}` is a Cauchy sequence as long as :math:`N` satisfies\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert x_1 - x_0 \\right\\Vert \\frac{\\alpha^N}{1 - \\alpha}\n",
    "    &< \\epsilon\\\\\n",
    "   \\alpha^N &< \\epsilon \\frac{1 - \\alpha}{\\left\\Vert x_1 - x_0 \\right\\Vert}\\\\\n",
    "   N &> \\log_{\\alpha}\\left(\n",
    "          \\epsilon \\frac{1 - \\alpha}{\\left\\Vert x_1 - x_0 \\right\\Vert}\n",
    "        \\right).\n",
    "\n",
    "The following observations will be useful in proving :math:`G` has a fixpoint\n",
    ":math:`x_*` where a fixed point (a.k.a. fixpoint, invariant point) of a function\n",
    "is defined as :math:`G(x_*) = x_*`.\n",
    "\n",
    "A function is said to be a contraction map if it satisfies the assumptions of\n",
    "the Contractive Mapping Theorem.  Notice that the inequality (5.3.2) is a\n",
    "restricted definition of Lipschitz continuity, which means a contraction map is\n",
    "continuous on :math:`D`.\n",
    "\n",
    "As shown in (a), :math:`\\{ x_k \\}` converges to some :math:`x_*` because\n",
    ":math:`\\lim_{k \\rightarrow \\infty} \\alpha^k = 0`.  Therefore,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\lim_{k \\rightarrow \\infty} x_{k + 1}\n",
    "    &= \\lim_{k \\rightarrow \\infty} G(x_k)\\\\\n",
    "    &= G(\\lim_{k \\rightarrow \\infty} x_k)\n",
    "       & \\quad & \\text{G is continuous}\\\\\n",
    "   x_* &= G(x_*).\n",
    "\n",
    "(c)\n",
    "---\n",
    "\n",
    "Proof by contradiction is the easily method to show that :math:`x_* \\in D` is\n",
    "unique.\n",
    "\n",
    "Assume that :math:`\\exists \\, y_* \\in D` such that :math:`y_* \\neq x_*`.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert G(x_*) - G(y_*) \\right\\Vert\n",
    "    &\\leq \\alpha \\left\\Vert x_* - y_* \\right\\Vert\n",
    "          & \\quad & \\text{(5.3.2)}\\\\\n",
    "   \\left\\Vert x_* - y_* \\right\\Vert\n",
    "    &\\leq \\alpha \\left\\Vert x_* - y_* \\right\\Vert\n",
    "          & \\quad & \\text{(b)}\\\\\n",
    "   (1 - \\alpha) \\left\\Vert x_* - y_* \\right\\Vert &\\leq 0.\n",
    "\n",
    "Since :math:`\\alpha \\in [0, 1)`, :math:`\\left\\Vert x_* - y_* \\right\\Vert = 0`,\n",
    "but this contradicts the assumption :math:`y_* \\neq x_*`.  Thus a contraction\n",
    "map has a unique fixed point.\n",
    "\n",
    "By definition\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert x_{k + 1} - x_* \\right\\Vert =\n",
    "   \\left\\Vert G(x_k) - G(x_*) \\right\\Vert \\leq\n",
    "   \\alpha \\left\\Vert x_k - x_* \\right\\Vert,\n",
    "\n",
    ":math:`\\{ x_k \\}` converges q-linearly to :math:`x_*`.\n",
    "\n",
    "By applying the results of (b) with :math:`m = k` as\n",
    ":math:`n \\rightarrow \\infty`, the error bound can be derived as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\lim_{n \\rightarrow \\infty} \\left\\Vert x_k - x_n \\right\\Vert\n",
    "    &\\leq \\lim_{n \\rightarrow \\infty} \\left\\Vert x_1 - x_0 \\right\\Vert\n",
    "          \\frac{\\alpha^k - \\alpha^{n}}{1 - \\alpha}\\\\\n",
    "   \\left\\Vert x_k - x_* \\right\\Vert\n",
    "    &\\leq \\frac{\\eta \\alpha^k}{1 - \\alpha}\n",
    "          & \\quad & \\left\\Vert x_1 - x_0 \\right\\Vert \\leq \\eta.\n",
    "\n",
    "The last inequality holds because the norm's continuity enables moving the\n",
    "limit inside.\n",
    "\n",
    "(d)\n",
    "---\n",
    "\n",
    "As shown in :cite:`petersd466fp`, the other version of the Contractive Mapping\n",
    "Theorem no longer assumes a self-map.\n",
    "\n",
    "Let :math:`G \\colon D \\rightarrow \\mathbb{R}^n` where :math:`D` is a closed\n",
    "subset of :math:`\\mathbb{R}^n`.  If for some norm\n",
    ":math:`\\left\\Vert \\cdot \\right\\Vert`, there exists :math:`\\alpha \\in [0, 1)`\n",
    "such that\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert G(x) - G(y) \\right\\Vert \\leq \\alpha \\left\\Vert x - y \\right\\Vert\n",
    "   \\qquad \\forall \\, x, y \\in D,\n",
    "\n",
    "then the rest of the Contractive Mapping Theorem still holds.  The goal is to\n",
    "show the sequence :math:`\\{ x_k \\}` generated by :math:`x_{k + 1} = G(x_k)`\n",
    "remains in :math:`D` for :math:`k = 0, 1, \\ldots` with :math:`x_0 \\in D`.\n",
    "\n",
    "Since :math:`D` is a closed subset and the results of (c) defines a closed\n",
    "neighborhood of radius :math:`\\frac{\\eta \\alpha^k}{1 - \\alpha}` around\n",
    ":math:`x_*`, as long as each subsequent element in the sequence are within that\n",
    "iteration's specified neighborhood, the original assumptions of the Contractive\n",
    "Mapping Theorem are satisfied and the sequence will converge to :math:`x_*`."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 11\n",
    "===========\n",
    "\n",
    "Define :math:`x_{i + 1} = G(x_i) = x_i - \\frac{f(x_i)}{a}` where\n",
    ":math:`f(x) = x^2 - 1` and :math:`a > 1`.  By inspection, the roots of :math:`f`\n",
    "are at :math:`x = \\pm 1` and :math:`G \\colon \\mathbb{R} \\rightarrow \\mathbb{R}`,\n",
    "but :math:`x_i` will not converge to the roots\n",
    ":math:`\\forall x_0 \\in \\mathbb{R}`.  The goal is to find\n",
    ":math:`D \\in \\mathbb{R}` such that :math:`x_0 \\in D` will converge to\n",
    ":math:`x_* = 1` and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert G(x_0) - G(x_*) \\right\\Vert\n",
    "    &= \\left\\vert x_0 - \\frac{f(x_0)}{a} - x_*\\right\\vert\\\\\n",
    "    &= \\left\\vert x_0 - x_* - \\frac{x_0^2 - 1}{a} \\right\\vert\\\\\n",
    "    &= \\left\\vert x_0 - x_* \\right\\vert\n",
    "       \\left\\vert 1 - \\frac{x_0 + x_*}{a} \\right\\vert\n",
    "       & \\quad & x_* = 1\\\\\n",
    "    &< \\left\\vert x_0 - x_* \\right\\vert.\n",
    "\n",
    "In order to establish the last inequality, :math:`x_0` needs to satisfy\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert 1 - \\frac{x_0 + x_*}{a} \\right\\vert &< 1\\\\\n",
    "   1 - \\frac{2 (x_0 + x_*)}{a} + \\frac{(x_0 + x_*)^2}{a^2} &< 1\\\\\n",
    "   0 &< -(x_0 + x_*)^2 + 2a (x_0 + x_*)\\\\\n",
    "   0 &< -x_0^2 + (2a - 2x_*) x_0 + 2ax_* - x_*^2.\n",
    "\n",
    "Solving the quadratic equation for the :math:`x_0` yields\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_0 =\n",
    "   \\frac{\n",
    "     -(2a - 2x_*) \\pm \\sqrt{(2a - 2x_*)^2 - 4 (-1) (2ax_* - x_*^2)}\n",
    "   }{2 (-1)} =\n",
    "   a - x_* \\pm (-a).\n",
    "\n",
    "Hence the inequality will be satisfied i.e. :math:`\\exists \\, \\alpha \\in [0, 1)`\n",
    "such that :math:`\\left\\Vert G(x_0) - G(x_*) \\right\\Vert\n",
    "\\leq \\alpha \\left\\vert x_0 - x_* \\right\\vert` when\n",
    ":math:`x_0 \\in (-x_*, 2a - x_*)`.\n",
    "\n",
    "To show :math:`G \\colon D \\rightarrow D` holds by principle of induction, assume\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert x_i - x_* \\right\\Vert =\n",
    "   \\left\\Vert G(x_{i - 1}) - G(x_*) \\right\\Vert \\leq\n",
    "   \\alpha \\left\\vert x_{i - 1} - x_* \\right\\vert.\n",
    "\n",
    "and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert G(x_i) - G(x_*) \\right\\Vert\n",
    "    &= \\left\\vert x_i - \\frac{f(x_i)}{a} - x_* \\right\\vert\\\\\n",
    "    &= \\left\\vert x_i - x_* - \\frac{(x_i + x_*) (x_i - x_*)}{a} \\right\\vert\n",
    "       & \\quad & x_* = 1\\\\\n",
    "    &= \\left\\vert x_i - x_* \\right\\vert\n",
    "       \\left\\vert 1 - \\frac{(x_i + x_*)}{a} \\right\\vert\\\\\n",
    "    &< \\left\\vert x_i - x_* \\right\\vert.\n",
    "\n",
    "Consequently,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert 1 - \\frac{(x_i + x_*)}{a} \\right\\vert &< 1\\\\\n",
    "   1 - \\frac{2 (x_i + x_*)}{a} + \\frac{(x_i + x_*)^2}{a^2} &< 1\\\\\n",
    "   0 &< -(x_i + x_*)^2 + 2a (x_i + x_*)\\\\\n",
    "   0 &< -x_i^2 + (2a - 2x_*) x_i + 2ax_* - x_*^2\\\\\n",
    "   0 &< -\\left( x_{i - 1} - \\frac{x_{i - 1}^2 - 1}{a} \\right)^2 +\n",
    "        (2a - 2x_*) \\left( x_{i - 1} - \\frac{x_{i - 1}^2 - 1}{a} \\right) +\n",
    "        2ax_* - x_*^2\\\\\n",
    "   0 &< -\\frac{\\left( -ax_{i - 1} - ax_* + x_{i - 1}^2 - 1 \\right)\n",
    "        \\left( 2a^2 - ax_{i - 1} - ax_* + x_{i - 1}^2 - 1 \\right)}{a^2}\\\\\n",
    "   0 &< \\frac{ \\left( x_{i - 1} + 1 \\right) \\left( a - x_{i - 1} + 1 \\right)\n",
    "        \\left( 2a^2 - ax_{i - 1} - a + x_{i - 1}^2 - 1 \\right)}{a^2}\n",
    "        & \\quad & x_* = 1.\n",
    "\n",
    "Since :math:`a > 1`, :math:`\\left\\Vert G(x_i) - G(x_*) \\right\\Vert\n",
    "\\leq \\alpha \\left\\vert x_i - x_* \\right\\vert` when\n",
    ":math:`x_{i - 1} \\in (-1, a + 1)`.\n",
    "\n",
    "Thus defining :math:`\\delta = \\min\\left\\{ 2, a + 1, 2a - 1 \\right\\}` and\n",
    "applying the results of :ref:`Exercise 10d <dennis1996numerical-ex-5.10>` proves\n",
    "that the sequence :math:`\\{ x_i \\}` converges q-linearly to :math:`x_* = 1` with\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\lim_{i \\rightarrow \\infty}\n",
    "     \\left\\vert \\frac{x_{i + 1} - x_*}{x_i - x_*} \\right\\vert\n",
    "    &= \\lim_{i \\rightarrow \\infty}\n",
    "         \\left\\vert\n",
    "           \\frac{x_i - \\frac{x_i^2 - 1}{a} - x_*}{x_i - x_*}\n",
    "         \\right\\vert\\\\\n",
    "    &= \\lim_{i \\rightarrow \\infty}\n",
    "         \\left\\vert 1 - \\frac{x_i^2 - 1}{a (x_i - x_*)} \\right\\vert\\\\\n",
    "    &= \\lim_{i \\rightarrow \\infty} \\left\\vert 1 - \\frac{x_i + 1}{a} \\right\\vert\n",
    "       & \\quad &  x_* = 1\\\\\n",
    "    &= \\left\\vert 1 - \\lim_{i \\rightarrow \\infty} \\frac{x_i + 1}{a} \\right\\vert\n",
    "       & \\quad & \\text{absolute value is continuous}\\\\\n",
    "    &= \\left\\vert 1 - \\frac{2}{a} \\right\\vert\\\\\n",
    "    &> 0 & \\quad & \\text{unless } a = 2."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 12\n",
    "===========\n",
    "\n",
    "If :math:`\\lim_{k \\rightarrow \\infty} h_k = 0`, then the convergence is\n",
    "q-superlinear.\n",
    "\n",
    "The proof is similar to the one in the text up to applying Lemma 4.2.1.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\Vert x_k - x_* \\right\\Vert\n",
    "    &\\leq \\left\\Vert A^{-1}_k \\right\\Vert\n",
    "          \\left(\n",
    "            \\left\\Vert F(x_*) - F(x_k) - J(x_k) (x_* - x_k) \\right\\Vert +\n",
    "            \\left\\Vert A_k - J(x_k) \\right\\Vert \\left\\Vert x_* - x_k \\right\\Vert\n",
    "          \\right)\\\\\n",
    "    &\\leq 2 \\beta\n",
    "          \\left(\n",
    "            \\frac{\\gamma}{2} \\left\\Vert x_* - x_k \\right\\Vert^2 +\n",
    "            \\frac{\\gamma \\left\\vert h_k \\right\\vert}{2}\n",
    "                \\left\\Vert x_* - x_k \\right\\Vert\n",
    "          \\right)\n",
    "          & \\quad & \\text{Lemma 4.2.1}\\\\\n",
    "    &= c_k \\left\\Vert x_* - x_k \\right\\Vert\n",
    "       & \\quad & c_k = \\beta \\gamma \\left( \\left\\Vert x_* - x_k \\right\\Vert +\n",
    "                       \\left\\vert h_k \\right\\vert \\right).\n",
    "\n",
    "Since :math:`\\lim_{k \\rightarrow \\infty} x_k = x_*` and by assumption\n",
    ":math:`\\lim_{k \\rightarrow \\infty} h_k = 0`,\n",
    ":math:`\\lim_{k \\rightarrow \\infty} c_k = 0` i.e.\n",
    ":math:`\\{ x_k \\}` converges q-superlinearly.\n",
    "\n",
    "Notice that the expressions (5.4.2) and (5.4.3) are equivalent via Lemma 4.1.16:\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left\\vert h_k \\right\\vert\n",
    "    &\\leq c_2 \\left\\Vert F(x_k) \\right\\Vert\n",
    "           & \\quad & \\text{(5.4.3)}\\\\\n",
    "    &= c_2 \\left\\Vert F(x_k) - F(x_*) \\right\\Vert\n",
    "       & \\quad & F(x_*) = 0\\\\\n",
    "    &\\leq c_2 \\beta \\left\\Vert x_k - x_* \\right\\Vert\n",
    "          & \\quad & 4.1.16\\\\\n",
    "    &= c_1 \\left\\Vert x_k - x_* \\right\\Vert\n",
    "       & \\quad & c_1 = c_2 \\beta.\n",
    "\n",
    "To show q-quadratic convergence, the proof consists of applying the assumption\n",
    "that either (5.4.2) or (5.4.3) holds."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 13\n",
    "===========\n",
    "\n",
    "Given :math:`x_k = \\begin{pmatrix} 10^7 & 10^{-7} \\end{pmatrix}^\\top`,\n",
    "the analytical solution for :math:`J(x_k)` is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   J(x_k) = \\nabla F(x_k)^\\top =\n",
    "   \\begin{bmatrix}\n",
    "     2 x_1 & 0\\\\\n",
    "     0 & 2 x_2\n",
    "   \\end{bmatrix}.\n",
    "\n",
    "Approximating :math:`J(x_k)` with (5.4.1) gives\n",
    "\n",
    ".. math::\n",
    "\n",
    "   A(x_k) =\n",
    "   \\frac{1}{h_k}\n",
    "     \\begin{bmatrix}\n",
    "       \\left( x_1 + h_k \\right)^2 - \\left( x_1 \\right)^2 &\n",
    "           \\left( x_1 \\right)^2 - \\left( x_1 \\right)^2\\\\\n",
    "       \\left( x_2 \\right)^2 - \\left( x_2 \\right)^2 &\n",
    "           \\left( x_2 + h_k \\right)^2 - \\left( x_2 \\right)^2\n",
    "     \\end{bmatrix} =\n",
    "   \\begin{bmatrix}\n",
    "     2 x_1 + h_k & 0\\\\\n",
    "     0 & 2 x_2 + h_k\n",
    "   \\end{bmatrix}.\n",
    "\n",
    "When :math:`h_k = 1`, the approximation\n",
    "\n",
    ".. math::\n",
    "\n",
    "   A(x_k) = \\begin{bmatrix} 2 x_1 + 1 & 0\\\\ 0 & 2 x_2 + 1 \\end{bmatrix}\n",
    "\n",
    "has a relative error of\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\n",
    "     \\left\\vert 2 x_1 - \\left( 2 x_1 + 1 \\right) \\right\\vert\n",
    "   }{\\left\\vert 2 x_1 \\right\\vert} =\n",
    "   \\frac{1}{2} 10^{-7}\n",
    "   \\quad \\text{and} \\quad\n",
    "   \\frac{\n",
    "     \\left\\vert 2 x_2 - \\left( 2 x_2 + 1 \\right) \\right\\vert\n",
    "   }{\\left\\vert 2 x_2 \\right\\vert} =\n",
    "   \\frac{1}{2} 10^7.\n",
    "\n",
    "When :math:`h_k = 10^{-14}`, the approximation\n",
    "\n",
    ".. math::\n",
    "\n",
    "   A(x_k) =\n",
    "   \\begin{bmatrix} 2 x_1 + 10^{-14} & 0\\\\ 0 & 2 x_2 + 10^{-14} \\end{bmatrix}\n",
    "\n",
    "has a relative error of\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\n",
    "     \\left\\vert 2 x_1 - \\left( 2 x_1 + 10^{-14} \\right) \\right\\vert\n",
    "   }{\\left\\vert 2 x_1 \\right\\vert} =\n",
    "   \\frac{1}{2} 10^{-21}\n",
    "   \\quad \\text{and} \\quad\n",
    "   \\frac{\n",
    "     \\left\\vert 2 x_2 - \\left( 2 x_2 + 10^{-14} \\right) \\right\\vert\n",
    "   }{\\left\\vert 2 x_2 \\right\\vert} =\n",
    "   \\frac{1}{2} 10^{-7}.\n",
    "\n",
    "Recall that a computer with :math:`14` base-:math:`10` digits has a machine\n",
    "epsilon :math:`\\eta \\approx 10^{-14}`.  The first approximation incorrectly\n",
    "computed :math:`J(x_k)_{22}` while the second approximation will experience\n",
    "underflow during the evaluation of :math:`A(x_k)_{22}`.  Setting\n",
    "\n",
    ".. math::\n",
    "\n",
    "   h_k =\n",
    "   \\text{sign}(x_k) \\sqrt{\\eta} \\max\\{ \\vert x_k \\vert, \\text{typ}x_k \\}\n",
    "\n",
    "is a good compromise."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 14\n",
    "===========\n",
    "\n",
    "Let :math:`p \\in \\mathbb{R}, p \\neq 0`.  The step size proposed in (5.4.10) is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   h_j = \\sqrt{\\eta} \\max\\{ |x_j|, \\text{typ}x_j \\} \\text{sign}(x_j)\n",
    "\n",
    "(a)\n",
    "---\n",
    "\n",
    "Let :math:`\\hat{F} = f_1(x) = x^p` and\n",
    ":math:`\\gamma = f_1''(x) = p (p - 1) x^p`.  The optimal step size given in\n",
    "(5.4.13) is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   h_* =\n",
    "   \\left( \\frac{4 (\\eta + \\epsilon)}{\\gamma} \\hat{F} \\right)^{1/2} =\n",
    "   \\left( \\frac{4 (\\eta + \\epsilon)}{p^2 - p} \\right)^{1/2} =\n",
    "   \\frac{2}{\\sqrt{p^2 - p}} \\sqrt{\\eta + \\epsilon}.\n",
    "\n",
    "(b)\n",
    "---\n",
    "\n",
    "Let :math:`\\hat{F} = f_2(x) = e^{px}` and\n",
    ":math:`\\gamma = f_2''(x) = p^2 e^{px}`.  The optimal step size given in\n",
    "(5.4.13) is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   h_* =\n",
    "   \\left( \\frac{4 (\\eta + \\epsilon)}{\\gamma} \\hat{F} \\right)^{1/2} =\n",
    "   \\left( \\frac{4 (\\eta + \\epsilon)}{p^2} \\right)^{1/2} =\n",
    "   \\frac{2}{p} \\sqrt{\\eta + \\epsilon}.\n",
    "\n",
    "(c)\n",
    "---\n",
    "\n",
    "Both step sizes are scale independent, which is not ideal.\n",
    "\n",
    "(d)\n",
    "---\n",
    "\n",
    "The forward finite-difference approximation to :math:`f_2'(x) = p e^{px}` is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{f_2(x + h) - f_2(x)}{h} = e^{px} \\frac{e^{ph} - 1}{h}\n",
    "\n",
    "with a relative error of\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\n",
    "     \\left\\vert\n",
    "       p e^{px} - \\left( e^{px} \\frac{e^{ph} - 1}{h} \\right)\n",
    "     \\right\\vert\n",
    "   }{\n",
    "     \\left\\vert p e^{px} \\right\\vert\n",
    "   } =\n",
    "   \\left\\vert 1 - \\frac{e^{ph} - 1}{ph} \\right\\vert.\n",
    "\n",
    "Good step sizes will tend to make :math:`e^{ph} \\rightarrow 1`, so the quantity\n",
    "will either be :math:`0` or a very small constant compared to :math:`e^{px}`.\n",
    "This means when :math:`x` is a very large positive number, either overflow\n",
    "occurs or the approximation will be off as described by the relative error."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 15\n",
    "===========\n",
    "\n",
    "Lemma 4.3.1 asserts that a minimizer of\n",
    "\n",
    ".. math::\n",
    "\n",
    "   f(x) = f(x_1, x_2) = \\frac{1}{2} (x_1^2 - x_2)^2 + \\frac{1}{2} (1 - x_1)^2\n",
    "\n",
    "needs to satisfy\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\nabla f(x) =\n",
    "   \\begin{bmatrix}\n",
    "     2 x_1^3 - 2 x_1 x_2 + x_1 - 1\\\\\n",
    "     x_2 - x_1^2\n",
    "   \\end{bmatrix} = 0.\n",
    "\n",
    ":math:`x = [1, 1]` is one such minimizer.  By definition 4.1.4, the Hessian is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\nabla^2 f(x) =\n",
    "   \\begin{bmatrix} 6 x_1^2 - 2 x_2 + 1 & -2 x_1\\\\ -2 x_1 & 1 \\end{bmatrix} = 0.\n",
    "\n",
    "Based on the results of :math:`f(x_0)` and :math:`f(x_1)`, it does seem like a\n",
    "good step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy\n",
    "\n",
    "def f(v):\n",
    "    x = v.flat\n",
    "    return 0.5 * (x[0]**2 - x[1])**2 + 0.5 * (1 - x[0])**2\n",
    "\n",
    "def fp(v):\n",
    "    x = v.flat\n",
    "    return numpy.asmatrix([2 * x[0]**3 - 2 * x[0] * x[1] + x[0] - 1,\n",
    "                           x[1] - x[0]**2]).T\n",
    "\n",
    "def fpp(v):\n",
    "    x = v.flat\n",
    "    return numpy.asmatrix([[6 * x[0]**2 - 2 * x[1] + 1, -2 * x[0]],\n",
    "                           [-2 * x[0], 1]])\n",
    "\n",
    "x_k = numpy.asmatrix([2, 2]).T\n",
    "for i in range(2):\n",
    "    print('x_{0}: {1}'.format(i, f(x_k)))\n",
    "    _ = x_k - numpy.linalg.inv(fpp(x_k)) * fp(x_k)\n",
    "    x_k = _"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 16\n",
    "===========\n",
    "\n",
    "Given :math:`f(x) = \\sin(x)`, :math:`f'(x) = \\cos(x)` and\n",
    ":math:`f''(x) = -\\sin(x)`.\n",
    "\n",
    "Applying Newton's method yields\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_1 = x_0 - \\frac{\\cos(x_0)}{-sin(x_0)} = x_0 + \\frac{1}{\\tan(x_0)}.\n",
    "\n",
    "Define :math:`0 < \\epsilon \\ll 1`.  When :math:`x_0 = -\\epsilon`,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_1 &= -\\epsilon + \\frac{1}{\\tan(-\\epsilon)}\\\\\n",
    "    &\\cong -\\epsilon - \\frac{1}{\\epsilon}\n",
    "           & \\quad & \\text{small-angle approximation of tangent}\\\\\n",
    "    &= -\\frac{\\epsilon^2 + 1}{\\epsilon}\\\\\n",
    "    &\\cong -\\frac{1}{\\epsilon}.\n",
    "\n",
    "When :math:`x_0 = \\epsilon`,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_1 &= \\epsilon - \\frac{1}{\\tan(\\epsilon)}\\\\\n",
    "    &\\cong \\epsilon - \\frac{1}{\\epsilon}\n",
    "           & \\quad & \\text{small-angle approximation of tangent}\\\\\n",
    "    &= \\frac{\\epsilon^2 - 1}{\\epsilon}\\\\\n",
    "    &\\cong -\\frac{1}{\\epsilon}."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17\n",
    "===========\n",
    "\n",
    "(a)\n",
    "---\n",
    "\n",
    "The gradient and Hessian of :math:`f_1(x) = -x_1^2 - x_2^2` are respectively\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\nabla f_1(x) = \\begin{bmatrix} -2 x_1\\\\ -2 x_2 \\end{bmatrix}\n",
    "   \\quad \\text{and} \\quad\n",
    "   \\nabla^2 f_1(x) = \\begin{bmatrix} -2 & 0\\\\ 0 & -2 \\end{bmatrix}.\n",
    "\n",
    "Since :math:`\\nabla^2 f_1(x)` has an eigenvalue :math:`\\lambda = -2` of\n",
    "multiplicity :math:`2`, it is not positive definite.  This means\n",
    "\n",
    ".. math::\n",
    "\n",
    "   H_k = \\nabla^2 f_1(x) + \\mu_k I = (\\mu_k - 2) I\n",
    "   \\quad \\Rightarrow \\quad\n",
    "   H_k^{-1} = \\frac{1}{\\mu_k - 2} I\n",
    "\n",
    "and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_{k + 1} = x_k - H_k^{-1}(x_k) \\nabla f_1(x_k) =\n",
    "   \\begin{bmatrix}\n",
    "     x_{k_1} + x_{k_1} \\left( \\frac{\\mu_k}{2} - 1 \\right)^{-1}\\\\\n",
    "     x_{k_2} + x_{k_2} \\left( \\frac{\\mu_k}{2} - 1 \\right)^{-1}\n",
    "   \\end{bmatrix}.\n",
    "\n",
    "From the assumption of algorithm A5.5.1, the denominator is positive.  Thus\n",
    "the algorithm will grow towards to :math:`\\pm \\infty` depending on\n",
    ":math:`\\text{sgn}(x_0)` unless :math:`x_0 = x_*`.\n",
    "\n",
    "(b)\n",
    "---\n",
    "\n",
    "The gradient and Hessian of :math:`f_2(x) = x_1^2 - x_2^2` are respectively\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\nabla f_2(x) = \\begin{bmatrix} 2 x_1\\\\ -2 x_2 \\end{bmatrix}\n",
    "   \\quad \\text{and} \\quad\n",
    "   \\nabla^2 f_2(x) = \\begin{bmatrix} 2 & 0\\\\ 0 & -2 \\end{bmatrix}.\n",
    "\n",
    "Since the eigenvalues of :math:`\\nabla^2 f_2(x)` are :math:`\\lambda_0 = 2` and\n",
    ":math:`\\lambda_1 = -2`, it is not positive definite.  This means\n",
    "\n",
    ".. math::\n",
    "\n",
    "   H_k = \\nabla^2 f_2(x) + \\mu_k I\n",
    "   \\quad \\Rightarrow \\quad\n",
    "   H_k^{-1} =\n",
    "   \\begin{bmatrix}\n",
    "     \\frac{1}{\\mu_k + 2} & 0\\\\\n",
    "     0 & \\frac{1}{\\mu_k -2}\n",
    "   \\end{bmatrix}\n",
    "\n",
    "and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_{k + 1} = x_k - H_k^{-1}(x_k) \\nabla f_1(x_k) =\n",
    "   \\begin{bmatrix}\n",
    "     x_{k_1} - x_{k_1} \\left( \\frac{\\mu_k}{2} + 1 \\right)^{-1}\\\\\n",
    "     x_{k_2} + x_{k_2} \\left( \\frac{\\mu_k}{2} - 1 \\right)^{-1}\n",
    "   \\end{bmatrix}\n",
    "\n",
    "From the assumption of algorithm A5.5.1, the denominator is positive.  Thus\n",
    "the algorithm will grow towards to :math:`\\pm \\infty` depending on\n",
    ":math:`\\text{sgn}(x_0)` unless :math:`x_{0_2} = 0`."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 18\n",
    "==========="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy\n",
    "\n",
    "def f(x):\n",
    "    return 0.5 * (x[0]**2 - x[1])**2 + 0.5 * (1 - x[0])**2\n",
    "\n",
    "def fp(v):\n",
    "    x = v.flat\n",
    "    return numpy.asmatrix([2 * x[0]**3 - 2 * x[0] * x[1] + x[0] - 1,\n",
    "                           x[1] - x[0]**2]).T\n",
    "\n",
    "def fpp(v):\n",
    "    x = v.flat\n",
    "    return numpy.asmatrix([[6 * x[0]**2 - 2 * x[1] + 1, -2 * x[0]],\n",
    "                           [-2 * x[0], 1]])\n",
    "\n",
    "def afpp(v, root_eta):\n",
    "    x_c = v.flat\n",
    "    h = root_eta * x_c\n",
    "    e = [numpy.asfarray([1, 0]), numpy.asfarray([0, 1])]\n",
    "\n",
    "    A = numpy.asmatrix([[0.0, 0.0], [0.0, 0.0]])\n",
    "    for i in range(2):\n",
    "        for j in range(2):\n",
    "            A[i,j] = f(x_c + h[i] * e[i] + h[j] * e[j])\n",
    "            A[i,j] -= f(x_c + h[i] * e[i])\n",
    "            A[i,j] -= f(x_c + h[j] * e[j])\n",
    "            A[i,j] += f(x_c)\n",
    "            A[i,j] /= h[i] * h[j]\n",
    "    return A\n",
    "\n",
    "x_k = numpy.asmatrix([2.0, 2.0]).T\n",
    "\n",
    "solutions = [[], [], []]\n",
    "\n",
    "for s in range(3):\n",
    "    x_k = numpy.asmatrix([2.0, 2.0]).T\n",
    "    for i in range(8):\n",
    "        if 0 == s:\n",
    "            A = fpp(x_k)\n",
    "        elif 1 == s:\n",
    "            A = afpp(x_k, numpy.finfo(float).eps**(1/2))\n",
    "        else:\n",
    "            A = afpp(x_k, numpy.finfo(float).eps**(1/3))\n",
    "        _ = x_k - numpy.linalg.inv(A) * fp(x_k)\n",
    "        x_k = _\n",
    "        solutions[s].append(f(x_k).flat[0])\n",
    "\n",
    "row_layout = '{0:<25} {1:<25} {2:<25}'\n",
    "print(row_layout.format('Analytic', 'macheps^(1/2)', 'macheps^(1/3)'))\n",
    "for i in range(len(solutions[0])):\n",
    "    print(row_layout.format(solutions[0][i], solutions[1][i], solutions[2][i]))"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 19\n",
    "===========\n",
    "\n",
    "Recall that the Hessian of :math:`f \\colon \\mathbb{R}^n \\rightarrow \\mathbb{R}`\n",
    "is equal to the Jacobian of :math:`\\nabla f`.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   x_+ - x_* &= x_c - A_c^{-1} g_c - x_*\\\\\n",
    "    &= A_c^{-1} [ A_c (x_c - x_*) - g_c ]\\\\\n",
    "    &= A_c^{-1}\n",
    "       \\left[\n",
    "         \\left( A_c - \\nabla^2 f(x_c) \\right) (x_c - x_*) - g_c +\n",
    "         \\nabla^2 f(x_c) (x_c - x_*)\n",
    "       \\right]\\\\\n",
    "    &= A_c^{-1}\n",
    "       \\left[\n",
    "         \\left( A_c - \\nabla^2 f(x_c) \\right) (x_c - x_*) -\n",
    "         (g_c - \\nabla f(x_c)) + \\nabla f(x_*) - \\nabla f(x_c) -\n",
    "         \\nabla^2 f(x_c) (x_* - x_c)\n",
    "       \\right]\\\\\n",
    "   \\left\\Vert x_+ - x_* \\right\\Vert\n",
    "    &= \\left\\Vert A_c^{-1}\n",
    "       \\left[\n",
    "         \\left( A_c - \\nabla^2 f(x_c) \\right) (x_c - x_*) -\n",
    "         (g_c - \\nabla f(x_c)) + \\nabla f(x_*) - \\nabla f(x_c) -\n",
    "         \\nabla^2 f(x_c) (x_* - x_c)\n",
    "       \\right] \\right\\Vert\\\\\n",
    "   \\left\\Vert x_+ - x_* \\right\\Vert\n",
    "    &\\leq \\left\\Vert A_c^{-1} \\right\\Vert\n",
    "          \\left[\n",
    "            \\left\\Vert A_c - \\nabla^2 f(x_c) \\right\\Vert\n",
    "            \\left\\Vert x_c - x_* \\right\\Vert +\n",
    "            \\left\\Vert -(g_c - \\nabla f(x_c)) \\right\\Vert +\n",
    "            \\left\\Vert\n",
    "              \\nabla f(x_*) - \\nabla f(x_c) - \\nabla^2 f(x_c) (x_* - x_c)\n",
    "            \\right\\Vert\n",
    "          \\right]\\\\\n",
    "   \\left\\Vert x_+ - x_* \\right\\Vert\n",
    "    &\\leq \\left\\Vert A_c^{-1} \\right\\Vert\n",
    "          \\left[\n",
    "            \\left\\Vert g_c - \\nabla f(x_c) \\right\\Vert +\n",
    "            \\left\\Vert A_c - \\nabla^2 f(x_c) \\right\\Vert\n",
    "                \\left\\Vert e_c \\right\\Vert +\n",
    "            \\frac{\\gamma}{2} \\left\\Vert e_c^2 \\right\\Vert\n",
    "          \\right]\n",
    "          & \\quad & \\text{Lemma 4.1.12.}"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 20\n",
    "===========\n",
    "\n",
    "A5.6.3, A5.6.4, and A5.6.2 uses :math:`n + 1`, :math:`2n`, and\n",
    ":math:`1 + 2n + n^2 / 2` function evaluations respectively.  The overlap between\n",
    "each of these algorithms is at most :math:`n` function evaluations.  Therefore,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\text{A5.6.3} + \\text{A5.6.2} =\n",
    "   (n + 1) + (1 + 2n + n^2 / 2) - n = 2 + 2n + n^2 / 2\n",
    "\n",
    "and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\text{A5.6.4} + \\text{A5.6.2} =\n",
    "   (2n) + (1 + 2n + n^2 / 2) - n = 1 + 3n + n^2 / 2.\n",
    "\n",
    "The combined algorithm may yield answers with less accuracy and possibly lose\n",
    "cache locality due to more complex code."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. rubric:: References\n",
    "\n",
    ".. bibliography:: chapter-05.bib"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}