{
 "cells": [
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "****************\n",
    "Models for Shape\n",
    "****************\n",
    "\n",
    "Snakes/Active Contour Models\n",
    "==============================\n",
    "\n",
    "- Starts with a circular configuration and adapts to the shape.\n",
    "- Useful when there is little prior information about the object.\n",
    "\n",
    "Shape Templates\n",
    "===============\n",
    "\n",
    "- Starts with the correct shape of the object.\n",
    "- Want to estimate the parameters of the transformation that maps the shape onto\n",
    "  the current image.\n",
    "- Cannot be fit in closed form because we do not know which edge points in the\n",
    "  image correspond to each landmark point in the model.\n",
    "\n",
    "  - The iterative closest point (ICP) algorithm alternately matches points in\n",
    "    the image to the landmark points and computes the best closed-form\n",
    "    transformation.\n",
    "\n",
    "Statistical Shape Models/Active Shape Models/Point Distribution Models\n",
    "==========================================================================\n",
    "\n",
    "- Describe the variation within a class of objects.\n",
    "- Capable of adapting to novel individual shapes from that class.\n",
    "- Generalized Procrustes analysis alternately optimizes the mean shape and the\n",
    "  parameters of the transformations that best map the observed data points to\n",
    "  this mean.\n",
    "\n",
    "Non-Gaussian Statistical Shape Models\n",
    "=====================================\n",
    "\n",
    "- Gaussian process latent variable model (GPLVM) extends the PPCA model so that\n",
    "  the hidden variables are transformed through a fixed nonlinearity before being\n",
    "  weighted by the basis functions."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.1\n",
    "=============\n",
    "\n",
    ":math:`\\mathbf{C}_2` is a singular matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "def conicMatrix(v, xlim=(-2, 2), ylim=(-2, 2), xlab='x', ylab='y', title='', col='k'):\n",
    "    #plot the conic v[0] x^2 + v[1] xy + v[2] y^2 + v[3] x + v[4] y + v[5] = 0\n",
    "    x = np.linspace(*xlim, 200)\n",
    "    y = np.linspace(*ylim, 200)\n",
    "    x, y = np.meshgrid(x, y)\n",
    "\n",
    "    #evaluate the implicit function on the grid and draw its zero level set\n",
    "    z = (v[0] * x**2 + v[1] * x * y + v[2] * y**2 + v[3] * x + v[4] * y + v[5])\n",
    "    plt.contour(x, y, z, [0], colors=col)\n",
    "    plt.xlim(*xlim)\n",
    "    plt.ylim(*ylim)\n",
    "    plt.xlabel(xlab)\n",
    "    plt.ylabel(ylab)\n",
    "    plt.title(title)\n",
    "    plt.axis('equal')\n",
    "    plt.show()\n",
    "\n",
    "conicMatrix([3, 0, 2, 0, 0, -1], xlim=[-1, 1], ylim=[-1, 1], title='Ellipse', col='red')\n",
    "conicMatrix([0, 0, 0, 1 * 2, 0, -2], title='Vertical Line', col='green')\n",
    "conicMatrix([-1, 0, 0, 0, 1 * 2, 0], xlim=[-4, 4], ylim=[-1, 9], title='Parabola', col='blue')"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.2\n",
    "=============\n",
    "\n",
    "The distance transform computes for each pixel the distance to the nearest\n",
    "background pixel whose value is zero.  To compute the distance to the nearest\n",
    "non-zero element of the original binary image, apply the same algorithm to the\n",
    "negated binary image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy\n",
    "\n",
    "def city_block_distance(grid):\n",
    "    dt = numpy.zeros_like(grid)\n",
    "    max_distance = grid.shape[0] * grid.shape[1]\n",
    "\n",
    "    #forward pass: raster scan starting at the top-left corner\n",
    "    for i in range(grid.shape[0]):\n",
    "        for j in range(grid.shape[1]):\n",
    "            #already at boundary\n",
    "            if 0 == grid[i,j]:\n",
    "                continue\n",
    "\n",
    "            #not at boundary, so initially assume maximum distance to boundary\n",
    "            dt[i,j] = max_distance\n",
    "            #examine west and north neighbors\n",
    "            for k, l in [(i - 1, j), (i, j - 1)]:\n",
    "                if 0 <= k < grid.shape[0] and 0 <= l < grid.shape[1]:\n",
    "                    dt[i,j] = min(1 + dt[k,l], dt[i,j])\n",
    "                else:\n",
    "                    #neighbors outside the grid never lower the distance\n",
    "                    dt[i,j] = min(numpy.inf, dt[i,j])\n",
    "    print('Result from forward pass:\\n{}'.format(dt))\n",
    "\n",
    "    #backward pass: reverse raster scan starting at the bottom-right corner\n",
    "    for i in range(grid.shape[0] - 1, -1, -1):\n",
    "        for j in range(grid.shape[1] - 1, -1, -1):\n",
    "            #already at boundary\n",
    "            if 0 == dt[i,j]:\n",
    "                continue\n",
    "\n",
    "            #examine east and south neighbors\n",
    "            for k, l in [(i + 1, j), (i, j + 1)]:\n",
    "                if 0 <= k < grid.shape[0] and 0 <= l < grid.shape[1]:\n",
    "                    dt[i,j] = min(1 + dt[k,l], dt[i,j])\n",
    "                else:\n",
    "                    #neighbors outside the grid never lower the distance\n",
    "                    dt[i,j] = min(numpy.inf, dt[i,j])\n",
    "    print('Result from backward pass:\\n{}'.format(dt))\n",
    "\n",
    "grid = numpy.asarray([[0, 0, 0, 0, 1, 0, 0],\n",
    "                      [0, 0, 1, 1, 1, 0, 0],\n",
    "                      [0, 1, 1, 1, 1, 1, 0],\n",
    "                      [0, 1, 1, 1, 1, 1, 0],\n",
    "                      [0, 1, 1, 1, 0, 0, 0],\n",
    "                      [0, 0, 1, 0, 0, 0, 0],\n",
    "                      [0, 0, 0, 0, 0, 0, 0]])\n",
    "city_block_distance(1 - grid)"
   ]
  },
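  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "The two passes in the code above can be summarized by the following\n",
    "recurrences.  This is only a sketch of what ``city_block_distance`` computes:\n",
    ":math:`d_{i, j}` is shorthand introduced here for ``dt[i, j]``, :math:`D` stands\n",
    "for ``max_distance``, and neighbors that fall outside the grid are skipped.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\text{initialization:} \\quad d_{i, j}\n",
    "    &\\leftarrow \\begin{cases}\n",
    "         0 & \\text{if the input pixel is zero}\\\\\n",
    "         D & \\text{otherwise}\n",
    "       \\end{cases}\\\\\n",
    "   \\text{forward pass:} \\quad d_{i, j}\n",
    "    &\\leftarrow \\min\\left(\n",
    "         d_{i, j}, 1 + d_{i - 1, j}, 1 + d_{i, j - 1}\n",
    "       \\right)\\\\\n",
    "   \\text{backward pass:} \\quad d_{i, j}\n",
    "    &\\leftarrow \\min\\left(\n",
    "         d_{i, j}, 1 + d_{i + 1, j}, 1 + d_{i, j + 1}\n",
    "       \\right)"
   ]
  },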
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.3\n",
    "=============\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\text{curve}[\\mathbf{w}, n]\n",
    "    &= -\\left(\n",
    "         \\mathbf{w}_{n - 1} - 2 \\mathbf{w}_n + \\mathbf{w}_{n + 1}\n",
    "       \\right)^\\top\n",
    "       \\left(\n",
    "         \\mathbf{w}_{n - 1} - 2 \\mathbf{w}_n + \\mathbf{w}_{n + 1}\n",
    "       \\right)\\\\\n",
    "   \\text{curve}[\\mathbf{w}, 2]\n",
    "    &= -\\left(\n",
    "         \\mathbf{w}_1 - 2 \\mathbf{w}_2 + \\mathbf{w}_3\n",
    "       \\right)^\\top\n",
    "       \\left(\n",
    "         \\mathbf{w}_1 - 2 \\mathbf{w}_2 + \\mathbf{w}_3\n",
    "       \\right)\\\\\n",
    "    &= -\\begin{bmatrix}\n",
    "         100 - 2x + 200 & 100 - 2y + 300\n",
    "       \\end{bmatrix}\n",
    "       \\begin{bmatrix}\n",
    "         100 - 2x + 200\\\\\n",
    "         100 - 2y + 300\n",
    "       \\end{bmatrix}\\\\\n",
    "    &= -\\left[ (300 - 2x)^2 + (400 - 2y)^2 \\right]\n",
    "\n",
    "where :math:`\\mathbf{w}_2 = \\begin{bmatrix} x & y \\end{bmatrix}^\\top`.  By\n",
    "inspection, the bracketed squared curvature is smallest (zero) at\n",
    ":math:`\\mathbf{w}_2 = \\begin{bmatrix} 150 & 200 \\end{bmatrix}^\\top`, the\n",
    "midpoint of :math:`\\mathbf{w}_1` and :math:`\\mathbf{w}_3`."
   ]
  },
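  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "As a quick check of the result above, here is a sketch of the stationarity\n",
    "condition.  The shorthand :math:`\\mathbf{d}` is introduced only for this note\n",
    "and is not notation from the book.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\mathbf{d} &= \\mathbf{w}_1 - 2 \\mathbf{w}_2 + \\mathbf{w}_3\\\\\n",
    "   \\frac{\\partial}{\\partial \\mathbf{w}_2} \\mathbf{d}^\\top \\mathbf{d}\n",
    "    &= -4 \\mathbf{d} = \\boldsymbol{0}\\\\\n",
    "   \\mathbf{w}_2\n",
    "    &= \\frac{\\mathbf{w}_1 + \\mathbf{w}_3}{2}\n",
    "     = \\frac{1}{2} \\begin{bmatrix} 100 + 200\\\\ 100 + 300 \\end{bmatrix}\n",
    "     = \\begin{bmatrix} 150\\\\ 200 \\end{bmatrix}"
   ]
  },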
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.4\n",
    "=============\n",
    "\n",
    "Since the image is a single color, there are no edge pixels so the distance\n",
    "transform will return :math:`\\infty` for each pixel.  This reduces the\n",
    "likelihood in (17.7) to a constant, so only the prior information will affect\n",
    "the snakes model.  The prior coefficients can be set to make the points spread\n",
    "out (or collapse) smoothly (or irregularly)."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.5\n",
    "=============\n",
    "\n",
    "Suppose the center of the shape could be computed from the edge information.\n",
    "(17.5) could use the averaged distance to the center from each landmark,\n",
    "which is essentially automating (17.8)."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.6\n",
    "=============\n",
    "\n",
    "Recall that :math:`\\mathbf{w} \\in \\mathbb{R}^{2N}` is a compound vector\n",
    "containing all of the :math:`x`- and :math:`y`-positions of the landmark\n",
    "points in an image.\n",
    "\n",
    "We could take a maximum likelihood approach and optimize (17.23) with respect\n",
    "to :math:`\\mathbf{h}`.  This is equivalent to a least-squares fit as shown in\n",
    "(4.14)."
   ]
  },
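  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "One way to make the least-squares connection concrete is sketched below.  It\n",
    "assumes the likelihood in (17.23) is\n",
    ":math:`\\text{Norm}_{\\mathbf{w}}[\\boldsymbol{\\mu} + \\boldsymbol{\\Phi} \\mathbf{h}, \\sigma^2 \\mathbf{I}]`,\n",
    "as used in Exercise 17.8, and that\n",
    ":math:`\\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi}` is invertible.  Maximizing the\n",
    "log likelihood over :math:`\\mathbf{h}` then amounts to minimizing the squared\n",
    "residual:\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\\partial}{\\partial \\mathbf{h}}\n",
    "     \\left(\n",
    "       \\mathbf{w} - \\boldsymbol{\\mu} - \\boldsymbol{\\Phi} \\mathbf{h}\n",
    "     \\right)^\\top\n",
    "     \\left(\n",
    "       \\mathbf{w} - \\boldsymbol{\\mu} - \\boldsymbol{\\Phi} \\mathbf{h}\n",
    "     \\right)\n",
    "    &= -2 \\boldsymbol{\\Phi}^\\top \\left(\n",
    "         \\mathbf{w} - \\boldsymbol{\\mu} - \\boldsymbol{\\Phi} \\mathbf{h}\n",
    "       \\right) = \\boldsymbol{0}\\\\\n",
    "   \\hat{\\mathbf{h}}\n",
    "    &= \\left( \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} \\right)^{-1}\n",
    "       \\boldsymbol{\\Phi}^\\top \\left( \\mathbf{w} - \\boldsymbol{\\mu} \\right)"
   ]
  },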
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.7\n",
    "=============\n",
    "\n",
    "Given :math:`\\mathbf{W} = \\mathbf{U} \\mathbf{L} \\mathbf{V}^\\top` where\n",
    ":math:`\\mathbf{U}, \\mathbf{V}` are orthogonal matrices and :math:`\\mathbf{L}`\n",
    "is a diagonal matrix,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\mathbf{W} \\mathbf{W}^\\top =\n",
    "     \\mathbf{U} \\mathbf{L} \\mathbf{V}^\\top\n",
    "         \\left(\\mathbf{U} \\mathbf{L} \\mathbf{V}^\\top\\right)^\\top =\n",
    "     \\mathbf{U} \\mathbf{L} \\mathbf{V}^\\top \\mathbf{V}\n",
    "         \\mathbf{L} \\mathbf{U}^\\top =\n",
    "     \\mathbf{U} \\mathbf{L}^2 \\mathbf{U}^\\top\\\\\\\\\n",
    "   \\mathbf{W}^\\top \\mathbf{W} =\n",
    "     \\left(\\mathbf{U} \\mathbf{L} \\mathbf{V}^\\top\\right)^\\top\n",
    "         \\mathbf{U} \\mathbf{L} \\mathbf{V}^\\top =\n",
    "     \\mathbf{V} \\mathbf{L} \\mathbf{U}^\\top \\mathbf{U}\n",
    "         \\mathbf{L} \\mathbf{V}^\\top =\n",
    "     \\mathbf{V} \\mathbf{L}^2 \\mathbf{V}^\\top."
   ]
  },
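  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Read as eigendecompositions, the two identities above can be sketched as\n",
    "follows, assuming :math:`\\mathbf{L}` is square so that :math:`\\mathbf{L}^2` is\n",
    "well defined.  Right-multiplying by :math:`\\mathbf{U}` and :math:`\\mathbf{V}`\n",
    "respectively and using orthogonality gives\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\mathbf{W} \\mathbf{W}^\\top \\mathbf{U}\n",
    "    &= \\mathbf{U} \\mathbf{L}^2 \\mathbf{U}^\\top \\mathbf{U}\n",
    "     = \\mathbf{U} \\mathbf{L}^2\\\\\n",
    "   \\mathbf{W}^\\top \\mathbf{W} \\mathbf{V}\n",
    "    &= \\mathbf{V} \\mathbf{L}^2 \\mathbf{V}^\\top \\mathbf{V}\n",
    "     = \\mathbf{V} \\mathbf{L}^2\n",
    "\n",
    "so the columns of :math:`\\mathbf{U}` and :math:`\\mathbf{V}` are eigenvectors of\n",
    ":math:`\\mathbf{W} \\mathbf{W}^\\top` and :math:`\\mathbf{W}^\\top \\mathbf{W}`, and\n",
    "the eigenvalues are the squared singular values."
   ]
  },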
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.8\n",
    "=============\n",
    "\n",
    "Let :math:`\\boldsymbol{\\theta} =\n",
    "\\left\\{ \\boldsymbol{\\mu}, \\boldsymbol{\\Phi}, \\sigma^2 \\right\\}`.\n",
    "\n",
    "(a)\n",
    "---\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\DeclareMathOperator{\\NormDist}{Norm}\n",
    "   q_i(\\mathbf{h}_i)\n",
    "    &= Pr(\\mathbf{h}_i \\mid \\mathbf{w}_i, \\boldsymbol{\\theta})\\\\\n",
    "    &= \\frac{\n",
    "         Pr(\\mathbf{w}_i \\mid \\mathbf{h}_i, \\boldsymbol{\\theta})\n",
    "         Pr(\\mathbf{h}_i)\n",
    "       }{\n",
    "         Pr(\\mathbf{w}_i \\mid \\boldsymbol{\\theta})\n",
    "       }\n",
    "       & \\quad & \\text{(7.50)}\\\\\n",
    "    &= \\frac{\n",
    "         \\NormDist_{\\mathbf{w}_i}\\left[\n",
    "           \\boldsymbol{\\mu} +\n",
    "           \\boldsymbol{\\Phi} \\mathbf{h}_i, \\sigma^2 \\mathbf{I}_{2N}\n",
    "         \\right]\n",
    "         \\NormDist_{\\mathbf{h}_i}\\left[ \\boldsymbol{0}, \\mathbf{I}_k \\right]\n",
    "       }{\n",
    "         \\NormDist_{\\mathbf{w}_i}\\left[\n",
    "           \\boldsymbol{\\mu},\n",
    "           \\boldsymbol{\\Phi} \\boldsymbol{\\Phi}^\\top + \\sigma^2 \\mathbf{I}_{2N}\n",
    "         \\right]\n",
    "       }\n",
    "       & \\quad & \\text{(17.23), (17.24), and (17.25)}\\\\\n",
    "    &= \\NormDist_{\\mathbf{h}_i} \\left[\n",
    "         \\hat{\\boldsymbol{\\mu}}, \\hat{\\boldsymbol{\\Sigma}}\n",
    "       \\right]\n",
    "       & \\quad & \\text{(a.1)}\n",
    "\n",
    "(b)\n",
    "---\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\DeclareMathOperator*{\\argmax}{argmax}\n",
    "   \\DeclareMathOperator{\\E}{\\mathrm{E}}\n",
    "   \\DeclareMathOperator{\\tr}{\\mathrm{tr}}\n",
    "   \\boldsymbol{\\theta}^{[t]}\n",
    "    &= \\argmax_\\boldsymbol{\\theta} \\left[\n",
    "         \\sum_{i = 1}^I \\int q_i^{[t]}(\\mathbf{h}_i)\n",
    "           \\log Pr(\\mathbf{w}_i, \\mathbf{h}_i \\mid \\boldsymbol{\\theta})\n",
    "           d\\mathbf{h}_i\n",
    "       \\right]\n",
    "       & \\quad & \\text{(7.51)}\\\\\n",
    "    &= \\argmax_\\boldsymbol{\\theta} \\sum_{i = 1}^I \\E\\left[\n",
    "         \\log Pr(\\mathbf{w}_i \\mid \\mathbf{h}_i, \\boldsymbol{\\theta})\n",
    "       \\right]\n",
    "       & \\quad & \\text{(7.36)}\\\\\n",
    "    &= \\argmax_\\boldsymbol{\\theta} \\sum_{i = 1}^I -\\frac{1}{2} \\E\\left[\n",
    "         2N \\log(2 \\pi) + \\log \\left\\vert \\sigma^2 \\mathbf{I}_{2N} \\right\\vert +\n",
    "         \\left(\n",
    "           \\mathbf{w}_i - \\boldsymbol{\\mu} - \\boldsymbol{\\Phi} \\mathbf{h}_i\n",
    "         \\right)^\\top\n",
    "           \\left( \\sigma^2 \\mathbf{I}_{2N} \\right)^{-1}\n",
    "           \\left(\n",
    "             \\mathbf{w}_i - \\boldsymbol{\\mu} - \\boldsymbol{\\Phi} \\mathbf{h}_i\n",
    "           \\right)\n",
    "       \\right]\n",
    "       & \\quad & \\text{(7.37)}\\\\\n",
    "    &= \\argmax_\\boldsymbol{\\theta} \\sum_{i = 1}^I \\E\\left[\n",
    "         -N \\log 2 \\pi - 2N \\log \\sigma -\n",
    "         \\frac{1}{2} \\sigma^{-2} \\left(\n",
    "           \\mathbf{w}_i - \\boldsymbol{\\mu} - \\boldsymbol{\\Phi} \\mathbf{h}_i\n",
    "         \\right)^\\top\n",
    "           \\left(\n",
    "             \\mathbf{w}_i - \\boldsymbol{\\mu} - \\boldsymbol{\\Phi} \\mathbf{h}_i\n",
    "           \\right)\n",
    "       \\right]\\\\\n",
    "    &= \\argmax_\\boldsymbol{\\theta} \\sum_{i = 1}^I \\E\\left[\n",
    "         -N \\log 2 \\pi - 2N \\log \\sigma -\n",
    "         \\frac{1}{2} \\sigma^{-2} \\left(\n",
    "           \\mathbf{w}_i^\\top \\mathbf{w}_i +\n",
    "           \\boldsymbol{\\mu}^\\top \\boldsymbol{\\mu} +\n",
    "           \\mathbf{h}_i^\\top \\boldsymbol{\\Phi}^\\top\n",
    "             \\boldsymbol{\\Phi} \\mathbf{h}_i +\n",
    "           2 \\boldsymbol{\\mu}^\\top \\boldsymbol{\\Phi} \\mathbf{h}_i -\n",
    "           2 \\mathbf{w}_i^\\top \\boldsymbol{\\mu} -\n",
    "           2 \\mathbf{w}_i^\\top \\boldsymbol{\\Phi} \\mathbf{h}_i\n",
    "         \\right)\n",
    "       \\right]\\\\\n",
    "    &= \\argmax_\\boldsymbol{\\theta} L(\\boldsymbol{\\theta})\n",
    "\n",
    "The following relations are useful for (b.1), (b.2), (b.3):\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\E\\left[ \\mathbf{h}_i \\right] &= \\hat{\\boldsymbol{\\mu}}\\\\\\\\\n",
    "   \\E\\left[ \\mathbf{h}_i \\mathbf{h}_i^\\top \\right]\n",
    "    &= \\hat{\\boldsymbol{\\Sigma}} +\n",
    "       \\E\\left[ \\mathbf{h}_i \\right] \\E\\left[ \\mathbf{h}_i \\right]^\\top\n",
    "     = \\hat{\\boldsymbol{\\Sigma}} +\n",
    "       \\hat{\\boldsymbol{\\mu}} \\hat{\\boldsymbol{\\mu}}^\\top.\n",
    "\n",
    "The SVD solution yields the same mean but different principal components.\n",
    "\n",
    "(a.1)\n",
    "-----\n",
    "\n",
    "Substitute in the corresponding values into\n",
    ":ref:`Exercise 7.9 <prince2012computer-ex-7.9>` to give\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\hat{\\boldsymbol{\\Sigma}}\n",
    "    &= \\left(\n",
    "         \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Sigma}^{-1} \\boldsymbol{\\Phi} +\n",
    "         \\mathbf{I}_k\n",
    "       \\right)^{-1}\\\\\n",
    "    &= \\left(\n",
    "         \\sigma^{-2} \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} + \\mathbf{I}_k\n",
    "       \\right)^{-1}\n",
    "\n",
    "and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\hat{\\boldsymbol{\\mu}}\n",
    "    &= \\hat{\\boldsymbol{\\Sigma}} \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Sigma}^{-1}\n",
    "           (\\mathbf{w}_i - \\boldsymbol{\\mu})\\\\\n",
    "    &= \\sigma^{-2} \\left(\n",
    "         \\sigma^{-2} \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} + \\mathbf{I}_k\n",
    "       \\right)^{-1} \\boldsymbol{\\Phi}^\\top (\\mathbf{w}_i - \\boldsymbol{\\mu})\\\\\n",
    "    &= \\left(\n",
    "         \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} + \\sigma^2 \\mathbf{I}_k\n",
    "       \\right)^{-1} \\boldsymbol{\\Phi}^\\top (\\mathbf{w}_i - \\boldsymbol{\\mu}).\n",
    "\n",
    "(b.1)\n",
    "-----\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\\partial L}{\\partial \\boldsymbol{\\mu}}\n",
    "    &= \\sum_{i = 1}^I \\E\\left[\n",
    "         -\\frac{1}{2} \\sigma^{-2} \\left(\n",
    "           2 \\boldsymbol{\\mu} +\n",
    "           2 \\boldsymbol{\\Phi} \\mathbf{h}_i -\n",
    "           2 \\mathbf{w}_i\n",
    "         \\right)\n",
    "       \\right]\n",
    "       & \\quad & \\text{(C.27), (C.28), (C.33)}\\\\\n",
    "   0 &= \\sum_{i = 1}^I\n",
    "          -\\boldsymbol{\\mu} -\n",
    "          \\boldsymbol{\\Phi} \\E[\\mathbf{h}_i] +\n",
    "          \\mathbf{w}_i\n",
    "        & \\quad & \\E[] \\text{ is a linear operator}\\\\\n",
    "    &= \\sum_{i = 1}^I\n",
    "         -\\boldsymbol{\\mu} -\n",
    "         \\boldsymbol{\\Phi} \\left[\n",
    "           \\left(\n",
    "             \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} + \\sigma^2 \\mathbf{I}_k\n",
    "           \\right)^{-1}\n",
    "           \\boldsymbol{\\Phi}^\\top\n",
    "           \\left( \\mathbf{w}_i - \\boldsymbol{\\mu} \\right)\n",
    "         \\right] +\n",
    "         \\mathbf{w}_i\n",
    "       & \\quad & \\E[\\mathbf{h}_i] = \\hat{\\boldsymbol{\\mu}}\\\\\n",
    "    &= \\sum_{i = 1}^I\n",
    "         \\left(\n",
    "           \\mathbf{I}_{2N} -\n",
    "           \\boldsymbol{\\Phi}\n",
    "             \\left(\n",
    "               \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} + \\sigma^2 \\mathbf{I}_k\n",
    "             \\right)^{-1}\n",
    "             \\boldsymbol{\\Phi}^\\top\n",
    "         \\right) \\mathbf{w}_i -\n",
    "         \\left(\n",
    "           \\mathbf{I}_{2N} -\n",
    "           \\boldsymbol{\\Phi}\n",
    "             \\left(\n",
    "               \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} + \\sigma^2 \\mathbf{I}_k\n",
    "             \\right)^{-1}\n",
    "             \\boldsymbol{\\Phi}^\\top\n",
    "         \\right) \\boldsymbol{\\mu}\\\\\n",
    "    &= \\sum_{i = 1}^I\n",
    "         \\left(\n",
    "           \\mathbf{I}_{2N} +\n",
    "           \\sigma^{-2} \\boldsymbol{\\Phi} \\boldsymbol{\\Phi}^\\top\n",
    "         \\right)^{-1} \\mathbf{w}_i -\n",
    "         \\left(\n",
    "           \\mathbf{I}_{2N} +\n",
    "           \\sigma^{-2} \\boldsymbol{\\Phi} \\boldsymbol{\\Phi}^\\top\n",
    "         \\right)^{-1} \\boldsymbol{\\mu}\n",
    "       & \\quad & \\text{Sherman-Morrison-Woodbury formula}\\\\\n",
    "   \\boldsymbol{\\mu} &= \\frac{1}{I} \\sum_{i = 1}^I \\mathbf{w}_i\n",
    "\n",
    "(b.2)\n",
    "-----\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\\partial L}{\\partial \\boldsymbol{\\Phi}}\n",
    "    &= \\sum_{i = 1}^I \\E\\left[\n",
    "         -\\frac{1}{2} \\sigma^{-2} \\left(\n",
    "           2 \\boldsymbol{\\Phi} \\mathbf{h}_i \\mathbf{h}_i^\\top +\n",
    "           2 \\boldsymbol{\\mu} \\mathbf{h}_i^\\top -\n",
    "           2 \\mathbf{w}_i \\mathbf{h}_i^\\top\n",
    "         \\right)\n",
    "       \\right]\n",
    "       & \\quad & \\text{(C.31), (C.29)}\\\\\n",
    "   0 &= \\sum_{i = 1}^I\n",
    "          -\\boldsymbol{\\Phi} \\E\\left[ \\mathbf{h}_i \\mathbf{h}_i^\\top \\right] -\n",
    "          \\boldsymbol{\\mu} \\E\\left[ \\mathbf{h}_i^\\top \\right] +\n",
    "          \\mathbf{w}_i \\E\\left[ \\mathbf{h}_i^\\top \\right]\n",
    "        & \\quad & \\E[] \\text{ is a linear operator}\\\\\n",
    "   \\boldsymbol{\\Phi}\n",
    "    &= \\left(\n",
    "         \\sum_{i = 1}^I\n",
    "           (\\mathbf{w}_i - \\boldsymbol{\\mu}) \\E\\left[ \\mathbf{h}_i^\\top \\right]\n",
    "       \\right)\n",
    "       \\left(\n",
    "         \\sum_{i = 1}^I \\E\\left[ \\mathbf{h}_i \\mathbf{h}_i^\\top \\right]\n",
    "       \\right)^{-1}\n",
    "\n",
    "(b.3)\n",
    "-----\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\\partial L}{\\partial \\sigma}\n",
    "    &= \\sum_{i = 1}^I \\E\\left[\n",
    "         -2N\\sigma^{-1} +\n",
    "         \\sigma^{-3} \\left(\n",
    "           \\mathbf{w}_i^\\top \\mathbf{w}_i +\n",
    "           \\boldsymbol{\\mu}^\\top \\boldsymbol{\\mu} +\n",
    "           \\mathbf{h}_i^\\top \\boldsymbol{\\Phi}^\\top\n",
    "               \\boldsymbol{\\Phi} \\mathbf{h}_i +\n",
    "           2 \\boldsymbol{\\mu}^\\top \\boldsymbol{\\Phi} \\mathbf{h}_i -\n",
    "           2 \\mathbf{w}_i^\\top \\boldsymbol{\\mu} -\n",
    "           2 \\mathbf{w}_i^\\top \\boldsymbol{\\Phi} \\mathbf{h}_i\n",
    "         \\right)\n",
    "       \\right]\\\\\n",
    "   \\sigma^2\n",
    "    &= \\frac{1}{2NI} \\sum_{i = 1}^I \\left\\{\n",
    "         (\\mathbf{w}_i - \\boldsymbol{\\mu})^\\top\n",
    "             (\\mathbf{w}_i - \\boldsymbol{\\mu}) +\n",
    "         \\E\\left[\n",
    "           \\mathbf{h}_i^\\top \\boldsymbol{\\Phi}^\\top\n",
    "           \\boldsymbol{\\Phi} \\mathbf{h}_i\n",
    "         \\right] -\n",
    "         2 \\E\\left[\n",
    "           \\left( \\mathbf{w}_i - \\boldsymbol{\\mu} \\right)^\\top\n",
    "           \\boldsymbol{\\Phi} \\mathbf{h}_i\n",
    "         \\right]\n",
    "       \\right\\}\n",
    "       & \\quad & \\E[] \\text{ is a linear operator}\\\\\n",
    "    &= \\frac{1}{2NI} \\sum_{i = 1}^I \\left\\{\n",
    "         (\\mathbf{w}_i - \\boldsymbol{\\mu})^\\top\n",
    "             (\\mathbf{w}_i - \\boldsymbol{\\mu}) +\n",
    "         \\tr\\left(\n",
    "           \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} \\hat{\\boldsymbol{\\Sigma}}\n",
    "         \\right) +\n",
    "         \\hat{\\boldsymbol{\\mu}}^\\top \\boldsymbol{\\Phi}^\\top\n",
    "             \\boldsymbol{\\Phi} \\hat{\\boldsymbol{\\mu}} -\n",
    "         2 \\left( \\mathbf{w}_i - \\boldsymbol{\\mu} \\right)^\\top\n",
    "             \\boldsymbol{\\Phi} \\hat{\\boldsymbol{\\mu}}\n",
    "       \\right\\}\n",
    "       & \\quad & \\text{Matrix Cookbook (328), (330)}"
   ]
  },
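  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "The Sherman-Morrison-Woodbury step in (b.1) can be verified by direct\n",
    "multiplication.  The following is only a sketch, with the shorthand\n",
    ":math:`\\mathbf{M} = \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} + \\sigma^2 \\mathbf{I}_k`\n",
    "introduced here for brevity.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\left(\n",
    "     \\mathbf{I}_{2N} + \\sigma^{-2} \\boldsymbol{\\Phi} \\boldsymbol{\\Phi}^\\top\n",
    "   \\right)\n",
    "   \\left(\n",
    "     \\mathbf{I}_{2N} -\n",
    "     \\boldsymbol{\\Phi} \\mathbf{M}^{-1} \\boldsymbol{\\Phi}^\\top\n",
    "   \\right)\n",
    "    &= \\mathbf{I}_{2N} -\n",
    "       \\boldsymbol{\\Phi} \\mathbf{M}^{-1} \\boldsymbol{\\Phi}^\\top +\n",
    "       \\sigma^{-2} \\boldsymbol{\\Phi} \\boldsymbol{\\Phi}^\\top -\n",
    "       \\sigma^{-2} \\boldsymbol{\\Phi}\n",
    "           \\left( \\boldsymbol{\\Phi}^\\top \\boldsymbol{\\Phi} \\right)\n",
    "           \\mathbf{M}^{-1} \\boldsymbol{\\Phi}^\\top\\\\\n",
    "    &= \\mathbf{I}_{2N} -\n",
    "       \\boldsymbol{\\Phi} \\mathbf{M}^{-1} \\boldsymbol{\\Phi}^\\top +\n",
    "       \\sigma^{-2} \\boldsymbol{\\Phi} \\boldsymbol{\\Phi}^\\top -\n",
    "       \\sigma^{-2} \\boldsymbol{\\Phi}\n",
    "           \\left( \\mathbf{M} - \\sigma^2 \\mathbf{I}_k \\right)\n",
    "           \\mathbf{M}^{-1} \\boldsymbol{\\Phi}^\\top\\\\\n",
    "    &= \\mathbf{I}_{2N} -\n",
    "       \\boldsymbol{\\Phi} \\mathbf{M}^{-1} \\boldsymbol{\\Phi}^\\top +\n",
    "       \\sigma^{-2} \\boldsymbol{\\Phi} \\boldsymbol{\\Phi}^\\top -\n",
    "       \\sigma^{-2} \\boldsymbol{\\Phi} \\boldsymbol{\\Phi}^\\top +\n",
    "       \\boldsymbol{\\Phi} \\mathbf{M}^{-1} \\boldsymbol{\\Phi}^\\top\\\\\n",
    "    &= \\mathbf{I}_{2N}"
   ]
  },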
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.9\n",
    "=============\n",
    "\n",
    "Up to an additive constant and a positive scale factor, the log posterior of\n",
    ":math:`\\mathbf{h}` under the standard normal prior on :math:`\\mathbf{h}` is\n",
    "\n",
    ".. math::\n",
    "\n",
    "   L &= -\\sigma^{-2} \\sum_{n = 1}^N\n",
    "          \\left(\n",
    "            \\mathbf{y}_n - \\mathbf{A} \\mathbf{w}_n - \\mathbf{b}\n",
    "          \\right)^\\top\n",
    "          \\left( \\mathbf{y}_n - \\mathbf{A} \\mathbf{w}_n - \\mathbf{b} \\right) -\n",
    "          \\mathbf{h}^\\top \\mathbf{h}\\\\\n",
    "    &= -\\sigma^{-2} \\sum_{n = 1}^N \\left(\n",
    "         \\mathbf{y}_n^\\top \\mathbf{y}_n -\n",
    "         \\mathbf{y}_n^\\top \\mathbf{A} \\mathbf{w}_n -\n",
    "         \\mathbf{y}_n^\\top \\mathbf{b} -\n",
    "         \\mathbf{w}_n^\\top \\mathbf{A}^\\top \\mathbf{y}_n +\n",
    "         \\mathbf{w}_n^\\top \\mathbf{A}^\\top \\mathbf{A} \\mathbf{w}_n +\n",
    "         \\mathbf{w}_n^\\top \\mathbf{A}^\\top \\mathbf{b} -\n",
    "         \\mathbf{b}^\\top \\mathbf{y}_n +\n",
    "         \\mathbf{b}^\\top \\mathbf{A} \\mathbf{w}_n +\n",
    "         \\mathbf{b}^\\top \\mathbf{b}\n",
    "       \\right) -\n",
    "       \\mathbf{h}^\\top \\mathbf{h}\\\\\n",
    "    &= -\\sigma^{-2} \\sum_{n = 1}^N \\left(\n",
    "         \\mathbf{y}_n^\\top \\mathbf{y}_n -\n",
    "         2 \\mathbf{y}_n^\\top \\mathbf{A} \\mathbf{w}_n +\n",
    "         \\mathbf{w}_n^\\top \\mathbf{A}^\\top \\mathbf{A} \\mathbf{w}_n -\n",
    "         2 \\mathbf{b}^\\top \\mathbf{y}_n +\n",
    "         2 \\mathbf{b}^\\top \\mathbf{A} \\mathbf{w}_n +\n",
    "         \\mathbf{b}^\\top \\mathbf{b}\n",
    "       \\right) -\n",
    "       \\mathbf{h}^\\top \\mathbf{h}\\\\\n",
    "    &= -\\sigma^{-2} \\sum_{n = 1}^N \\left[\n",
    "         \\left( \\mathbf{y}_n - \\mathbf{b} \\right)^\\top\n",
    "           \\left( \\mathbf{y}_n - \\mathbf{b} \\right) +\n",
    "         \\mathbf{w}_n^\\top \\mathbf{A}^\\top \\mathbf{A} \\mathbf{w}_n -\n",
    "         2 \\left( \\mathbf{y}_n - \\mathbf{b} \\right)^\\top \\mathbf{A} \\mathbf{w}_n\n",
    "       \\right] -\n",
    "       \\mathbf{h}^\\top \\mathbf{h}\n",
    "\n",
    "where :math:`\\mathbf{w}_n = \\boldsymbol{\\mu}_n + \\boldsymbol{\\Phi}_n \\mathbf{h}`\n",
    "and\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\\partial L}{\\partial \\mathbf{h}}\n",
    "    &= -2 \\sigma^{-2} \\sum_{n = 1}^N\n",
    "         \\left(\n",
    "           \\boldsymbol{\\mu}_n^\\top \\mathbf{A}^\\top \\mathbf{A}\n",
    "               \\boldsymbol{\\Phi}_n +\n",
    "           \\mathbf{h}^\\top \\boldsymbol{\\Phi}_n^\\top \\mathbf{A}^\\top \\mathbf{A}\n",
    "               \\boldsymbol{\\Phi}_n\n",
    "         \\right) +\n",
    "         2 \\sigma^{-2} \\sum_{n = 1}^N\n",
    "           \\left( \\mathbf{y}_n - \\mathbf{b} \\right)^\\top \\mathbf{A}\n",
    "             \\boldsymbol{\\Phi}_n -\n",
    "         2 \\mathbf{h}^\\top\n",
    "       & \\quad & \\text{(C.28), (C.33)}\\\\\n",
    "   \\sigma^2 \\mathbf{h}^\\top +\n",
    "     \\sum_{n = 1}^N\n",
    "       \\mathbf{h}^\\top \\boldsymbol{\\Phi}_n^\\top \\mathbf{A}^\\top\n",
    "           \\mathbf{A} \\boldsymbol{\\Phi}_n\n",
    "    &= \\sum_{n = 1}^N\n",
    "         \\left(\n",
    "           \\mathbf{y}_n - \\mathbf{A} \\boldsymbol{\\mu}_n - \\mathbf{b}\n",
    "         \\right)^\\top \\mathbf{A} \\boldsymbol{\\Phi}_n\n",
    "       & \\quad & \\frac{\\partial L}{\\partial \\mathbf{h}} = \\boldsymbol{0}\\\\\n",
    "   \\mathbf{h}\n",
    "    &= \\left(\n",
    "         \\sigma^2 \\mathbf{I} +\n",
    "         \\sum_{n = 1}^N\n",
    "           \\boldsymbol{\\Phi}_n^\\top \\mathbf{A}^\\top\n",
    "               \\mathbf{A} \\boldsymbol{\\Phi}_n\n",
    "       \\right)^{-1}\n",
    "       \\sum_{n = 1}^N\n",
    "         \\boldsymbol{\\Phi}_n^\\top \\mathbf{A}^\\top\n",
    "         \\left(\n",
    "           \\mathbf{y}_n - \\mathbf{A} \\boldsymbol{\\mu}_n - \\mathbf{b}\n",
    "         \\right)."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.10\n",
    "==============\n",
    "\n",
    "The activation function (9.66) can use :math:`\\mathbf{h}` to learn the\n",
    "coefficients of the gender classification model (9.65).\n",
    "\n",
    "The ICP approach to (17.31) already handles the lack of landmark points."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.11\n",
    "==============\n",
    "\n",
    "Each point in the point distribution could have mass-spring constraints, which\n",
    "can be used to iteratively redistribute the points towards the hidden portion of\n",
    "the shape."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.12\n",
    "==============\n",
    "\n",
    ":cite:`ghahramani1996algorithm` provides more details on\n",
    ":doc:`Modeling Complex Data Densities </nb/computer-vision-models-learning-and-inference-prince/chapter-07>`.\n",
    "\n",
    ":cite:`tipping1999mixtures` elaborates on the book's coverage of probabilistic\n",
    "principal component analysis by showing how to train a mixture of them.\n",
    "\n",
    ":cite:`peel2000robust` extends the book's analysis of t-distributions to\n",
    "mixture models.\n",
    "\n",
    ":cite:`de2003robust,khan2004robust` are alternative nonlinear shape models that\n",
    "should be read after the foregoing materials."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.13\n",
    "==============\n",
    "\n",
    "Rewrite :math:`\\mathbf{x}_\\cdot \\in \\mathbb{R}^2` as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\mathbf{x}_\\cdot\n",
    "    &= \\alpha \\mathbf{a}_\\cdot + \\beta \\mathbf{b}_\\cdot +\n",
    "       \\gamma \\mathbf{c}_\\cdot\\\\\n",
    "    &= \\alpha \\mathbf{a}_\\cdot + \\beta \\mathbf{b}_\\cdot +\n",
    "       (1 - \\alpha - \\beta) \\mathbf{c}_\\cdot\n",
    "       & \\quad & \\alpha + \\beta + \\gamma = 1\\\\\n",
    "    &= \\alpha (\\mathbf{a}_\\cdot - \\mathbf{c}_\\cdot) +\n",
    "       \\beta (\\mathbf{b}_\\cdot - \\mathbf{c}_\\cdot) + \\mathbf{c}_\\cdot\\\\\n",
    "   \\mathbf{x}_\\cdot - \\mathbf{c}_\\cdot\n",
    "    &= \\begin{bmatrix}\n",
    "         \\mathbf{a}_\\cdot - \\mathbf{c}_\\cdot &\n",
    "             \\mathbf{b}_\\cdot - \\mathbf{c}_\\cdot\n",
    "       \\end{bmatrix}\n",
    "       \\begin{bmatrix} \\alpha\\\\ \\beta \\end{bmatrix}\n",
    "\n",
    "When :math:`\\mathbf{x}_\\cdot \\in \\mathbb{R}^3` then\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\begin{bmatrix} \\mathbf{x}_\\cdot\\\\ 1 \\end{bmatrix}\n",
    "    &= \\begin{bmatrix}\n",
    "         \\mathbf{a}_\\cdot & \\mathbf{b}_\\cdot & \\mathbf{c}_\\cdot\\\\\n",
    "         1 & 1 & 1\n",
    "       \\end{bmatrix}\n",
    "       \\begin{bmatrix} \\alpha\\\\ \\beta\\\\ \\gamma \\end{bmatrix}.\n",
    "\n",
    "Assuming that the original and transformed positions have the same barycentric\n",
    "coordinates, warping the whole image is equivalent to\n",
    "\n",
    "#. computing each position's barycentric coordinates,\n",
    "#. warping each triangle,\n",
    "#. solving for the position relative to the warped triangle."
   ]
  },
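  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Assuming the triangle is non-degenerate, so that the\n",
    ":math:`2 \\times 2` matrix in the planar case is invertible, the barycentric\n",
    "coordinates can be written explicitly.  The numbers in the example are made up\n",
    "for illustration and do not come from the book.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\begin{bmatrix} \\alpha\\\\ \\beta \\end{bmatrix}\n",
    "    &= \\begin{bmatrix}\n",
    "         \\mathbf{a}_\\cdot - \\mathbf{c}_\\cdot &\n",
    "             \\mathbf{b}_\\cdot - \\mathbf{c}_\\cdot\n",
    "       \\end{bmatrix}^{-1}\n",
    "       \\left( \\mathbf{x}_\\cdot - \\mathbf{c}_\\cdot \\right),\n",
    "       \\qquad \\gamma = 1 - \\alpha - \\beta\n",
    "\n",
    "For example, with\n",
    ":math:`\\mathbf{a}_\\cdot = \\begin{bmatrix} 0 & 0 \\end{bmatrix}^\\top`,\n",
    ":math:`\\mathbf{b}_\\cdot = \\begin{bmatrix} 1 & 0 \\end{bmatrix}^\\top`,\n",
    ":math:`\\mathbf{c}_\\cdot = \\begin{bmatrix} 0 & 1 \\end{bmatrix}^\\top`, and\n",
    ":math:`\\mathbf{x}_\\cdot = \\begin{bmatrix} 0.25 & 0.25 \\end{bmatrix}^\\top`,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\begin{bmatrix} 0 & 1\\\\ -1 & -1 \\end{bmatrix}\n",
    "   \\begin{bmatrix} \\alpha\\\\ \\beta \\end{bmatrix}\n",
    "    = \\begin{bmatrix} 0.25\\\\ -0.75 \\end{bmatrix}\n",
    "   \\quad \\Rightarrow \\quad\n",
    "   \\alpha = 0.5, \\; \\beta = 0.25, \\; \\gamma = 0.25."
   ]
  },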
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 17.14\n",
    "==============\n",
    "\n",
    "(i)\n",
    "---\n",
    "\n",
    ".. math::\n",
    "\n",
    "   0 = \\tilde{\\mathbf{w}}^\\top\n",
    "       \\begin{bmatrix}\n",
    "         \\mathbf{A} & \\mathbf{b}\\\\ \\mathbf{b}^\\top & c\n",
    "       \\end{bmatrix}\n",
    "       \\tilde{\\mathbf{w}}\n",
    "    &= \\begin{bmatrix} \\tilde{\\mathbf{x}}^\\top & s \\end{bmatrix}\n",
    "       \\begin{bmatrix}\n",
    "         \\mathbf{A} & \\mathbf{b}\\\\ \\mathbf{b}^\\top & c\n",
    "       \\end{bmatrix}\n",
    "       \\begin{bmatrix} \\tilde{\\mathbf{x}}\\\\ s \\end{bmatrix}\\\\\n",
    "    &= \\begin{bmatrix} \\tilde{\\mathbf{x}}^\\top & s \\end{bmatrix}\n",
    "       \\begin{bmatrix}\n",
    "         \\mathbf{A} \\tilde{\\mathbf{x}} + \\mathbf{b} s\\\\\n",
    "         \\mathbf{b}^\\top \\tilde{\\mathbf{x}} + cs\n",
    "       \\end{bmatrix}\\\\\n",
    "    &= \\tilde{\\mathbf{x}}^\\top \\mathbf{A} \\tilde{\\mathbf{x}} +\n",
    "       2 \\tilde{\\mathbf{x}}^\\top \\mathbf{b} s + cs^2\\\\\n",
    "    &= \\gamma + \\beta s + \\alpha s^2\n",
    "\n",
    "(ii)\n",
    "----\n",
    "\n",
    "There exists a unique real solution for :math:`s` when the discriminant is zero:\n",
    "\n",
    ".. math::\n",
    "\n",
    "   0 = \\beta^2 - 4 \\alpha \\gamma\n",
    "    &= \\left( 2 \\tilde{\\mathbf{x}}^\\top \\mathbf{b} \\right)^2 -\n",
    "       4c \\left( \\tilde{\\mathbf{x}}^\\top \\mathbf{A} \\tilde{\\mathbf{x}} \\right)\\\\\n",
    "    &= 4 \\left(\n",
    "         \\tilde{\\mathbf{x}}^\\top \\mathbf{b} \\mathbf{b}^\\top \\tilde{\\mathbf{x}} -\n",
    "         c \\tilde{\\mathbf{x}}^\\top \\mathbf{A} \\tilde{\\mathbf{x}}\n",
    "       \\right)\\\\\n",
    "    &= 4 \\tilde{\\mathbf{x}}^\\top\n",
    "       \\left( \\mathbf{b} \\mathbf{b}^\\top - c \\mathbf{A} \\right)\n",
    "       \\tilde{\\mathbf{x}}\\\\\n",
    "    &= 4 \\tilde{\\mathbf{x}}^\\top \\mathbf{C} \\tilde{\\mathbf{x}}\n",
    "\n",
    "If the camera has intrinsic matrix :math:`\\boldsymbol{\\Lambda}`, then\n",
    ":math:`\\mathbf{w} = \\boldsymbol{\\Lambda}^{-1} \\mathbf{x} = \\tilde{\\mathbf{x}}`\n",
    "and\n",
    ":math:`\\mathbf{C} =\n",
    "\\boldsymbol{\\Lambda}^{-\\top}\n",
    "\\left( \\mathbf{b} \\mathbf{b}^\\top - c \\mathbf{A} \\right)\n",
    "\\boldsymbol{\\Lambda}^{-1}`."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}