{
 "cells": [
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "********************************\n",
    "Learning and Inference in Vision\n",
    "********************************\n",
    "\n",
    "Regression\n",
    "==========\n",
    "\n",
    "Suppose the likelihood of the data is independent of the model.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(\\boldsymbol{\\theta} \\mid w, x)\n",
    "    &= \\frac{Pr(\\boldsymbol{\\theta}, w, x)}{Pr(w, x)}\\\\\n",
    "    &= \\frac{\n",
    "         Pr(w \\mid x, \\boldsymbol{\\theta})\n",
    "         Pr(x \\mid \\boldsymbol{\\theta}) Pr(\\boldsymbol{\\theta})\n",
    "       }{\n",
    "         Pr(w \\mid x) Pr(x)\n",
    "       }\\\\\n",
    "    &= \\frac{\n",
    "         Pr(w \\mid x, \\boldsymbol{\\theta}) Pr(\\boldsymbol{\\theta})\n",
    "       }{\n",
    "         Pr(w \\mid x)\n",
    "       }\n",
    "       & \\quad & Pr(x \\mid \\boldsymbol{\\theta}) = Pr(x).\n",
    "\n",
    "Applications\n",
    "============\n",
    "\n",
    "Suppose the world state is independent of the model :cite:`engelhardtbmlemapbr`.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(w \\mid x, \\boldsymbol{\\theta})\n",
    "    &= \\frac{Pr(w, x, \\boldsymbol{\\theta})}{Pr(x, \\boldsymbol{\\theta})}\\\\\n",
    "    &= \\frac{\n",
    "         Pr(x \\mid w, \\boldsymbol{\\theta})\n",
    "         Pr(w \\mid \\boldsymbol{\\theta}) Pr(\\boldsymbol{\\theta})\n",
    "       }{\n",
    "         Pr(x \\mid \\boldsymbol{\\theta}) Pr(\\boldsymbol{\\theta})\n",
    "       }\\\\\n",
    "    &= \\frac{\n",
    "         Pr(x \\mid w, \\boldsymbol{\\theta}) Pr(w)\n",
    "       }{\n",
    "         Pr(x \\mid \\boldsymbol{\\theta})\n",
    "       }\n",
    "       & \\quad & Pr(w \\mid \\boldsymbol{\\theta}) = Pr(w)."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 6.1\n",
    "============\n",
    "\n",
    "(i), (iii), and (iv) are classification problems while (ii) and (v) are\n",
    "regression problems.\n",
    "\n",
    "(i)\n",
    "---\n",
    "\n",
    ":math:`\\mathbf{w}` represents a discrete state describing whether a face is male\n",
    "or female.\n",
    "\n",
    ":math:`\\mathbf{x}` is an image of a face that have been discretized into pixels\n",
    "spanning some color space.\n",
    "\n",
    "(ii)\n",
    "----\n",
    "\n",
    ":math:`\\mathbf{w}` represents a continuous state describing the 3D pose of a\n",
    "human body, which covers all physically possible orientations and positions.\n",
    "\n",
    ":math:`\\mathbf{x}` is an image of a body that have been discretized into pixels\n",
    "spanning some color space.\n",
    "\n",
    "(iii)\n",
    "-----\n",
    "\n",
    ":math:`\\mathbf{w}` represents a discrete state spanning the four suits\n",
    "(hearts, diamond, clubs, spades) of a playing card.\n",
    "\n",
    ":math:`\\mathbf{x}` is an image of a playing card that have been discretized into\n",
    "pixels spanning some color space.\n",
    "\n",
    "(iv)\n",
    "----\n",
    "\n",
    ":math:`\\mathbf{w}` represents a discrete binary state describing whether a\n",
    "face image matches another face image.\n",
    "\n",
    ":math:`\\mathbf{x}` consists of a pair of face images where each image\n",
    "has been discretized into pixels spanning some color space.\n",
    "\n",
    "(v)\n",
    "---\n",
    "\n",
    ":math:`\\mathbf{w}` represents a continuous state describing the 3D position of a\n",
    "point.\n",
    "\n",
    ":math:`\\mathbf{x}` consists of the images produced by a set of cameras and their\n",
    "correspondences; all of which have been discretized into pixels spanning\n",
    "arbitrary color spaces."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. _prince2012computer-ex-6.2:\n",
    "\n",
    "Exercise 6.2\n",
    "============\n",
    "\n",
    "Discriminative\n",
    "--------------\n",
    "\n",
    "According to :cite:`brunnerj312f12mlm,brunnerj312f12lr`, this is known as\n",
    "multinomial logistic regression.\n",
    "\n",
    "Use a categorical distribution to model the univariate discrete multi-valued\n",
    "world state :math:`\\mathbf{w}` as :math:`Pr(\\mathbf{w})`.\n",
    "\n",
    "Let :math:`L_m(x) = \\phi_{m, 0} + \\phi_{m, 1} x` denote the linear function of\n",
    "the data :math:`x` for :math:`m = 1, 2, \\ldots, M`.\n",
    "\n",
    "Define the probability of observing one of the :math:`M` possible outcomes as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\lambda_M(x) =\n",
    "       \\left( 1 + \\sum_{i = 1}^{M - 1} \\exp L_i(x) \\right)^{-1}\n",
    "   \\quad \\land \\quad\n",
    "   \\lambda_m(x) = \\lambda_M \\exp L_m(x)\n",
    "\n",
    "where :math:`\\sum_m \\lambda_m(x) = 1` for :math:`x = 1, 2, \\ldots, K`.\n",
    "\n",
    "Applying the same notations as (3.8) gives\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\DeclareMathOperator{\\CatDist}{Cat}\n",
    "   Pr(\\mathbf{w} \\mid x, \\boldsymbol{\\theta}) =\n",
    "   \\CatDist_{\\mathbf{w}}\\left[ \\boldsymbol{\\lambda}(x) \\right]\n",
    "\n",
    "where :math:`\\boldsymbol{\\theta} = \\{ \\phi_{1 \\ldots M \\times 0 \\ldots 1} \\}`,\n",
    ":math:`\\mathbf{w} = \\mathbf{e}_m`, and\n",
    ":math:`\\boldsymbol{\\lambda} = \\left( \\lambda_1, \\ldots, \\lambda_M \\right)^\\top`.\n",
    "\n",
    "Generative\n",
    "----------\n",
    "\n",
    "Since the world state is a discrete multi-valued univariate, define a prior\n",
    "distribution over the world state as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(\\mathbf{w}) = \\CatDist_{\\mathbf{w}}\\left[ \\boldsymbol{\\lambda}' \\right]\n",
    "\n",
    "where :math:`\\mathbf{w} = \\mathbf{e}_m` and\n",
    ":math:`\\boldsymbol{\\lambda}' = \\left( \\lambda'_1, \\ldots, \\lambda'_M \\right)^\\top`.\n",
    "\n",
    "Use a categorical distribution to model the discrete multi-valued univariate\n",
    "data :math:`\\mathbf{x}` as :math:`Pr(\\mathbf{x})`.\n",
    "\n",
    "Let :math:`L_k(w) = \\phi_{k, 0} + \\phi_{k, 1} w` denote the linear function of\n",
    "the world state :math:`w` for :math:`k = 1, 2, \\ldots, K`.\n",
    "\n",
    "Define the probability of observing one of the :math:`K` possible outcomes as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\lambda_K(w) =\n",
    "       \\left( 1 + \\sum_{i = 1}^{K - 1} \\exp L_i(w) \\right)^{-1}\n",
    "   \\quad \\land \\quad\n",
    "   \\lambda_k = \\lambda_K \\exp L_k(w)\n",
    "\n",
    "where :math:`\\sum \\lambda_k(w) = 1` for all :math:`w = 1, 2, \\ldots, M`.\n",
    "\n",
    "Applying the same notations as (3.8) yields\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(\\mathbf{x} \\mid w, \\boldsymbol{\\theta}) =\n",
    "   \\CatDist_{\\mathbf{x}}\\left[ \\boldsymbol{\\lambda}(w) \\right]\n",
    "\n",
    "where :math:`\\boldsymbol{\\theta} =\n",
    "\\left\\{ \\boldsymbol{\\lambda}', \\phi_{1 \\ldots K \\times 0 \\ldots 1} \\right\\}`,\n",
    ":math:`\\mathbf{x} = \\mathbf{e}_k`, and\n",
    ":math:`\\boldsymbol{\\lambda} = \\left( \\lambda_1, \\ldots, \\lambda_K \\right)^\\top`."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 6.3\n",
    "============\n",
    "\n",
    "Since the world state is univariate and continuous, define a prior\n",
    "distribution over the world state as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\DeclareMathOperator{\\NormDist}{Norm}\n",
    "   Pr(w) = \\NormDist_w\\left[ \\mu_p, \\sigma_p^2 \\right].\n",
    "\n",
    "Use a Bernoulli distribution to model the univariate binary discrete data\n",
    ":math:`x` as :math:`Pr(x)`.\n",
    "\n",
    "Let :math:`\\lambda(w) = \\phi_0 + \\phi_1 w` denote a linear function of the world\n",
    "state :math:`w`.  The generative regression model is then\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\DeclareMathOperator{\\BernDist}{Bern}\n",
    "   \\DeclareMathOperator{\\sigmoid}{sig}\n",
    "   Pr(x \\mid w, \\boldsymbol{\\theta}) =\n",
    "   \\BernDist_x\\left[ \\sigmoid\\left( \\lambda(w) \\right) \\right] =\n",
    "   \\BernDist_x\\left[ \\frac{1}{1 + \\exp\\left[ -\\phi_0 - \\phi_1 w \\right]} \\right]\n",
    "\n",
    "where :math:`\\boldsymbol{\\theta} = \\{ \\mu_p, \\sigma_p^2, \\phi_0, \\phi_1 \\}`."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 6.4\n",
    "============\n",
    "\n",
    "Use a beta distribution to model the univariate continuous world state\n",
    ":math:`w \\in \\{ 0, 1 \\}` as :math:`Pr(w)`.\n",
    "\n",
    "Since the data :math:`x` comes from a univariate continuous distribution, we can\n",
    "arbitrarily model that as\n",
    ":math:`Pr(x) = \\NormDist_x\\left[ \\mu, \\sigma^2 \\right]` and represent the\n",
    "parameters of the beta distribution in those terms (see\n",
    ":ref:`Exercise 3.3 <prince2012computer-ex-3.3>`):\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\alpha = \\mu \\left( \\frac{\\mu (1 - \\mu)}{\\sigma^2} - 1 \\right)\n",
    "   \\quad \\text{and} \\quad\n",
    "   \\beta = (1 - \\mu) \\left( \\frac{\\mu (1 - \\mu)}{\\sigma^2} - 1 \\right).\n",
    "\n",
    "The discriminative regression model is then\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\DeclareMathOperator{\\BetaDist}{Beta}\n",
    "   Pr(w \\mid x, \\boldsymbol{\\theta}) = \\BetaDist_w[\\alpha, \\beta]\n",
    "\n",
    "where :math:`\\boldsymbol{\\theta} = \\left\\{ \\mu, \\sigma^2 \\right\\}`."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. _prince2012computer-ex-6.5:\n",
    "\n",
    "Exercise 6.5\n",
    "============\n",
    "\n",
    ".. math::\n",
    "\n",
    "   L\n",
    "    &= \\sum_{i = 1}^I\n",
    "         \\log \\NormDist_{w_i}\\left[ \\phi_0 + \\phi_1 x_i, \\sigma^2 \\right]\\\\\n",
    "    &= \\sum_{i = 1}^I \\log\n",
    "         \\frac{1}{\\sqrt{2 \\pi \\sigma^2}}\n",
    "         \\exp\\left[\n",
    "           \\frac{(w_i - \\phi_0 - \\phi_1 x_i)^2}{\\sigma^2}\n",
    "         \\right]^{-0.5}\\\\\n",
    "    &= -\\frac{I}{2} \\log 2 \\pi - \\frac{I}{2} \\log \\sigma^2 -\n",
    "       \\frac{1}{2 \\sigma^2} \\sum_{i = 1}^I (w_i - \\phi_0 - \\phi_1 x_i)^2\n",
    "\n",
    "(a)\n",
    "---\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\\partial L}{\\partial \\phi_0}\n",
    "    &= -\\frac{1}{2 \\sigma^2}\n",
    "       \\sum_{i = 1}^I 2 (w_i - \\phi_0 - \\phi_1 x_i) (-1)\\\\\n",
    "   0 &= \\frac{1}{2 \\sigma^2} \\sum_{i = 1}^I w_i - \\phi_0 - \\phi_1 x_i\\\\\n",
    "   \\phi_0 &= \\frac{1}{I} \\sum_{i = 1}^I w_i - \\phi_1 x_i\n",
    "\n",
    "(b)\n",
    "---\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\\partial L}{\\partial \\phi_1}\n",
    "    &= -\\frac{1}{2 \\sigma^2}\n",
    "       \\sum_{i = 1}^I 2 (w_i - \\phi_0 - \\phi_1 x_i) (-x_i)\\\\\n",
    "   0 &= \\frac{1}{2 \\sigma^2}\n",
    "        \\sum_{i = 1}^I w_i x_i - \\phi_0 x_i - \\phi_1 x_i^2\\\\\n",
    "   \\phi_1 &= \\frac{\\sum_{i = 1}^I x_i (w_i - \\phi_0)}{\\sum_{i = 1}^I x_i^2}\n",
    "\n",
    "(c)\n",
    "---\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\frac{\\partial L}{\\partial \\sigma}\n",
    "    &= -\\frac{I}{2 \\sigma^2} 2 \\sigma -\n",
    "       \\frac{1}{2 \\sigma^3} (-2) \\sum_{i = 1}^I (w_i - \\phi_0 - \\phi_1 x_i)^2\\\\\n",
    "   \\frac{I}{\\sigma}\n",
    "    &= \\frac{1}{\\sigma^3} \\sum_{i = 1}^I (w_i - \\phi_0 - \\phi_1 x_i)^2\\\\\n",
    "   \\sigma^2 &= \\frac{1}{I} \\sum_{i = 1}^I (w_i - \\phi_0 - \\phi_1 x_i)^2"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 6.6\n",
    "============\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(w_i \\mid x_i)\n",
    "    &= \\frac{Pr(w_i, x_i)}{Pr(x_i)}\\\\\n",
    "    &= \\NormDist_{w_i}\\left[\n",
    "         \\mu_w + \\sigma_{xw}^2 \\sigma_{xx}^{-1} (x_i - \\mu_x),\n",
    "         \\sigma_{ww}^2 - \\sigma_{xw}^2 \\sigma_{xx}^{-1} \\sigma_{xw}^2\n",
    "       \\right]\n",
    "       & \\quad & \\text{(5.13) and Exercise 5.5}\\\\\n",
    "    &= \\NormDist_{w_i}\\left[ \\phi_0 + \\phi_1 x_i, \\sigma^2 \\right]\n",
    "       & \\quad & \\text{Exercise 6.5, (a), (b), (c)}\n",
    "\n",
    "where\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\phi_0 &= \\mu_w - \\sigma_{xw}^2 \\sigma_{xx}^{-1} \\mu_x\\\\\\\\\n",
    "   \\phi_1 &= \\sigma_{xw}^2 \\sigma_{xx}^{-1}\\\\\\\\\n",
    "   \\sigma^2 &= \\sigma_{ww}^2 - \\sigma_{xw}^2 \\sigma_{xx}^{-1} \\sigma_{xw}^2.\n",
    "\n",
    "See :ref:`Exercise 5.5 <prince2012computer-ex-5.5>` and\n",
    ":ref:`Exercise 6.5 <prince2012computer-ex-6.5>` for more details.\n",
    "\n",
    "(a)\n",
    "---\n",
    "\n",
    "In order to simplify notations, rewrite the MLE of :math:`\\phi_0` as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\phi_0 = \\frac{1}{I} \\sum_{i = 1}^I w_i - \\phi_1 x_i = \\mu_w - \\phi_1 \\mu_x\n",
    "\n",
    "where :math:`\\mu_w = I^{-1} \\sum_{i = 1}^I w_i` and\n",
    ":math:`\\mu_x = I^{-1} \\sum_{i = 1}^I x_i`.\n",
    "\n",
    "(b)\n",
    "---\n",
    "\n",
    "In order to simplify notations, rewrite the MLE of :math:`\\phi_1` as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\phi_1 &= \\frac{\\sum_{i = 1}^I x_i (w_i - \\phi_0)}{\\sum_{i = 1}^I x_i^2}\\\\\n",
    "   \\phi_1 \\sum_{i = 1}^I x_i^2\n",
    "    &= \\sum_{i = 1}^I x_i w_i - x_i (\\mu_w - \\phi_1 \\mu_x)\n",
    "       & \\quad & \\text{(a)}\\\\\n",
    "   \\phi_1\n",
    "    &= \\frac{\n",
    "         \\sum_{i = 1}^I x_i w_i - x_i \\mu_w\n",
    "       }{\n",
    "         \\sum_{i = 1}^I x_i^2 - x_i \\mu_x\n",
    "       }\\\\\n",
    "    &= \\frac{\n",
    "         I^{-1} \\sum_{i = 1}^I x_i w_i - x_i \\mu_w\n",
    "       }{\n",
    "         I^{-1} \\sum_{i = 1}^I x_i^2 - x_i \\mu_x\n",
    "       }\\\\\n",
    "    &= \\left(\n",
    "         \\frac{\\sum_{i = 1}^I x_i w_i}{I} - \\mu_x \\mu_w\n",
    "       \\right)\n",
    "       \\left(\n",
    "         \\frac{\\sum_{i = 1}^I x_i^2}{I} - \\mu_x^2\n",
    "       \\right)^{-1}\\\\\n",
    "    &= \\frac{\n",
    "         \\sum_{i = 1}^I x_i w_i - \\mu_x \\mu_w\n",
    "       }{\n",
    "         \\sum_{i = 1}^I x_i^2 - \\mu_x^2\n",
    "       }.\n",
    "\n",
    "(c)\n",
    "---\n",
    "\n",
    "Substituting in the MLE of :math:`\\phi_0` and :math:`\\phi_1` into\n",
    ":math:`\\sigma^2` gives\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\sigma^2 &= I^{-1} \\sum_{i = 1}^I (w_i - \\phi_0 - \\phi_1 x_i)^2\\\\\n",
    "    &= I^{-1} \\sum_{i = 1}^I\n",
    "         w_i^2 - 2 w_i (\\phi_0 + \\phi_1 x_i) + (\\phi_0 + \\phi_1 x_i)^2\\\\\n",
    "    &= I^{-1} \\sum_{i = 1}^I\n",
    "         w_i^2 - 2 w_i \\phi_0 - 2 w_i x_i \\phi_1 +\n",
    "         \\phi_0^2 + 2 \\phi_0 \\phi_1 x_i + \\phi_1^2 x_i^2\\\\\n",
    "    &= I^{-1} \\sum_{i = 1}^I\n",
    "         w_i^2 +\n",
    "         \\phi_1^2 \\left(\n",
    "           x_i^2 - 2 x_i \\mu_x + \\mu_x^2\n",
    "         \\right) +\n",
    "         \\phi_1 \\left(\n",
    "           2 \\mu_w x_i - 2 \\mu_x \\mu_w - 2 x_i w_i + 2 \\mu_x w_i\n",
    "         \\right) +\n",
    "         \\left(\n",
    "           \\mu_w^2 - 2 \\mu_w w_i\n",
    "         \\right)\\\\\n",
    "    &= I^{-1} \\sum_{i = 1}^I\n",
    "         w_i^2 +\n",
    "         \\phi_1^2 \\left( x_i^2 - \\mu_x^2 \\right) -\n",
    "         2 \\phi_1 \\left( x_i w_i - \\mu_x \\mu_w \\right) -\n",
    "         \\mu_w^2\\\\\n",
    "    &= \\frac{\\sum_{i = 1}^I w_i^2 - \\mu_w^2}{I} +\n",
    "       \\frac{\\phi_1^2}{I} \\left( \\sum_{i = 1}^I x_i^2 - \\mu_x^2 \\right) -\n",
    "       \\frac{2 \\phi_1}{I} \\left( \\sum_{i = 1}^I x_i w_i - \\mu_x \\mu_w \\right)\\\\\n",
    "    &= \\frac{\\sum_{i = 1}^I w_i^2 - \\mu_w^2}{I} -\n",
    "       \\left(\n",
    "         \\frac{\\sum_{i = 1}^I x_i w_i}{I} - \\mu_x \\mu_w\n",
    "       \\right)^2\n",
    "       \\left(\n",
    "         \\frac{\\sum_{i = 1}^I x_i^2}{I} - \\mu_x^2\n",
    "       \\right)^{-1}\\\\\n",
    "    &= \\frac{\\sum_{i = 1}^I (w_i - \\mu_w)^2}{I} -\n",
    "       \\left(\n",
    "         I^{-1} \\sum_{i = 1}^I (x_i - \\mu_x) (w_i - \\mu_w)\n",
    "       \\right)^2\n",
    "       \\left(\n",
    "         I^{-1} \\sum_{i = 1}^I (x_i - \\mu_x)^2\n",
    "       \\right)^{-1}\\\\\n",
    "    &= \\sigma_{ww}^2 - \\sigma_{xw}^2 \\sigma_{xx}^{-1} \\sigma_{xw}^2\n",
    "       & \\quad & \\text{definition of covariance with uniform probability.}"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 6.7\n",
    "============\n",
    "\n",
    "(1)\n",
    "---\n",
    "\n",
    "Assuming :math:`Pr(w)` has a uniform prior simplifies (6.11) to\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(w \\mid x) =\n",
    "   \\frac{\n",
    "     Pr(x \\mid w) Pr(w)\n",
    "   }{\n",
    "     \\sum_{w \\in \\{ 0, 1 \\}} Pr(x \\mid w) Pr(w)\n",
    "   } =\n",
    "   \\frac{\n",
    "     Pr(x \\mid w)\n",
    "   }{\n",
    "     Pr(x \\mid w = 1) + Pr(x \\mid w = 0)\n",
    "   }.\n",
    "\n",
    "The points on the decision boundary obey\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(w = 0 \\mid x) &= Pr(w = 1 \\mid x)\\\\\n",
    "   Pr(x \\mid w = 0) &= Pr(x \\mid w = 1)\\\\\n",
    "   \\NormDist_x\\left[ \\mu_0, \\sigma_0^2 \\right]\n",
    "    &= \\NormDist_x\\left[ \\mu_1, \\sigma_1^2 \\right]\\\\\n",
    "   -\\frac{1}{2} \\log 2 \\pi - \\frac{1}{2} \\log \\sigma_0^2 -\n",
    "       \\frac{(x - \\mu_0)^2}{2 \\sigma_0^2}\n",
    "    &= -\\frac{1}{2} \\log 2 \\pi - \\frac{1}{2} \\log \\sigma_1^2 -\n",
    "       \\frac{(x - \\mu_1)^2}{2 \\sigma_1^2}\n",
    "       & \\quad & \\text{rearrange into a quadratic equation using log normals}\\\\\n",
    "   \\log \\sigma_0^2 + \\frac{(x - \\mu_0)^2}{\\sigma_0^2}\n",
    "    &= \\log \\sigma_1^2 + \\frac{(x - \\mu_1)^2}{\\sigma_1^2}\\\\\n",
    "   a x^2 + bx + c &= 0\n",
    "\n",
    "where\n",
    "\n",
    ".. math::\n",
    "\n",
    "   a &= \\sigma_0^{-2} - \\sigma_1^{-2}\\\\\\\\\n",
    "   b &= 2 \\left( \\mu_1 \\sigma_1^{-2} - \\mu_0 \\sigma_0^{-2} \\right)\\\\\\\\\n",
    "   c &= \\mu_0^2 \\sigma_0^{-2} - \\mu_1^2 \\sigma_1^{-2} +\n",
    "        \\log \\sigma_0^2 - \\log \\sigma_1^2.\n",
    "\n",
    "(2)\n",
    "---\n",
    "\n",
    "The shape of the decision boundary for the logistic regression model have\n",
    "the form of\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(w = 0 \\mid x) &= Pr(w = 1 \\mid x)\\\\\n",
    "   \\BernDist_{w = 0}\\left[ \\sigmoid\\left( \\phi_0 + \\phi_1 x \\right) \\right]\n",
    "    &= \\BernDist_{w = 1}\\left[\n",
    "         \\sigmoid\\left( \\phi_0 + \\phi_1 x \\right)\n",
    "       \\right]\\\\\n",
    "   1 - \\sigmoid\\left(\\phi_0 + \\phi_1 x \\right)\n",
    "    &= \\sigmoid\\left( \\phi_0 + \\phi_1 x \\right)\\\\\n",
    "   1 + \\exp \\left( -\\phi_0 - \\phi_1 x \\right) &= 2\\\\\n",
    "   \\phi_1 x + \\phi_0 &= 0."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 6.8\n",
    "============\n",
    "\n",
    "The following uses the results of\n",
    ":ref:`Exercise 6.7 <prince2012computer-ex-5.7>`.\n",
    "\n",
    "(1)\n",
    "---\n",
    "\n",
    "Suppose :math:`Pr(w)` is uniform and\n",
    ":math:`\\mu_0 = 0`, :math:`\\sigma_0^2 = \\sigma^2`,\n",
    ":math:`\\mu_1 = 0`, :math:`\\sigma_1^2 = 1.5 \\sigma^2`,\n",
    "\n",
    ".. math::\n",
    "\n",
    "   a &= \\sigma_0^{-2} - \\sigma_1^{-2} = \\frac{1}{3 \\sigma^2}\\\\\\\\\n",
    "   b &= 2 \\left( \\mu_1 \\sigma_1^{-2} - \\mu_0 \\sigma_0^{-2} \\right) = 0\\\\\\\\\n",
    "   c &= \\mu_0^2 \\sigma_0^{-2} - \\mu_1^2 \\sigma_1^{-2} +\n",
    "        \\log \\sigma_0^2 - \\log \\sigma_1^2\n",
    "      = -\\log 1.5\n",
    "\n",
    "(2)\n",
    "---\n",
    "\n",
    "In order for the discriminative classifier to have the same decision boundary,\n",
    "a quadratic function\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\phi_2 x^2 + \\phi_1 x + \\phi_0\n",
    "\n",
    "needs to be used where\n",
    "\n",
    ".. math::\n",
    "\n",
    "   \\begin{gather*}\n",
    "     \\phi_2 = a\\\\\n",
    "     \\phi_1 = 0\\\\\n",
    "     \\phi_0 = c.\n",
    "   \\end{gather*}"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 6.9\n",
    "============\n",
    "\n",
    "Let :math:`G(\\mathbf{x})` and :math:`D(\\mathbf{x})` denote the number of\n",
    "parameters a model has as a function of the dimensionality of\n",
    ":math:`\\mathbf{x} \\in \\mathbb{R}^n`.\n",
    "\n",
    "Generative Model\n",
    "----------------\n",
    "\n",
    "Suppose the prior is uniform and the model parameters are\n",
    ":math:`\\boldsymbol{\\theta} = \\left\\{\n",
    "\\boldsymbol{\\mu}_0, \\boldsymbol{\\mu}_1,\n",
    "\\boldsymbol{\\Sigma}_0, \\boldsymbol{\\Sigma}_1 \\right\\}`.\n",
    "\n",
    "Recall that a symmetric matrix (e.g. covariance matrix) has\n",
    ":math:`\\frac{n (n + 1)}{2}` scalars.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   G(\\mathbf{x}) = 2n + 2 \\frac{n (n + 1)}{2} = n^2 + 3n.\n",
    "\n",
    "Discriminative Model\n",
    "--------------------\n",
    "\n",
    "The model parameters consists of\n",
    ":math:`\\boldsymbol{\\theta} = \\{ \\phi_0, \\boldsymbol{\\phi} \\}`.\n",
    "\n",
    ".. math::\n",
    "\n",
    "   D(\\mathbf{x}) = n + 1."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Exercise 6.10\n",
    "=============\n",
    "\n",
    "The goal now is to infer a multi-valued label :math:`w_n \\in \\{0, 1, 2\\}` that\n",
    "indicates whether the :math:`n\\text{th}` pixel in the image is part of a known\n",
    "background :math:`(w = 0)`, foreground :math:`(w = 1)`, or shadow\n",
    ":math:`(w = 2)`.\n",
    "\n",
    "The prior :math:`Pr(w)` would be a categorical distribution.\n",
    "\n",
    "Since the background is known and there is lighting in the scene, shadows\n",
    "will make the pixels \"dimmer\".  In addition to Equations (6.16) and (6.17), the\n",
    "class conditional distribution of the shadows could be modeled as\n",
    "\n",
    ".. math::\n",
    "\n",
    "   Pr(\\mathbf{x}_n \\mid w = 2) =\n",
    "   \\NormDist_{\\mathbf{x}_n}\\left[\n",
    "     \\boldsymbol{\\mu}_{n2}, \\boldsymbol{\\Sigma}_{n2}\n",
    "   \\right]."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. rubric:: References\n",
    "\n",
    ".. bibliography:: chapter-06.bib"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}