{ "cells": [ { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "***********************\n", "The Normal Distribution\n", "***********************\n", "\n", ":cite:`schon2011manipulating` provides a very nice exposition on this topic.\n", "However, its step (29) is not obvious. :cite:`wangre161mnd` is another\n", "interpretation that is possibly clearer and more concise." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _prince2012computer-ex-5.1:\n", "\n", "Exercise 5.1\n", "============\n", "\n", "The following facts are useful in this proof:\n", "\n", ".. math::\n", "\n", " \\newcommand{\\E}[1]{\\operatorname{E}\\left[#1\\right]}\n", " \\newcommand{\\Cov}[1]{\\operatorname{cov}\\left(#1\\right)}\n", " \\begin{gather*}\n", " \\boldsymbol{\\mu} = \\E{\\mathbf{x}}\\\\\\\\\n", " \\E{\\mathbf{A} \\mathbf{x} + \\mathbf{b}} =\n", " \\mathbf{A} \\E{\\mathbf{x}} + \\mathbf{b}\\\\\\\\\n", " \\boldsymbol{\\Sigma} = \\Cov{\\mathbf{x}} =\n", " \\E{\n", " \\left( \\mathbf{x} - \\E{\\mathbf{x}} \\right)\n", " \\left( \\mathbf{x} - \\E{\\mathbf{x}} \\right)^\\top\n", " } =\n", " \\E{\n", " \\mathbf{x} \\mathbf{x}^\\top - \\mathbf{x} \\E{\\mathbf{x}}^\\top -\n", " \\E{\\mathbf{x}} \\mathbf{x}^\\top +\n", " \\E{\\mathbf{x}} \\E{\\mathbf{x}}^\\top\n", " }\\\\\\\\\n", " \\Cov{\\mathbf{A} \\mathbf{x} + \\mathbf{b}} =\n", " \\mathbf{A} \\Cov{\\mathbf{x}} \\mathbf{A}^\\top\n", " \\end{gather*}\n", "\n", "Let :math:`\\mathbf{y} = \\mathbf{A} \\mathbf{x} + \\mathbf{b}` where\n", ":math:`\\mathbf{A}` is nonsingular so that\n", ":math:`\\mathbf{x} = \\mathbf{A}^{-1} (\\mathbf{y} - \\mathbf{b})`. The mean and\n", "covariance are derived as\n", "\n", ".. 
math::\n", "\n", " \\boldsymbol{\\mu} &= \\E{\\mathbf{x}}\\\\\n", " &= \\E{\\mathbf{A}^{-1} (\\mathbf{y} - \\mathbf{b})}\\\\\n", " &= \\mathbf{A}^{-1} \\E{\\mathbf{y}} - \\mathbf{A}^{-1} \\mathbf{b}\\\\\n", " \\mathbf{A} \\boldsymbol{\\mu} + \\mathbf{b} &= \\E{\\mathbf{y}}\\\\\n", " &= \\tilde{\\boldsymbol{\\mu}}\n", "\n", "and\n", "\n", ".. math::\n", "\n", " \\boldsymbol{\\Sigma} &= \\Cov{\\mathbf{x}}\\\\\n", " &= \\Cov{\\mathbf{A}^{-1} (\\mathbf{y} - \\mathbf{b})}\\\\\n", " &= \\mathbf{A}^{-1} \\Cov{\\mathbf{y} - \\mathbf{b}} \\mathbf{A}^{-\\top}\\\\\n", " &= \\mathbf{A}^{-1} \\Cov{\\mathbf{y}} \\mathbf{A}^{-\\top}\\\\\n", " \\mathbf{A} \\boldsymbol{\\Sigma} \\mathbf{A}^\\top &= \\Cov{\\mathbf{y}}\\\\\n", " &= \\tilde{\\boldsymbol{\\Sigma}}.\n", "\n", "Thus\n", "\n", ".. math::\n", "\n", " \\DeclareMathOperator{\\NormDist}{Norm}\n", " Pr(\\mathbf{y}) =\n", " \\NormDist_{\\mathbf{y}}\\left[\n", " \\mathbf{A} \\boldsymbol{\\mu} + \\mathbf{b},\n", " \\mathbf{A} \\boldsymbol{\\Sigma} \\mathbf{A}^\\top\n", " \\right]." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 5.2\n", "============\n", "\n", "See :ref:`Exercise 5.1 <prince2012computer-ex-5.1>` for the derivations of\n", "the following terms.\n", "\n", "A solution to\n", "\n", ".. math::\n", "\n", " \\begin{aligned}\n", " \\mathbf{I} &= \\tilde{\\boldsymbol{\\Sigma}}\\\\\n", " &= \\mathbf{A} \\boldsymbol{\\Sigma} \\mathbf{A}^{\\top}\\\\\n", " \\mathbf{A}^{-1} \\mathbf{A}^{-\\top} &= \\boldsymbol{\\Sigma}\n", " \\end{aligned}\n", " \\quad \\text{and} \\quad\n", " \\begin{aligned}\n", " \\boldsymbol{0} &= \\tilde{\\boldsymbol{\\mu}}\\\\\n", " &= \\mathbf{A} \\boldsymbol{\\mu} + \\mathbf{b}\\\\\n", " \\mathbf{b} &= -\\mathbf{A} \\boldsymbol{\\mu}\n", " \\end{aligned}\n", "\n", "is to set :math:`\\mathbf{A} = \\boldsymbol{\\Sigma}^{-1 / 2}` resulting in\n", ":math:`\\mathbf{b} = -\\boldsymbol{\\Sigma}^{-1 / 2} \\boldsymbol{\\mu}`."
] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _prince2012computer-ex-5.3:\n", "\n", "Exercise 5.3\n", "============\n", "\n", "Recall that\n", "\n", ".. math::\n", "\n", " Pr(\\mathbf{x} = \\begin{bmatrix} \\mathbf{x}_1\\\\ \\mathbf{x}_2 \\end{bmatrix})\n", " &= Pr(\\mathbf{x}_1, \\mathbf{x}_2)\\\\\n", " &= \\NormDist_{\\mathbf{x}}\\left[\n", " \\boldsymbol{\\mu}\n", " = \\begin{bmatrix}\n", " \\boldsymbol{\\mu}_1\\\\ \\boldsymbol{\\mu}_2\n", " \\end{bmatrix},\n", " \\boldsymbol{\\Sigma}\n", " = \\begin{bmatrix}\n", " \\boldsymbol{\\Sigma}_{11} & \\boldsymbol{\\Sigma}_{21}^\\top\\\\\n", " \\boldsymbol{\\Sigma}_{21} & \\boldsymbol{\\Sigma}_{22}\n", " \\end{bmatrix}\n", " \\right]\\\\\n", " &= \\frac{1}{\n", " (2 \\pi)^{D / 2} \\left\\vert \\boldsymbol{\\Sigma} \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[\n", " -0.5 (\\mathbf{x} - \\boldsymbol{\\mu})^\\top\n", " \\boldsymbol{\\Sigma}^{-1}\n", " (\\mathbf{x} - \\boldsymbol{\\mu})\n", " \\right]\n", "\n", "where\n", ":math:`\\boldsymbol{\\Sigma}_{11} \\in \\mathbb{R}^{p \\times p}`,\n", ":math:`\\boldsymbol{\\Sigma}_{21} \\in \\mathbb{R}^{q \\times p}`,\n", ":math:`\\boldsymbol{\\Sigma}_{22} \\in \\mathbb{R}^{q \\times q}`, and\n", ":math:`p + q = D`.\n", "\n", "The Schur complement :math:`\\mathbf{S}` of :math:`\\boldsymbol{\\Sigma}_{11}` in\n", ":math:`\\boldsymbol{\\Sigma}` is defined as\n", "\n", ".. math::\n", "\n", " \\mathbf{S} =\n", " \\boldsymbol{\\Sigma}_{22} -\n", " \\boldsymbol{\\Sigma}_{21}\n", " \\boldsymbol{\\Sigma}_{11}^{-1} \\boldsymbol{\\Sigma}_{21}^\\top.\n", "\n", "It is symmetric positive definite because :math:`\\boldsymbol{\\Sigma}` is\n", "positive definite according to (5.7). This quantity is useful for deriving a\n", "closed-form expression for the inverse of the full covariance matrix:\n", "\n", ".. 
math::\n", "\n", " \\boldsymbol{\\Sigma}^{-1}\n", " &= \\left(\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p & \\boldsymbol{0}\\\\\n", " \\boldsymbol{\\Sigma}_{21} \\boldsymbol{\\Sigma}_{11}^{-1} &\n", " \\mathbf{I}_q\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\boldsymbol{\\Sigma}_{11} & \\boldsymbol{0}\\\\\n", " \\boldsymbol{0} & \\mathbf{S}\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p &\n", " \\boldsymbol{\\Sigma}_{11}^{-1} \\boldsymbol{\\Sigma}_{21}^\\top\\\\\n", " \\boldsymbol{0} & \\mathbf{I}_q\n", " \\end{bmatrix}\n", " \\right)^{-1}\\\\\n", " &= \\begin{bmatrix}\n", " \\mathbf{I}_p &\n", " -\\boldsymbol{\\Sigma}_{11}^{-1} \\boldsymbol{\\Sigma}_{21}^\\top\\\\\n", " \\boldsymbol{0} & \\mathbf{I}_q\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\boldsymbol{\\Sigma}_{11}^{-1} & \\boldsymbol{0}\\\\\n", " \\boldsymbol{0} & \\mathbf{S}^{-1}\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p & \\boldsymbol{0}\\\\\n", " -\\boldsymbol{\\Sigma}_{21} \\boldsymbol{\\Sigma}_{11}^{-1} &\n", " \\mathbf{I}_q\n", " \\end{bmatrix}\\\\\n", " &= \\begin{bmatrix}\n", " \\boldsymbol{\\Sigma}_{11}^{-1} +\n", " \\boldsymbol{\\Sigma}_{11}^{-1}\n", " \\boldsymbol{\\Sigma}_{21}^\\top\n", " \\mathbf{S}^{-1}\n", " \\boldsymbol{\\Sigma}_{21}\n", " \\boldsymbol{\\Sigma}_{11}^{-1} &\n", " -\\boldsymbol{\\Sigma}_{11}^{-1}\n", " \\boldsymbol{\\Sigma}_{21}^\\top \\mathbf{S}^{-1}\\\\\n", " -\\mathbf{S}^{-1} \\boldsymbol{\\Sigma}_{21}\n", " \\boldsymbol{\\Sigma}_{11}^{-1} &\n", " \\mathbf{S}^{-1}\n", " \\end{bmatrix}.\n", "\n", "The same block factorization reduces the determinant of\n", ":math:`\\boldsymbol{\\Sigma}` to\n", "\n", ".. 
math::\n", "\n", " \\left\\vert \\boldsymbol{\\Sigma} \\right\\vert\n", " &= \\left\\vert\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p & \\boldsymbol{0}\\\\\n", " \\boldsymbol{\\Sigma}_{21} \\boldsymbol{\\Sigma}_{11}^{-1} &\n", " \\mathbf{I}_q\\\\\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\boldsymbol{\\Sigma}_{11} & \\boldsymbol{0}\\\\\n", " \\boldsymbol{0} & \\mathbf{S}\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p &\n", " \\boldsymbol{\\Sigma}_{11}^{-1} \\boldsymbol{\\Sigma}_{21}^\\top\\\\\n", " \\boldsymbol{0} & \\mathbf{I}_q\n", " \\end{bmatrix}\n", " \\right\\vert\\\\\n", " &= \\left\\vert\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p & \\boldsymbol{0}\\\\\n", " \\boldsymbol{\\Sigma}_{21} \\boldsymbol{\\Sigma}_{11}^{-1} &\n", " \\mathbf{I}_q\\\\\n", " \\end{bmatrix}\n", " \\right\\vert\n", " \\left\\vert\n", " \\begin{bmatrix}\n", " \\boldsymbol{\\Sigma}_{11} & \\boldsymbol{0}\\\\\n", " \\boldsymbol{0} & \\mathbf{S}\n", " \\end{bmatrix}\n", " \\right\\vert\n", " \\left\\vert\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p &\n", " \\boldsymbol{\\Sigma}_{11}^{-1} \\boldsymbol{\\Sigma}_{21}^\\top\\\\\n", " \\boldsymbol{0} & \\mathbf{I}_q\n", " \\end{bmatrix}\n", " \\right\\vert\n", " & \\quad & \\det(AB) = \\det(A) \\det(B)\\\\\n", " &= \\left\\vert\n", " \\begin{bmatrix}\n", " \\boldsymbol{\\Sigma}_{11} & \\boldsymbol{0}\\\\\n", " \\boldsymbol{0} & \\mathbf{S}\n", " \\end{bmatrix}\n", " \\right\\vert\n", " & \\quad & \\det\\left( \\mathbf{T}_n \\right) = \\prod_{k = 1}^n a_{kk}\\\\\n", " &= \\left\\vert \\boldsymbol{\\Sigma}_{11} \\right\\vert\n", " \\left\\vert \\mathbf{S} \\right\\vert\n", " & \\quad & \\text{block matrix determinant property.}\n", "\n", "Marginalizing the joint density over :math:`\\mathbf{x}_2` then gives\n", "\n", ".. 
math::\n", "\n", " & Pr(\\mathbf{x}_1)\\\\\n", " &= \\int Pr(\\mathbf{x}_1, \\mathbf{x}_2) d\\mathbf{x}_2\\\\\n", " &= \\int\n", " \\frac{1}{\n", " (2 \\pi)^{D / 2} \\left\\vert \\boldsymbol{\\Sigma} \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[ -0.5\n", " \\begin{bmatrix}\n", " \\mathbf{x}_1 - \\boldsymbol{\\mu}_1\\\\\n", " \\mathbf{x}_2 - \\boldsymbol{\\mu}_2\n", " \\end{bmatrix}^\\top\n", " \\begin{bmatrix}\n", " \\boldsymbol{\\Lambda}_{11} & \\boldsymbol{\\Lambda}_{21}^\\top\\\\\n", " \\boldsymbol{\\Lambda}_{21} & \\boldsymbol{\\Lambda}_{22}\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\mathbf{x}_1 - \\boldsymbol{\\mu}_1\\\\\n", " \\mathbf{x}_2 - \\boldsymbol{\\mu}_2\n", " \\end{bmatrix}\n", " \\right] d\\mathbf{x}_2\n", " & \\quad & \\Lambda = \\Sigma^{-1} =\n", " \\begin{bmatrix}\n", " \\boldsymbol{\\Lambda}_{11} & \\boldsymbol{\\Lambda}_{21}^\\top\\\\\n", " \\boldsymbol{\\Lambda}_{21} & \\boldsymbol{\\Lambda}_{22}\n", " \\end{bmatrix}\\\\\n", " &= \\int\n", " \\frac{1}{\n", " (2 \\pi)^{(p + q) / 2}\n", " \\left\\vert \\boldsymbol{\\Sigma}_{11} \\right\\vert^{1 / 2}\n", " \\left\\vert \\mathbf{S} \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[ -0.5\n", " \\left(\n", " (\\mathbf{x}_1 - \\boldsymbol{\\mu}_1)^\\top \\boldsymbol{\\Lambda}_{11}\n", " (\\mathbf{x}_1 - \\boldsymbol{\\mu}_1) +\n", " 2 (\\mathbf{x}_1 - \\boldsymbol{\\mu}_1)^\\top\n", " \\boldsymbol{\\Lambda}_{21}^\\top\n", " (\\mathbf{x}_2 - \\boldsymbol{\\mu}_2) +\n", " (\\mathbf{x}_2 - \\boldsymbol{\\mu}_2)^\\top\n", " \\boldsymbol{\\Lambda}_{22} (\\mathbf{x}_2 - \\boldsymbol{\\mu}_2)\n", " \\right)\n", " \\right] d\\mathbf{x}_2\\\\\n", " &= \\int\n", " \\NormDist_{\\mathbf{x}_1}\\left[\n", " \\boldsymbol{\\mu}_1, \\boldsymbol{\\Sigma}_{11}\n", " \\right]\n", " \\frac{1}{\n", " (2 \\pi)^{q / 2}\n", " \\left\\vert \\mathbf{S} \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[ -0.5\n", " \\left[\n", " (\\mathbf{x}_2 - \\boldsymbol{\\mu}_2) -\n", " \\boldsymbol{\\Sigma}_{21} 
\\boldsymbol{\\Sigma}_{11}^{-1}\n", " (\\mathbf{x}_1 - \\boldsymbol{\\mu}_1)\n", " \\right]^\\top\n", " \\mathbf{S}^{-1}\n", " \\left[\n", " (\\mathbf{x}_2 - \\boldsymbol{\\mu}_2) -\n", " \\boldsymbol{\\Sigma}_{21} \\boldsymbol{\\Sigma}_{11}^{-1}\n", " (\\mathbf{x}_1 - \\boldsymbol{\\mu}_1)\n", " \\right]\n", " \\right] d\\mathbf{x}_2\\\\\n", " &= \\NormDist_{\\mathbf{x}_1}\\left[\n", " \\boldsymbol{\\mu}_1, \\boldsymbol{\\Sigma}_{11}\n", " \\right]\n", " \\int\n", " \\NormDist_{\\mathbf{x}_2}\\left[\n", " \\boldsymbol{\\mu}_2 +\n", " \\boldsymbol{\\Sigma}_{21} \\boldsymbol{\\Sigma}_{11}^{-1}\n", " (\\mathbf{x}_1 - \\boldsymbol{\\mu}_1),\n", " \\mathbf{S}\n", " \\right] d\\mathbf{x}_2\\\\\n", " &= \\NormDist_{\\mathbf{x}_1}\\left[\n", " \\boldsymbol{\\mu}_1, \\boldsymbol{\\Sigma}_{11}\n", " \\right]" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 5.4\n", "============\n", "\n", "The claimed expression is the inverse if and only if it satisfies the\n", "definition of a matrix inverse:\n", "\n", ".. math::\n", "\n", " M M^{-1} = M^{-1} M = I.\n", "\n", "A simple way to show this is to decompose the block matrix :math:`M` using the\n", "Schur complement of :math:`D` in :math:`M`. The upper triangular, block\n", "diagonal, and lower triangular factors then cancel out." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _prince2012computer-ex-5.5:\n", "\n", "Exercise 5.5\n", "============\n", "\n", "Another expression for :math:`\\boldsymbol{\\Sigma}^{-1}` in\n", ":ref:`Exercise 5.3 <prince2012computer-ex-5.3>` is\n", "\n", ".. 
math::\n", "\n", " \\boldsymbol{\\Sigma}^{-1}\n", " &= \\left(\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p &\n", " \\boldsymbol{\\Sigma}_{21}^\\top \\boldsymbol{\\Sigma}_{22}^{-1}\\\\\n", " \\boldsymbol{0} & \\mathbf{I}_q\\\\\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\mathbf{S} & \\boldsymbol{0}\\\\\n", " \\boldsymbol{0} & \\boldsymbol{\\Sigma}_{22}\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p & \\boldsymbol{0}\\\\\n", " \\boldsymbol{\\Sigma}_{22}^{-1} \\boldsymbol{\\Sigma}_{21} & \\mathbf{I}_q\n", " \\end{bmatrix}\n", " \\right)^{-1}\\\\\n", " &= \\begin{bmatrix}\n", " \\mathbf{I}_p & \\boldsymbol{0}\\\\\n", " -\\boldsymbol{\\Sigma}_{22}^{-1} \\boldsymbol{\\Sigma}_{21} & \\mathbf{I}_q\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\mathbf{S}^{-1} & \\boldsymbol{0}\\\\\n", " \\boldsymbol{0} & \\boldsymbol{\\Sigma}_{22}^{-1}\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\mathbf{I}_p &\n", " -\\boldsymbol{\\Sigma}_{21}^\\top \\boldsymbol{\\Sigma}_{22}^{-1}\\\\\n", " \\boldsymbol{0} & \\mathbf{I}_q\\\\\n", " \\end{bmatrix}\\\\\n", " &= \\begin{bmatrix}\n", " \\mathbf{S}^{-1} &\n", " -\\mathbf{S}^{-1}\n", " \\boldsymbol{\\Sigma}_{21}^\\top \\boldsymbol{\\Sigma}_{22}^{-1}\\\\\n", " -\\boldsymbol{\\Sigma}_{22}^{-1}\n", " \\boldsymbol{\\Sigma}_{21} \\mathbf{S}^{-1} &\n", " \\boldsymbol{\\Sigma}_{22}^{-1} +\n", " \\boldsymbol{\\Sigma}_{22}^{-1}\n", " \\boldsymbol{\\Sigma}_{21}\n", " \\mathbf{S}^{-1}\n", " \\boldsymbol{\\Sigma}_{21}^\\top \\boldsymbol{\\Sigma}_{22}^{-1}\n", " \\end{bmatrix}\n", "\n", "where\n", "\n", ".. math::\n", "\n", " \\mathbf{S} =\n", " \\boldsymbol{\\Sigma}_{11} -\n", " \\boldsymbol{\\Sigma}_{21}^\\top\n", " \\boldsymbol{\\Sigma}_{22}^{-1}\n", " \\boldsymbol{\\Sigma}_{21}\n", "\n", "is the Schur complement of :math:`\\boldsymbol{\\Sigma}_{22}` in\n", ":math:`\\boldsymbol{\\Sigma}`. The determinant of :math:`\\boldsymbol{\\Sigma}` is\n", "simplified to\n", "\n", ".. 
math::\n", "\n", " \\left\\vert \\boldsymbol{\\Sigma} \\right\\vert =\n", " \\left\\vert \\mathbf{S} \\right\\vert\n", " \\left\\vert \\boldsymbol{\\Sigma}_{22} \\right\\vert.\n", "\n", "Factoring the joint density in the same way as before gives\n", "\n", ".. math::\n", "\n", " & Pr(\\mathbf{x}_1, \\mathbf{x}_2)\\\\\n", " &= \\frac{1}{\n", " (2 \\pi)^{(p + q) / 2}\n", " \\left\\vert \\mathbf{S} \\right\\vert^{1 / 2}\n", " \\left\\vert \\boldsymbol{\\Sigma}_{22} \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[\n", " \\left(\n", " \\left( \\mathbf{x}_1 - \\boldsymbol{\\mu}_1 \\right)^\\top\n", " \\boldsymbol{\\Lambda}_{11}\n", " \\left( \\mathbf{x}_1 - \\boldsymbol{\\mu}_1 \\right) +\n", " 2 \\left( \\mathbf{x}_1 - \\boldsymbol{\\mu}_1 \\right)^\\top\n", " \\boldsymbol{\\Lambda}_{21}^\\top\n", " \\left( \\mathbf{x}_2 - \\boldsymbol{\\mu}_2 \\right) +\n", " \\left( \\mathbf{x}_2 - \\boldsymbol{\\mu}_2 \\right)^\\top\n", " \\boldsymbol{\\Lambda}_{22}\n", " \\left( \\mathbf{x}_2 - \\boldsymbol{\\mu}_2 \\right)\n", " \\right)\n", " \\right]^{-0.5}\\\\\n", " &= \\NormDist_{\\mathbf{x}_2}\\left[\n", " \\boldsymbol{\\mu}_2, \\boldsymbol{\\Sigma}_{22}\n", " \\right]\n", " \\frac{1}{(2 \\pi)^{p / 2} \\left\\vert \\mathbf{S} \\right\\vert^{1 / 2}}\n", " \\exp\\left[\n", " \\left(\n", " \\left( \\mathbf{x}_1 - \\boldsymbol{\\mu}_1 \\right) -\n", " \\boldsymbol{\\Sigma}_{21}^\\top \\boldsymbol{\\Sigma}_{22}^{-1}\n", " \\left( \\mathbf{x}_2 - \\boldsymbol{\\mu}_2 \\right)\n", " \\right)^\\top\n", " \\mathbf{S}^{-1}\n", " \\left(\n", " \\left( \\mathbf{x}_1 - \\boldsymbol{\\mu}_1 \\right) -\n", " \\boldsymbol{\\Sigma}_{21}^\\top \\boldsymbol{\\Sigma}_{22}^{-1}\n", " \\left( \\mathbf{x}_2 - \\boldsymbol{\\mu}_2 \\right)\n", " \\right)\n", " \\right]^{-0.5}\\\\\n", " &= \\NormDist_{\\mathbf{x}_2}\\left[\n", " \\boldsymbol{\\mu}_2, \\boldsymbol{\\Sigma}_{22}\n", " \\right]\n", " \\NormDist_{\\mathbf{x}_1}\\left[\n", " \\boldsymbol{\\mu}_1 +\n", " \\boldsymbol{\\Sigma}_{21}^\\top 
\\boldsymbol{\\Sigma}_{22}^{-1}\n", " (\\mathbf{x}_2 - \\boldsymbol{\\mu}_2),\n", " \\mathbf{S}\n", " \\right].\n", "\n", "Rearranging the equations using conditional probability (2.4) results in\n", "\n", ".. math::\n", "\n", " Pr(\\mathbf{x}_1 \\mid \\mathbf{x}_2) =\n", " \\frac{Pr(\\mathbf{x}_1, \\mathbf{x}_2)}{Pr(\\mathbf{x}_2)} =\n", " \\NormDist_{\\mathbf{x}_1}\\left[\n", " \\boldsymbol{\\mu}_1 +\n", " \\boldsymbol{\\Sigma}_{21}^\\top \\boldsymbol{\\Sigma}_{22}^{-1}\n", " (\\mathbf{x}_2 - \\boldsymbol{\\mu}_2),\n", " \\mathbf{S}\n", " \\right]." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 5.6\n", "============\n", "\n", "When the covariance is diagonal (i.e. the individual variables are independent),\n", "the off-diagonal blocks (e.g. :math:`\\boldsymbol{\\Sigma}_{21}^\\top`) in\n", ":ref:`Exercise 5.5 <prince2012computer-ex-5.5>` will be zero. Thus\n", "\n", ".. math::\n", "\n", " Pr(\\mathbf{x}_1 \\mid \\mathbf{x}_2)\n", " &= \\NormDist_{\\mathbf{x}_1}\\left[\n", " \\boldsymbol{\\mu}_1 +\n", " \\boldsymbol{\\Sigma}_{21}^\\top \\boldsymbol{\\Sigma}_{22}^{-1}\n", " (\\mathbf{x}_2 - \\boldsymbol{\\mu}_2),\n", " \\boldsymbol{\\Sigma}_{11} -\n", " \\boldsymbol{\\Sigma}_{21}^\\top\n", " \\boldsymbol{\\Sigma}_{22}^{-1} \\boldsymbol{\\Sigma}_{21}\n", " \\right]\\\\\n", " &= \\NormDist_{\\mathbf{x}_1}\\left[\n", " \\boldsymbol{\\mu}_1, \\boldsymbol{\\Sigma}_{11}\n", " \\right]\\\\\n", " &= Pr(\\mathbf{x}_1)." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _prince2012computer-ex-5.7:\n", "\n", "Exercise 5.7\n", "============\n", "\n", "Let :math:`x, a, b \\in \\mathbb{R}^D` and\n", ":math:`A, B \\in \\mathbb{R}^{D \\times D}`.\n", "\n", ".. 
math::\n", "\n", " & \\NormDist_{x}[a, A] \\NormDist_{x}[b, B]\\\\\n", " &= \\frac{1}{\\left\\vert 2 \\pi A \\right\\vert^{1 / 2}}\n", " \\exp\\left[ (x - a)^\\top A^{-1} (x - a) \\right]^{-0.5}\n", " \\frac{1}{\\left\\vert 2 \\pi B \\right\\vert^{1 / 2}}\n", " \\exp\\left[ (x - b)^\\top B^{-1} (x - b) \\right]^{-0.5}\\\\\n", " &= \\frac{1}{(2 \\pi)^{D} \\left\\vert AB \\right\\vert^{1 / 2}}\n", " \\exp\\left[\n", " x^\\top A^{-1} x - 2 x^\\top A^{-1} a + a^\\top A^{-1} a +\n", " x^\\top B^{-1} x - 2 x^\\top B^{-1} b + b^\\top B^{-1} b\n", " \\right]^{-0.5}\\\\\n", " &= \\frac{1}{(2 \\pi)^{D} \\left\\vert AB \\right\\vert^{1 / 2}}\n", " \\exp\\left[\n", " x^\\top (A^{-1} + B^{-1}) x - 2 x^\\top (A^{-1} a + B^{-1} b) +\n", " a^\\top A^{-1} a + b^\\top B^{-1} b\n", " \\right]^{-0.5}\n", " & \\quad & \\text{rearrange terms to expose pattern}\\\\\n", " &= \\frac{1}{(2 \\pi)^{D} \\left\\vert AB \\right\\vert^{1 / 2}}\n", " \\exp\\left[\n", " (x - \\boldsymbol{\\mu})^\\top (A^{-1} + B^{-1}) (x - \\boldsymbol{\\mu}) -\n", " \\boldsymbol{\\mu}^\\top (A^{-1} + B^{-1}) \\boldsymbol{\\mu} +\n", " a^\\top A^{-1} a + b^\\top B^{-1} b\n", " \\right]^{-0.5}\n", " & \\quad & \\text{completing the square}\\\\\n", " &= \\frac{\n", " \\left\\vert \\boldsymbol{\\Sigma} \\right\\vert^{1 / 2}\n", " }{\n", " (2 \\pi)^{D / 2} \\left\\vert AB \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[\n", " a^\\top A^{-1} a + b^\\top B^{-1} b -\n", " \\boldsymbol{\\mu}^\\top \\boldsymbol{\\Sigma}^{-1} \\boldsymbol{\\mu}\n", " \\right]^{-0.5}\n", " \\NormDist_{x}[\\boldsymbol{\\mu}, \\boldsymbol{\\Sigma}]\\\\\n", " &\\propto \\NormDist_{x}[\\boldsymbol{\\mu}, \\boldsymbol{\\Sigma}]\n", "\n", "where :math:`\\boldsymbol{\\mu} = \\boldsymbol{\\Sigma} (A^{-1} a + B^{-1} b)` and\n", ":math:`\\boldsymbol{\\Sigma} = (A^{-1} + B^{-1})^{-1}`." 
] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 5.8\n", "============\n", "\n", "The results of :ref:`Exercise 5.7 <prince2012computer-ex-5.7>` illustrate that\n", "the new mean and variance are respectively\n", "\n", ".. math::\n", "\n", " \\mu =\n", " \\frac{\n", " \\sigma_1^{-2} \\mu_1 + \\sigma_2^{-2} \\mu_2\n", " }{\n", " \\sigma_1^{-2} + \\sigma_2^{-2}\n", " } =\n", " a \\mu_1 + b \\mu_2\n", " \\quad \\text{and} \\quad\n", " \\sigma^2 = \\frac{1}{\\sigma_1^{-2} + \\sigma_2^{-2}}\n", "\n", "where :math:`a, b > 0` and :math:`a + b = 1`.\n", "\n", "Assuming :math:`\\sigma_1^2, \\sigma_2^2 > 0`, the following (applicable to both)\n", "shows that the new variance is smaller than either of them:\n", "\n", ".. math::\n", "\n", " \\sigma_1^{-2} + \\sigma_2^{-2} &> \\sigma_1^{-2}\\\\\n", " \\sigma_1^2 &> (\\sigma_1^{-2} + \\sigma_2^{-2})^{-1}\\\\\n", " &= \\sigma^2.\n", "\n", "The variance proof is quite clever in the sense that you start by assuming\n", "what you want (:math:`\\sigma^2 < \\sigma_1^2`) and work backwards to reach some\n", "kind of obviously true proposition\n", "(:math:`\\sigma_1^{-2} + \\sigma_2^{-2} > \\sigma_1^{-2}`) under certain\n", "assumptions. Then present the proof backwards!" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _prince2012computer-ex-5.9:\n", "\n", "Exercise 5.9\n", "============\n", "\n", ":ref:`Exercise 5.7 <prince2012computer-ex-5.7>` states that\n", "\n", ".. math::\n", "\n", " \\kappa =\n", " \\frac{\n", " \\left\\vert \\boldsymbol{\\Sigma} \\right\\vert^{1 / 2}\n", " }{\n", " (2 \\pi)^{D / 2} \\left\\vert AB \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[\n", " \\left(\n", " a^\\top A^{-1} a + b^\\top B^{-1} b -\n", " \\boldsymbol{\\mu}^\\top \\boldsymbol{\\Sigma}^{-1} \\boldsymbol{\\mu}\n", " \\right)\n", " \\right]^{-0.5}.\n", "\n", "Notice that\n", "\n", ".. 
math::\n", "\n", " \\frac{\n", " \\left\\vert \\boldsymbol{\\Sigma} \\right\\vert^{1 / 2}\n", " }{\n", " \\left\\vert AB \\right\\vert^{1 / 2}\n", " }\n", " &= \\left(\n", " \\left\\vert AB \\right\\vert\n", " \\left\\vert A^{-1} + B^{-1} \\right\\vert\n", " \\right)^{-1 / 2}\\\\\n", " &= \\left(\n", " \\left\\vert A \\right\\vert\n", " \\left\\vert A^{-1} + B^{-1} \\right\\vert\n", " \\left\\vert B \\right\\vert\n", " \\right)^{-1 / 2}\\\\\n", " &= \\left(\n", " \\left\\vert A (A^{-1} + B^{-1}) B \\right\\vert\n", " \\right)^{-1 / 2}\\\\\n", " &= \\left(\n", " \\left\\vert A + B \\right\\vert\n", " \\right)^{-1 / 2}\n", "\n", "and\n", "\n", ".. math::\n", "\n", " & \\exp\\left[\n", " a^\\top A^{-1} a + b^\\top B^{-1} b -\n", " \\boldsymbol{\\mu}^\\top \\boldsymbol{\\Sigma}^{-1} \\boldsymbol{\\mu}\n", " \\right]^{-0.5}\\\\\n", " &= \\exp\\left[\n", " a^\\top A^{-1} a + b^\\top B^{-1} b -\n", " a^\\top A^{-1} \\boldsymbol{\\Sigma} A^{-1} a -\n", " b^\\top B^{-1} \\boldsymbol{\\Sigma} B^{-1} b -\n", " 2 a^\\top A^{-1} \\boldsymbol{\\Sigma} B^{-1} b\n", " \\right]^{-0.5}\\\\\n", " &= \\exp\\left[\n", " a^\\top A^{-1} a + b^\\top B^{-1} b -\n", " a^\\top (A \\boldsymbol{\\Sigma}^{-1} A)^{-1} a -\n", " b^\\top (B \\boldsymbol{\\Sigma}^{-1} B)^{-1} b -\n", " 2 a^\\top (B \\boldsymbol{\\Sigma}^{-1} A)^{-1} b\n", " \\right]^{-0.5}\\\\\n", " &= \\exp\\left[\n", " a^\\top A^{-1} a + b^\\top B^{-1} b -\n", " a^\\top (A + B)^{-1} B A^{-1} a -\n", " b^\\top (A + B)^{-1} A B^{-1} b -\n", " 2 a^\\top (A + B)^{-1} b\n", " \\right]^{-0.5}\\\\\n", " &= \\exp\\left[\n", " a^\\top \\left( A^{-1} - (A + B)^{-1} B A^{-1} \\right) a +\n", " b^\\top \\left( B^{-1} - (A + B)^{-1} A B^{-1} \\right) b -\n", " 2 a^\\top (A + B)^{-1} b\n", " \\right]^{-0.5}\\\\\n", " &= \\exp\\left[\n", " a^\\top (A + B)^{-1} a - 2 a^\\top (A + B)^{-1} b + b^\\top (A + B)^{-1} b\n", " \\right]^{-0.5}\n", " & \\quad & \\text{(a)}\\\\\n", " &= \\exp\\left[\n", " (a - b)^\\top (A + B)^{-1} (a - b)\n", " 
\\right]^{-0.5}.\n", "\n", "Thus :math:`\\kappa = \\NormDist_{a}[b, A + B]`.\n", "\n", "(a)\n", "---\n", "\n", "One approach to this solution is to assume the desired identities\n", "\n", ".. math::\n", "\n", " A^{-1} - (A + B)^{-1} B A^{-1} &= (A + B)^{-1}\\\\\n", " B^{-1} - (A + B)^{-1} A B^{-1} &= (A + B)^{-1}\n", "\n", "hold and try to solve for :math:`(A + B)^{-1}`. This leads to the following\n", "identities:\n", "\n", ".. math::\n", "\n", " (A + B) A^{-1} &= I + B A^{-1}\\\\\n", " A^{-1} &= (A + B)^{-1} (I + B A^{-1})\\\\\\\\\n", " (A + B) B^{-1} &= A B^{-1} + I\\\\\n", " B^{-1} &= (A + B)^{-1} (A B^{-1} + I).\n", "\n", "The purpose of the assumption is to derive some kind of obviously true\n", "proposition and then work backwards. The solution in the book made use of the\n", "clever observation that\n", "\n", ".. math::\n", "\n", " a^\\top A^{-1} a = a^\\top (A + B)^{-1} (A + B) A^{-1} a =\n", " a^\\top (A + B)^{-1} a + a^\\top (A + B)^{-1} B A^{-1} a\n", "\n", "and\n", "\n", ".. math::\n", "\n", " b^\\top B^{-1} b = b^\\top (A + B)^{-1} (A + B) B^{-1} b =\n", " b^\\top (A + B)^{-1} b + b^\\top (A + B)^{-1} A B^{-1} b." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _prince2012computer-ex-5.10:\n", "\n", "Exercise 5.10\n", "=============\n", "\n", "Suppose :math:`x \\in \\mathbb{R}^n`, :math:`A \\in \\mathbb{R}^{n \\times m}`,\n", ":math:`y \\in \\mathbb{R}^m`, :math:`b \\in \\mathbb{R}^n`, and\n", ":math:`\\Sigma \\in \\mathbb{R}^{n \\times n}`.\n", "\n", ".. 
math::\n", "\n", " & \\NormDist_x[Ay + b, \\Sigma]\\\\\n", " &= \\frac{1}{\\left\\vert 2 \\pi \\Sigma \\right\\vert^{1 / 2}}\n", " \\exp\\left[\n", " (x - Ay - b)^\\top \\Sigma^{-1} (x - Ay - b)\n", " \\right]^{-0.5}\\\\\n", " &= \\frac{1}{(2 \\pi)^{n / 2} \\left\\vert \\Sigma \\right\\vert^{1 / 2}}\n", " \\exp\\left[\n", " x^\\top \\Sigma^{-1} x - 2 x^\\top \\Sigma^{-1} A y -\n", " 2 x^\\top \\Sigma^{-1} b + y^\\top A^\\top \\Sigma^{-1} Ay +\n", " 2 y^\\top A^\\top \\Sigma^{-1} b + b^\\top \\Sigma^{-1} b\n", " \\right]^{-0.5}\\\\\n", " &= \\frac{1}{(2 \\pi)^{n / 2} \\left\\vert \\Sigma \\right\\vert^{1 / 2}}\n", " \\exp\\left[\n", " x^\\top \\Sigma^{-1} x - 2 x^\\top \\Sigma^{-1} b +\n", " b^\\top \\Sigma^{-1} b\n", " \\right]^{-0.5}\n", " \\exp\\left[\n", " y^\\top A^\\top \\Sigma^{-1} A y -\n", " 2 y^\\top A^\\top \\Sigma^{-1} (x - b)\n", " \\right]^{-0.5}\\\\\n", " &= \\kappa_1\n", " \\exp\\left[\n", " y^\\top A^\\top \\Sigma^{-1} A y -\n", " 2 y^\\top A^\\top \\Sigma^{-1} (x - b)\n", " \\right]^{-0.5}\\\\\n", " &= \\kappa_1\n", " \\exp\\left[\n", " \\left(\n", " y - \\Sigma' A^\\top \\Sigma^{-1} (x - b)\n", " \\right)^\\top\n", " \\Sigma'^{-1}\n", " \\left(\n", " y - \\Sigma' A^\\top \\Sigma^{-1} (x - b)\n", " \\right) -\n", " (x - b)^\\top \\Sigma^{-1} A \\Sigma' A^\\top \\Sigma^{-1} (x - b)\n", " \\right]^{-0.5}\\\\\n", " &= \\kappa_1\n", " \\exp\\left[ (A' x + b')^\\top \\Sigma'^{-1} (A' x + b') \\right]^{0.5}\n", " \\exp\\left[\n", " \\left(\n", " y - (A' x + b')\n", " \\right)^\\top\n", " \\Sigma'^{-1}\n", " \\left(\n", " y - (A' x + b')\n", " \\right)\n", " \\right]^{-0.5}\\\\\n", " &= \\kappa_2 \\left\\vert 2 \\pi \\Sigma' \\right\\vert^{1 / 2}\n", " \\NormDist_y[A' x + b', \\Sigma']\\\\\n", " &= \\kappa \\NormDist_y[A' x + b', \\Sigma']\n", "\n", "where\n", "\n", ".. 
math::\n", "\n", " \\Sigma' &= (A^\\top \\Sigma^{-1} A)^{-1} \\in \\mathbb{R}^{m \\times m}\\\\\\\\\n", " A' &= \\Sigma' A^\\top \\Sigma^{-1} \\in \\mathbb{R}^{m \\times n}\\\\\\\\\n", " b' &= -\\Sigma' A^\\top \\Sigma^{-1} b \\in \\mathbb{R}^{m \\times 1}\n", "\n", ".. math::\n", "\n", " \\kappa\n", " &= (2 \\pi)^{(m - n) / 2}\n", " \\frac{\n", " \\left\\vert \\Sigma' \\right\\vert^{1 / 2}\n", " }{\n", " \\left\\vert \\Sigma \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[\n", " x^\\top \\Sigma^{-1} x - 2 x^\\top \\Sigma^{-1} b + b^\\top \\Sigma^{-1} b\n", " \\right]^{-0.5}\n", " \\exp\\left[\n", " (A' x + b')^\\top \\Sigma'^{-1} (A' x + b')\n", " \\right]^{0.5}\\\\\n", " &= (2 \\pi)^{(m - n) / 2}\n", " \\frac{\n", " \\left\\vert \\Sigma' \\right\\vert^{1 / 2}\n", " }{\n", " \\left\\vert \\Sigma \\right\\vert^{1 / 2}\n", " }\n", " \\exp\\left[\n", " (x - b)^\\top\n", " \\left(\n", " \\Sigma^{-1} - \\Sigma^{-1} A \\Sigma' A^\\top \\Sigma^{-1}\n", " \\right) (x - b)\n", " \\right]^{-0.5}." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. rubric:: References\n", "\n", ".. bibliography:: chapter-05.bib" ] } ], "metadata": { "anaconda-cloud": {}, "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }