{ "cells": [ { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "****************\n", "Models for Grids\n", "****************\n", "\n", "Markov Random Fields\n", "====================\n", "\n", "A Markov random field is essentially an undirected graphical model that\n", "describes the joint probability of the variables (12.2)\n", "\n", ".. math::\n", "\n", " Pr(\\mathbf{w}) =\n", " Z^{-1} \\prod_{j = 1}^J \\phi_j\\left[ \\mathbf{w}_{\\mathcal{C}_j} \\right].\n", "\n", "It is typically solved with graph cut via max flow.\n", "\n", "Conditional Random Fields\n", "=========================\n", "\n", "Conditional random fields is a discriminative model for the joint probability\n", "distribution :math:`Pr(\\mathbf{w}, \\mathbf{x})` of undirected graphical models\n", "(12.23). It is also solved via graph cut.\n", "\n", "Higher Order Models\n", "===================\n", "\n", "Instead of treating each variable :math:`w_n` as a pixel, they can instead\n", "represent a patch. This is less likely to be submodular or even obey the\n", "triangle inequality. The number of :math:`K` patches may be very inefficient,\n", "and optimizing the resulting cost function is hard.\n", "\n", "Directed Models for Grids\n", "=========================\n", "\n", "Instead of using the original undirected graphical model, use an approximate\n", "directed graphical model. Unfortunately, there is no known polynomial algorithm\n", "to optimize resulting cost function that contains three-wise terms. Approximate\n", "inference methods would sample from the model." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12.1\n", "=============\n", "\n", "Let :math:`\\phi[a, b] = \\exp\\left[ -(a - b)^2 \\right]` and\n", ":math:`\\mathbf{x} = \\begin{bmatrix} x_1 & x_2 & x_3 & x_4 \\end{bmatrix}^\\top`.\n", "\n", ".. math::\n", "\n", " Pr(x_1, x_2, x_3, x_4)\n", " &= \\frac{1}{Z}\n", " \\phi[x_1, x_2] \\phi[x_2, x_3] \\phi[x_3, x_4] \\phi[x_4, x_1]\n", " & \\quad & \\text{(10.9)}\\\\\n", " &= \\frac{1}{Z} \\exp\\left[\n", " -(x_1 - x_2)^2 - (x_2 - x_3)^2 - (x_3 - x_4)^2 - (x_4 - x_1)^2\n", " \\right]\\\\\n", " &= \\frac{1}{Z} \\exp\\left[\n", " (x_1^2 - 2 x_1 x_2 + x_2^2) + (x_2^2 - 2 x_2 x_3 + x_3^2) +\n", " (x_3^2 - 2 x_3 x_4 + x_4^2) + (x_4^2 - 2 x_4 x_1 + x_1^2)\n", " \\right]^{-1}\\\\\n", " &= \\frac{1}{Z} \\exp\\left[\n", " 2 x_1^2 - 2 x_1 x_2 - 2 x_1 x_4 + 2 x_2^2 - 2 x_2 x_3 + 2 x_3^2 -\n", " 2 x_3 x_4 + 2 x_4^2\n", " \\right]\\\\\n", " &= \\frac{1}{Z} \\exp\\left[\n", " \\mathbf{x}^\\top \\boldsymbol{\\Sigma}^{-1} \\mathbf{x}\n", " \\right]\n", "\n", "where the inverse of the covariance matrix is\n", "\n", ".. math::\n", "\n", " \\boldsymbol{\\Sigma}^{-1} =\n", " \\begin{bmatrix}\n", " 2 & -1 & 0 & -1\\\\\n", " -1 & 2 & -1 & 0\\\\\n", " 0 & -1 & 2 & -1\\\\\n", " -1 & 0 & -1 & 2\n", " \\end{bmatrix}." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12.2\n", "=============\n", "\n", "Figure 12.9 shows the eight possible cuts for this model.\n", "\n", "(i)\n", "---\n", "\n", ".. math::\n", "\n", " \\text{(a)} &\\colon U_a(0) + U_b(0) + U_c(0) = 12.7\\\\\n", " \\text{(b)} &\\colon U_a(0) + U_b(0) + U_c(1) + P_{bc}(0,1) = 12.9\\\\\n", " \\text{(c)}\n", " &\\colon U_a(0) + U_b(1) + U_c(0) + P_{ab}(0,1) + P_{bc}(1,1) = 23.1\\\\\n", " \\text{(d)} &\\colon U_a(0) + U_b(1) + U_c(1) + P_{ab}(0,1) = 15.2\\\\\n", " \\text{(e)} &\\colon U_a(1) + U_b(0) + U_c(0) + P_{ab}(1,0) = 21\\\\\n", " \\text{(f)}\n", " &\\colon U_a(1) + U_b(0) + U_c(1) + P_{ab}(1,0) + P_{bc}(0,1) = 21.2\\\\\n", " \\text{(g)} &\\colon U_a(1) + U_b(1) + U_c(0) + P_{bc}(1,0) = 26.2\\\\\n", " \\text{(h)} &\\colon U_a(1) + U_b(1) + U_c(1) = 18.3\n", "\n", "The minimum cut is (a).\n", "\n", "(ii)\n", "----\n", "\n", "Running Ford-Fulkerson on the graph yields the same value for maximum flow i.e.\n", "maximum flow of a graph equals capacity of minimum cut." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12.3\n", "=============\n", "\n", "The following are based on Figure 12.10.\n", "\n", ":math:`(a = 0, b = 0)`\n", "----------------------\n", "\n", ".. math::\n", "\n", " U_a(0) + U_b(0) + P_{ab}(0,0)\n", "\n", ":math:`(a = 0, b = 1)`\n", "----------------------\n", "\n", ".. math::\n", "\n", " U_a(0) + U_b(1) + P_{ab}(0,1)\n", "\n", ":math:`(a = 1, b = 0)`\n", "----------------------\n", "\n", ".. math::\n", "\n", " \\left[ U_a(1) + P_{ab}(1,1) \\right] + \\left[ U_b(0) + P_{ab}(0,0) \\right] +\n", " \\left[ P_{ab}(1,0) - P_{ab}(1,1) - P_{ab}(0,0) \\right] =\n", " U_a(1) + U_b(0) + P_{ab}(1,0)\n", "\n", ":math:`(a = 1, b = 1)`\n", "----------------------\n", "\n", ".. math::\n", "\n", " U_a(1) + U_b(1) + P_{ab}(1,1)" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12.4\n", "=============\n", "\n", "The following are based on Figure 12.11c.\n", "\n", ":math:`(a = 0, b = 0)`\n", "----------------------\n", "\n", ".. math::\n", "\n", " U_a(0) + U_b(0) + \\beta\n", "\n", ":math:`(a = 0, b = 1)`\n", "----------------------\n", "\n", ".. math::\n", "\n", " U_a(0) + U_b(1) + P_{ab}(0,1) + \\beta\n", "\n", ":math:`(a = 1, b = 0)`\n", "----------------------\n", "\n", ".. math::\n", "\n", " \\left[ U_a(1) + \\beta \\right] + \\left[ U_b(0) + \\beta \\right] +\n", " \\left[ P_{ab}(1,0) - \\beta \\right] =\n", " U_a(1) + U_b(0) + P_{ab}(1,0) + \\beta\n", "\n", ":math:`(a = 1, b = 1)`\n", "----------------------\n", "\n", ".. math::\n", "\n", " U_a(1) + U_b(1) + \\beta" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _prince2012computer-ex-12.5:\n", "\n", "Exercise 12.5\n", "=============\n", "\n", "Let :math:`C[w_a, w_b]` denote the minimum cut capacity where\n", ":math:`w_a, w_b \\in \\{ 1, 2, 3, 4, 5 \\}`.\n", "\n", ".. math::\n", "\n", " C[1,1] &= U_a(1) + U_b(1)\\\\\n", " C[1,2] &= U_a(1) + U_b(2) + P_{ab}(1,2)\\\\\n", " C[1,3] &= U_a(1) + U_b(3) + P_{ab}(1,2) + P_{ab}(2,3) + \\infty = \\infty\\\\\n", " C[1,4] &= U_a(1) + U_b(4) + P_{ab}(1,2) + P_{ab}(2,3) + P_{ab}(3,4) +\n", " (2) \\infty = \\infty\\\\\n", " C[1,4] &= U_a(1) + U_b(4) + P_{ab}(1,2) + P_{ab}(2,3) + P_{ab}(3,4) +\n", " P_{ab}(4,5) + (3) \\infty = \\infty\\\\\\\\\\\\\n", " C[2,1] &= U_a(2) + U_b(1) + P_{ab}(2,1)\\\\\n", " C[2,2] &= U_a(2) + U_b(2)\\\\\n", " C[2,3] &= U_a(2) + U_b(3) + P_{ab}(2,3)\\\\\n", " C[2,4] &= U_a(2) + U_b(4) + P_{ab}(2,3) + P_{ab}(3,4) + \\infty = \\infty\\\\\n", " C[2,5] &= U_a(2) + U_b(5) + P_{ab}(2,3) + P_{ab}(3,4) + P_{ab}(4,5) + \\infty\n", " = \\infty\\\\\\\\\\\\\n", " C[3,1] &= U_a(3) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + \\infty = \\infty\\\\\n", " C[3,2] &= U_a(3) + U_b(2) + P_{ab}(3,2)\\\\\n", " C[3,3] &= U_a(3) + U_b(3)\\\\\n", " C[3,4] &= U_a(3) + U_b(4) + P_{ab}(3,4)\\\\\n", " C[3,5] &= U_a(3) + U_b(5) + P_{ab}(3,4) + P_{ab}(4,5) + \\infty = \\infty\\\\\\\\\\\\\n", " C[4,1] &= U_a(4) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + P_{ab}(4,3) +\n", " (2) \\infty = \\infty\\\\\n", " C[4,2] &= U_a(4) + U_b(2) + P_{ab}(3,2) + P_{ab}(4,3) + \\infty = \\infty\\\\\n", " C[4,3] &= U_a(4) + U_b(3) + P_{ab}(4,3)\\\\\n", " C[4,4] &= U_a(4) + U_b(4)\\\\\n", " C[4,5] &= U_a(4) + U_b(5) + P_{ab}(4,5)\\\\\\\\\\\\\n", " C[5,1] &= U_a(5) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + P_{ab}(4,3) +\n", " P_{ab}(5,4) + (3) \\infty = \\infty\\\\\n", " C[5,2] &= U_a(5) + U_b(2) + P_{ab}(3,2) + P_{ab}(4,3) + P_{ab}(5,4) +\n", " (2) \\infty = \\infty\\\\\n", " C[5,3] &= U_a(5) + U_b(3) + P_{ab}(4,3) + P_{ab}(5,4) + \\infty = \\infty\\\\\n", " C[5,4] &= U_a(5) + U_b(4) + P_{ab}(5,4)\\\\\n", " C[5,5] &= U_a(5) + U_b(5)\n", "\n", ":math:`C[w_a, w_b]` is finite when\n", ":math:`\\left\\vert w_a - w_b \\right\\vert \\leq 1`." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12.6\n", "=============\n", "\n", "Adding an additional pixel :math:`c` will only increase the capacity of the\n", "minimum cut. This means the cuts from\n", ":ref:`Exercise 12.5 ` that have infinite cost will\n", "still be categorized as infinite with the additional capacity. Hence the\n", ":math:`5^3` possible minimum cuts are reduced to\n", ":math:`5 (2 + 3 + 3 + 3 + 2) = 65`.\n", "\n", "The possible combinations are further reduced from the observation that the\n", "edges with infinite capacity are fully connected to the previous layer i.e.\n", ":math:`C[w_a, w_b, w_c]` is finite when\n", "\n", ".. math::\n", "\n", " \\left\\vert w_a - w_b \\right\\vert \\leq 1\n", " \\qquad \\land \\qquad\n", " \\left\\vert w_b - w_c \\right\\vert \\leq 1\n", " \\qquad \\land \\qquad\n", " \\left\\vert w_c - w_a \\right\\vert \\leq 1." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12.7\n", "=============\n", "\n", "Example 12.14 consists of two pixels with pairwise connections that can take\n", "on :math:`K = 4` different labels.\n", "\n", ".. math::\n", "\n", " U_a(4) + U_b(4) + \\sum_{i = 1}^4 \\sum_{j = 5}5 C_{ab}(i,j)\n", " &= U_a(4) + U_b(4) +\n", " \\left( P_{ab}(1,4) + P_{ab}(0,5) - P_{ab}(1,5) - P_{ab}(0,4) \\right) +\\\\\n", " &\\quad\n", " \\left( P_{ab}(2,4) + P_{ab}(1,5) - P_{ab}(2,5) - P_{ab}(1,4) \\right) +\\\\\n", " &\\quad\n", " \\left( P_{ab}(3,4) + P_{ab}(2,5) - P_{ab}(3,5) - P_{ab}(2,4) \\right) +\\\\\n", " &\\quad\n", " \\left( P_{ab}(4,4) + P_{ab}(3,5) - P_{ab}(4,5) - P_{ab}(3,4) \\right)\n", " & \\quad & \\text{(12.17)}\\\\\n", " &= U_a(4) + U_b(4) +\n", " \\left( P_{ab}(1,4) \\right) + \\left( P_{ab}(2,4) - P_{ab}(1,4) \\right) +\\\\\n", " &\\quad\n", " \\left( P_{ab}(3,4) - P_{ab}(2,4) \\right) +\n", " \\left( P_{ab}(4,4) - P_{ab}(3,4) \\right)\n", " & \\quad & \\text{(12.18)}\\\\\n", " &= U_a(4) + U_b(4) + P_{ab}(4,4)\n", " & \\quad & \\text{(12.19)}" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12.8\n", "=============\n", "\n", "The Potts model is\n", ":math:`P_{mn}(w_m, w_n) = \\kappa (1 - \\boldsymbol{\\delta}(w_m, w_n))`. The\n", "generalized multi-label submodularity condition for the Potts model is\n", "\n", ".. math::\n", "\n", " & P_{ab}(\\beta, \\gamma) + P_{ab}(\\alpha, \\delta) -\n", " P_{ab}(\\beta, \\delta) - P_{ab}(\\alpha, \\gamma)\n", " & \\quad & \\text{(12.21)}\\\\\n", " &= \\kappa (1 - \\boldsymbol{\\delta}(\\beta, \\gamma)) +\n", " \\kappa (1 - \\boldsymbol{\\delta}(\\alpha, \\delta)) -\n", " \\kappa (1 - \\boldsymbol{\\delta}(\\beta, \\delta)) -\n", " \\kappa (1 - \\boldsymbol{\\delta}(\\alpha, \\gamma))\\\\\n", " &= \\kappa \\left(\n", " -\\boldsymbol{\\delta}(\\beta, \\gamma) -\n", " \\boldsymbol{\\delta}(\\alpha, \\delta) +\n", " \\boldsymbol{\\delta}(\\beta, \\delta) +\n", " \\boldsymbol{\\delta}(\\alpha, \\gamma)\n", " \\right)\n", "\n", "where :math:`\\beta > \\alpha` and :math:`\\delta > \\gamma`. This condition is no\n", "longer non-negative when :math:`\\delta = \\alpha`, which results in\n", ":math:`\\beta > \\delta > \\gamma`." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Exercise 12.9\n", "=============\n", "\n", "The following explains the relationship of the nodes in Figure 12.3's depiction\n", "of the :math:`\\alpha-\\beta` graphical model. Note that this graphical model\n", "formulation does not invoke the triangle inequality.\n", "\n", "Each vertex is connected to the source (representing the label :math:`\\alpha`)\n", "and sink (representing the label :math:`\\beta`).\n", "\n", "Source and Sink Edge Weights\n", "----------------------------\n", "\n", "The edges connected to the source and sink have a unary cost of\n", ":math:`U_{\\cdot}(\\alpha)` and :math:`U_{\\cdot}(\\beta)` respectively where\n", ":math:`\\cdot \\in \\{ a, b, c, d \\}`.\n", "\n", "Pixels :math:`i` and :math:`j` have labels :math:`\\{ (\\gamma, \\gamma) \\}`\n", "-------------------------------------------------------------------------\n", "\n", "Since :math:`e` has a label of :math:`\\gamma` and the current swap is for\n", ":math:`(\\alpha, \\beta)`, there are no edges connecting to pixel node :math:`e`.\n", "Consequently, nodes which are neither :math:`\\alpha` nor :math:`\\beta` do not\n", "need to be included in the graph.\n", "\n", "Pixels :math:`i` and :math:`j` have labels :math:`\\{ (\\alpha, \\alpha), (\\alpha, \\beta), (\\beta, \\beta) \\}`\n", "----------------------------------------------------------------------------------------------------------\n", "\n", "The final configuration consists of :math:`\\alpha - \\alpha` with zero pairwise\n", "cost, :math:`\\alpha - \\beta` with pairwise cost :math:`P_{ij}(\\alpha, \\beta)`,\n", ":math:`\\beta - \\alpha` with pairwise cost :math:`P_{ij}(\\beta, \\alpha)`, and\n", ":math:`\\beta - \\beta` with zero pairwise cost.\n", "\n", "Pixels :math:`i` and :math:`j` have labels :math:`\\{ (\\alpha, \\gamma), (\\beta, \\gamma) \\}`\n", "------------------------------------------------------------------------------------------\n", "\n", "The final configuration consists of :math:`\\alpha - \\gamma` with pairwise cost\n", ":math:`P_{ij}(\\alpha, \\gamma)`, and :math:`\\beta - \\gamma` with pairwise cost\n", ":math:`P_{ij}(\\beta, \\gamma)`. If these costs were added as edges to pixel node\n", ":math:`j`, it will never be part of the minimum cut. The only way to include\n", "this cost is to add them to the source and sink edges of pixel node :math:`i`\n", "respectively." ] } ], "metadata": { "anaconda-cloud": {}, "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }