Models for Grids

Markov Random Fields

A Markov random field is an undirected graphical model that describes the joint probability of the variables as a product over cliques (12.2):

\[Pr(\mathbf{w}) = Z^{-1} \prod_{j = 1}^J \phi_j\left[ \mathbf{w}_{\mathcal{C}_j} \right].\]

MAP inference in this model is typically solved with graph cuts via the max-flow algorithm.
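To make the factorization concrete, here is a minimal sketch (not from the text) that evaluates the joint probability of a hypothetical four-pixel binary MRF whose cliques are the edges of a cycle; the potential \(\phi\) and the labeling are illustrative assumptions:

```python
import itertools
import math

# Hypothetical four-pixel cycle: the cliques C_j are the neighboring pairs.
cliques = [(0, 1), (1, 2), (2, 3), (3, 0)]

def phi(a, b):
    # Illustrative smoothing potential that favors agreeing neighbors.
    return math.exp(-(a - b) ** 2)

def unnormalized(w):
    # Product of the clique potentials, i.e. Z * Pr(w).
    p = 1.0
    for i, j in cliques:
        p *= phi(w[i], w[j])
    return p

# The partition function Z sums the products over all 2^4 binary labelings.
Z = sum(unnormalized(w) for w in itertools.product([0, 1], repeat=4))
print(unnormalized((0, 0, 1, 1)) / Z)  # Pr(w) for one labeling
```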

Conditional Random Fields

A conditional random field is a discriminative undirected graphical model of the posterior probability \(Pr(\mathbf{w} \mid \mathbf{x})\) (12.23). MAP inference is likewise solved via graph cuts.

Higher Order Models

Instead of treating each variable \(w_n\) as a single pixel, each variable can represent an image patch. The resulting cost function is less likely to be submodular or even to obey the triangle inequality, the number of patch labels \(K\) may be very large, and optimizing such a cost function is hard.

Directed Models for Grids

Instead of using the original undirected graphical model, use an approximate directed graphical model. Unfortunately, there is no known polynomial-time algorithm for optimizing the resulting cost function, which contains three-wise terms. Approximate inference methods sample from the model instead.

Exercise 12.1

Let \(\phi[a, b] = \exp\left[ -(a - b)^2 \right]\) and \(\mathbf{x} = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}^\top\).

\[\begin{split}Pr(x_1, x_2, x_3, x_4) &= \frac{1}{Z} \phi[x_1, x_2] \phi[x_2, x_3] \phi[x_3, x_4] \phi[x_4, x_1] & \quad & \text{(10.9)}\\ &= \frac{1}{Z} \exp\left[ -(x_1 - x_2)^2 - (x_2 - x_3)^2 - (x_3 - x_4)^2 - (x_4 - x_1)^2 \right]\\ &= \frac{1}{Z} \exp\left[ -(x_1^2 - 2 x_1 x_2 + x_2^2) - (x_2^2 - 2 x_2 x_3 + x_3^2) - (x_3^2 - 2 x_3 x_4 + x_4^2) - (x_4^2 - 2 x_4 x_1 + x_1^2) \right]\\ &= \frac{1}{Z} \exp\left[ -\left( 2 x_1^2 - 2 x_1 x_2 - 2 x_1 x_4 + 2 x_2^2 - 2 x_2 x_3 + 2 x_3^2 - 2 x_3 x_4 + 2 x_4^2 \right) \right]\\ &= \frac{1}{Z} \exp\left[ -\frac{1}{2} \mathbf{x}^\top \boldsymbol{\Sigma}^{-1} \mathbf{x} \right]\end{split}\]

where the inverse of the covariance matrix is

\[\begin{split}\boldsymbol{\Sigma}^{-1} = 2 \begin{bmatrix} 2 & -1 & 0 & -1\\ -1 & 2 & -1 & 0\\ 0 & -1 & 2 & -1\\ -1 & 0 & -1 & 2 \end{bmatrix} = \begin{bmatrix} 4 & -2 & 0 & -2\\ -2 & 4 & -2 & 0\\ 0 & -2 & 4 & -2\\ -2 & 0 & -2 & 4 \end{bmatrix}.\end{split}\]

The factor of two appears because the quadratic form in the exponent must be matched against the \(-\frac{1}{2} \mathbf{x}^\top \boldsymbol{\Sigma}^{-1} \mathbf{x}\) of the multivariate normal.
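As a sanity check, the identity between the expanded sum of squares and the quadratic form can be verified numerically; a small numpy sketch:

```python
import numpy as np

# Twice the matrix of the quadratic form, because the multivariate normal
# exponent carries a factor of -1/2.
prec = 2 * np.array([[ 2, -1,  0, -1],
                     [-1,  2, -1,  0],
                     [ 0, -1,  2, -1],
                     [-1,  0, -1,  2]])

rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.normal(size=4)
    x1, x2, x3, x4 = x
    s = (x1 - x2)**2 + (x2 - x3)**2 + (x3 - x4)**2 + (x4 - x1)**2
    assert np.isclose(s, 0.5 * x @ prec @ x)
```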

Exercise 12.2

Figure 12.9 shows the eight possible cuts for this model.

(i)

\[\begin{split}\text{(a)} &\colon U_a(0) + U_b(0) + U_c(0) = 12.7\\ \text{(b)} &\colon U_a(0) + U_b(0) + U_c(1) + P_{bc}(0,1) = 12.9\\ \text{(c)} &\colon U_a(0) + U_b(1) + U_c(0) + P_{ab}(0,1) + P_{bc}(1,0) = 23.1\\ \text{(d)} &\colon U_a(0) + U_b(1) + U_c(1) + P_{ab}(0,1) = 15.2\\ \text{(e)} &\colon U_a(1) + U_b(0) + U_c(0) + P_{ab}(1,0) = 21\\ \text{(f)} &\colon U_a(1) + U_b(0) + U_c(1) + P_{ab}(1,0) + P_{bc}(0,1) = 21.2\\ \text{(g)} &\colon U_a(1) + U_b(1) + U_c(0) + P_{bc}(1,0) = 26.2\\ \text{(h)} &\colon U_a(1) + U_b(1) + U_c(1) = 18.3\end{split}\]

The minimum cut is (a).

(ii)

Running Ford-Fulkerson on the graph yields the same value for the maximum flow; that is, the maximum flow of a graph equals the capacity of its minimum cut (the max-flow min-cut theorem).
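To illustrate, the sketch below builds the s-t network for a three-pixel binary chain with made-up unary and pairwise costs (Figure 12.9's actual capacities are not reproduced here), runs Edmonds-Karp, and compares the result against brute-force enumeration of the eight labelings:

```python
import itertools
from collections import deque

# Hypothetical costs for a three-pixel chain a - b - c; these are NOT the
# capacities from Figure 12.9, which are not reproduced here.
U = {'a': (1.0, 6.0), 'b': (4.0, 2.0), 'c': (3.0, 5.0)}  # (U(0), U(1))
P = {('a', 'b'): (2.0, 3.0), ('b', 'c'): (1.0, 4.0)}     # (P(1,0), P(0,1))

cap = {}
def add(u, v, c):
    # Directed edge u -> v with capacity c, plus a zero reverse entry so the
    # residual graph is easy to update.
    cap.setdefault(u, {})[v] = cap.get(u, {}).get(v, 0.0) + c
    cap.setdefault(v, {}).setdefault(u, 0.0)

for i, (u0, u1) in U.items():
    add('s', i, u0)  # severed when w_i = 0 (pixel on the sink side)
    add(i, 't', u1)  # severed when w_i = 1 (pixel on the source side)
for (i, j), (p10, p01) in P.items():
    add(i, j, p10)   # severed when w_i = 1, w_j = 0
    add(j, i, p01)   # severed when w_i = 0, w_j = 1

def max_flow(s, t):
    # Edmonds-Karp: augment along shortest residual paths until none remain.
    flow = 0.0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if v not in parent and c > 1e-9:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        flow += bottleneck

def cost(w):
    # Cost of a labeling, i.e. the capacity of the corresponding s-t cut.
    c = sum(U[i][w[i]] for i in U)
    for (i, j), (p10, p01) in P.items():
        if (w[i], w[j]) == (1, 0):
            c += p10
        elif (w[i], w[j]) == (0, 1):
            c += p01
    return c

best = min(cost(dict(zip('abc', w))) for w in itertools.product([0, 1], repeat=3))
print(best, max_flow('s', 't'))  # equal: min-cut capacity == max flow
```

The two printed numbers agree because every s-t cut of this network corresponds to exactly one labeling whose cost equals the cut capacity.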

Exercise 12.3

The following are based on Figure 12.10.

\((a = 0, b = 0)\)

\[U_a(0) + U_b(0) + P_{ab}(0,0)\]

\((a = 0, b = 1)\)

\[U_a(0) + U_b(1) + P_{ab}(0,1)\]

\((a = 1, b = 0)\)

\[\left[ U_a(1) + P_{ab}(1,1) \right] + \left[ U_b(0) + P_{ab}(0,0) \right] + \left[ P_{ab}(1,0) - P_{ab}(1,1) - P_{ab}(0,0) \right] = U_a(1) + U_b(0) + P_{ab}(1,0)\]

\((a = 1, b = 1)\)

\[U_a(1) + U_b(1) + P_{ab}(1,1)\]
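The cancellation in the \((a = 1, b = 0)\) case can also be checked symbolically; a minimal sympy sketch of the expression above:

```python
import sympy as sp

Ua1, Ub0, P00, P10, P11 = sp.symbols('Ua1 Ub0 P00 P10 P11')

# a's terminal edge absorbs P(1,1), b's absorbs P(0,0), and the inter-pixel
# edge carries the remainder; the extra terms cancel in the cut total.
cut_10 = (Ua1 + P11) + (Ub0 + P00) + (P10 - P11 - P00)
assert sp.simplify(cut_10 - (Ua1 + Ub0 + P10)) == 0
```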

Exercise 12.4

The following are based on Figure 12.11c.

\((a = 0, b = 0)\)

\[U_a(0) + U_b(0) + \beta\]

\((a = 0, b = 1)\)

\[U_a(0) + U_b(1) + P_{ab}(0,1) + \beta\]

\((a = 1, b = 0)\)

\[\left[ U_a(1) + \beta \right] + \left[ U_b(0) + \beta \right] + \left[ P_{ab}(1,0) - \beta \right] = U_a(1) + U_b(0) + P_{ab}(1,0) + \beta\]

\((a = 1, b = 1)\)

\[U_a(1) + U_b(1) + \beta\]
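The same symbolic check works here; since every one of the four cut costs picks up exactly one extra \(\beta\), the minimizing labeling is unchanged:

```python
import sympy as sp

Ua1, Ub0, P10, beta = sp.symbols('Ua1 Ub0 P10 beta')

# beta is added to both terminal edges and subtracted from the pairwise edge,
# so the (1, 0) cut pays exactly one extra beta like every other labeling.
cut_10 = (Ua1 + beta) + (Ub0 + beta) + (P10 - beta)
assert sp.simplify(cut_10 - (Ua1 + Ub0 + P10 + beta)) == 0
```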

Exercise 12.5

Let \(C[w_a, w_b]\) denote the capacity of the cut corresponding to the label assignment \((w_a, w_b)\), where \(w_a, w_b \in \{ 1, 2, 3, 4, 5 \}\).

\[\begin{split}C[1,1] &= U_a(1) + U_b(1)\\ C[1,2] &= U_a(1) + U_b(2) + P_{ab}(1,2)\\ C[1,3] &= U_a(1) + U_b(3) + P_{ab}(1,2) + P_{ab}(2,3) + \infty = \infty\\ C[1,4] &= U_a(1) + U_b(4) + P_{ab}(1,2) + P_{ab}(2,3) + P_{ab}(3,4) + (2) \infty = \infty\\ C[1,5] &= U_a(1) + U_b(5) + P_{ab}(1,2) + P_{ab}(2,3) + P_{ab}(3,4) + P_{ab}(4,5) + (3) \infty = \infty\\\\\\ C[2,1] &= U_a(2) + U_b(1) + P_{ab}(2,1)\\ C[2,2] &= U_a(2) + U_b(2)\\ C[2,3] &= U_a(2) + U_b(3) + P_{ab}(2,3)\\ C[2,4] &= U_a(2) + U_b(4) + P_{ab}(2,3) + P_{ab}(3,4) + \infty = \infty\\ C[2,5] &= U_a(2) + U_b(5) + P_{ab}(2,3) + P_{ab}(3,4) + P_{ab}(4,5) + (2) \infty = \infty\\\\\\ C[3,1] &= U_a(3) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + \infty = \infty\\ C[3,2] &= U_a(3) + U_b(2) + P_{ab}(3,2)\\ C[3,3] &= U_a(3) + U_b(3)\\ C[3,4] &= U_a(3) + U_b(4) + P_{ab}(3,4)\\ C[3,5] &= U_a(3) + U_b(5) + P_{ab}(3,4) + P_{ab}(4,5) + \infty = \infty\\\\\\ C[4,1] &= U_a(4) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + P_{ab}(4,3) + (2) \infty = \infty\\ C[4,2] &= U_a(4) + U_b(2) + P_{ab}(3,2) + P_{ab}(4,3) + \infty = \infty\\ C[4,3] &= U_a(4) + U_b(3) + P_{ab}(4,3)\\ C[4,4] &= U_a(4) + U_b(4)\\ C[4,5] &= U_a(4) + U_b(5) + P_{ab}(4,5)\\\\\\ C[5,1] &= U_a(5) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + P_{ab}(4,3) + P_{ab}(5,4) + (3) \infty = \infty\\ C[5,2] &= U_a(5) + U_b(2) + P_{ab}(3,2) + P_{ab}(4,3) + P_{ab}(5,4) + (2) \infty = \infty\\ C[5,3] &= U_a(5) + U_b(3) + P_{ab}(4,3) + P_{ab}(5,4) + \infty = \infty\\ C[5,4] &= U_a(5) + U_b(4) + P_{ab}(5,4)\\ C[5,5] &= U_a(5) + U_b(5)\end{split}\]

\(C[w_a, w_b]\) is finite if and only if \(\left\vert w_a - w_b \right\vert \leq 1\).
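The bookkeeping behind this table can be mirrored in a short function; the sketch below uses hypothetical uniform costs and simply counts terminal, pairwise, and infinite constraint edges per the construction assumed above:

```python
import math

K = 5

def cut_cost(wa, wb, Ua, Ub, P):
    # A cut severing pixel a's chain at level wa and pixel b's at level wb
    # pays one terminal edge per pixel, the pairwise edges between the two
    # levels, and one infinite constraint edge per extra level of separation.
    lo, hi = min(wa, wb), max(wa, wb)
    cost = Ua(wa) + Ub(wb)
    for k in range(lo, hi):
        cost += P(k, k + 1) if wa < wb else P(k + 1, k)
    if hi - lo > 1:
        cost += (hi - lo - 1) * math.inf
    return cost

# With any finite unary/pairwise costs, exactly the assignments satisfying
# |wa - wb| <= 1 stay finite.
for wa in range(1, K + 1):
    for wb in range(1, K + 1):
        c = cut_cost(wa, wb, lambda w: 1.0, lambda w: 1.0, lambda i, j: 0.5)
        assert (c < math.inf) == (abs(wa - wb) <= 1)
```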

Exercise 12.6

Adding an additional pixel \(c\) can only increase the capacity of each cut. This means the cuts from Exercise 12.5 that have infinite cost remain infinite after the new edges are added. Hence the \(5^3\) possible cuts are reduced to \(5 (2 + 3 + 3 + 3 + 2) = 65\) candidates with finite capacity.

The possible combinations are reduced further by the observation that the edges with infinite capacity fully connect each pixel's chain to the previous one, i.e. \(C[w_a, w_b, w_c]\) is finite only when

\[\left\vert w_a - w_b \right\vert \leq 1 \qquad \land \qquad \left\vert w_b - w_c \right\vert \leq 1 \qquad \land \qquad \left\vert w_c - w_a \right\vert \leq 1.\]
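A quick enumeration (illustrative only) confirms both reductions:

```python
import itertools

labels = range(1, 6)
triples = list(itertools.product(labels, repeat=3))        # 5^3 = 125

ab_finite = [w for w in triples if abs(w[0] - w[1]) <= 1]  # 65 remain
all_finite = [w for w in ab_finite
              if abs(w[1] - w[2]) <= 1 and abs(w[2] - w[0]) <= 1]
print(len(triples), len(ab_finite), len(all_finite))       # 125 65 29
```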

Exercise 12.7

Example 12.14 consists of two pairwise-connected pixels, each of which can take on \(K = 4\) different labels.

\[\begin{split}U_a(4) + U_b(4) + \sum_{i = 1}^4 \sum_{j = 5}^{5} C_{ab}(i,j) &= U_a(4) + U_b(4) + \left( P_{ab}(1,4) + P_{ab}(0,5) - P_{ab}(1,5) - P_{ab}(0,4) \right) +\\ &\quad \left( P_{ab}(2,4) + P_{ab}(1,5) - P_{ab}(2,5) - P_{ab}(1,4) \right) +\\ &\quad \left( P_{ab}(3,4) + P_{ab}(2,5) - P_{ab}(3,5) - P_{ab}(2,4) \right) +\\ &\quad \left( P_{ab}(4,4) + P_{ab}(3,5) - P_{ab}(4,5) - P_{ab}(3,4) \right) & \quad & \text{(12.17)}\\ &= U_a(4) + U_b(4) + \left( P_{ab}(1,4) \right) + \left( P_{ab}(2,4) - P_{ab}(1,4) \right) +\\ &\quad \left( P_{ab}(3,4) - P_{ab}(2,4) \right) + \left( P_{ab}(4,4) - P_{ab}(3,4) \right) & \quad & \text{(12.18)}\\ &= U_a(4) + U_b(4) + P_{ab}(4,4) & \quad & \text{(12.19)}\end{split}\]
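The telescoping from (12.17) to (12.19) can be verified symbolically, treating out-of-range terms such as \(P_{ab}(0, \cdot)\) and \(P_{ab}(\cdot, 5)\) as zero per the convention used above:

```python
import sympy as sp

# P[i, j] is a symbol inside the label range and zero outside it.
P = {(i, j): sp.Symbol(f'P{i}{j}') if 1 <= i <= 4 and j <= 4 else sp.Integer(0)
     for i in range(0, 5) for j in (4, 5)}

total = sum(P[i, 4] + P[i - 1, 5] - P[i, 5] - P[i - 1, 4] for i in range(1, 5))
assert sp.simplify(total - P[4, 4]) == 0  # everything cancels except P(4,4)
```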

Exercise 12.8

The Potts model is \(P_{mn}(w_m, w_n) = \kappa (1 - \boldsymbol{\delta}(w_m, w_n))\). The generalized multi-label submodularity condition for the Potts model is

\[\begin{split}& P_{ab}(\beta, \gamma) + P_{ab}(\alpha, \delta) - P_{ab}(\beta, \delta) - P_{ab}(\alpha, \gamma) & \quad & \text{(12.21)}\\ &= \kappa (1 - \boldsymbol{\delta}(\beta, \gamma)) + \kappa (1 - \boldsymbol{\delta}(\alpha, \delta)) - \kappa (1 - \boldsymbol{\delta}(\beta, \delta)) - \kappa (1 - \boldsymbol{\delta}(\alpha, \gamma))\\ &= \kappa \left( -\boldsymbol{\delta}(\beta, \gamma) - \boldsymbol{\delta}(\alpha, \delta) + \boldsymbol{\delta}(\beta, \delta) + \boldsymbol{\delta}(\alpha, \gamma) \right)\end{split}\]

where \(\beta > \alpha\) and \(\delta > \gamma\). This expression becomes negative (it equals \(-\kappa\)) when, for example, \(\delta = \alpha\), in which case \(\beta > \delta > \gamma\); hence the Potts model is not submodular.
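A brute-force check over a small label set (an illustrative sketch) locates exactly where the expression becomes negative:

```python
import itertools

kappa = 1.0

def potts(a, b):
    # Potts pairwise cost: kappa if the labels differ, zero otherwise.
    return kappa * (a != b)

# Enumerate beta > alpha and delta > gamma over a small label set and
# evaluate expression (12.21).
for alpha, beta in itertools.combinations(range(1, 5), 2):
    for gamma, delta in itertools.combinations(range(1, 5), 2):
        val = (potts(beta, gamma) + potts(alpha, delta)
               - potts(beta, delta) - potts(alpha, gamma))
        if val < 0:
            print(alpha, beta, gamma, delta, val)
# Every printed violation has delta == alpha or beta == gamma, and the value
# is always -kappa, confirming that the Potts model is not submodular.
```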

Exercise 12.9

The following explains the relationships between the nodes in Figure 12.3's depiction of the \(\alpha\)-\(\beta\) swap graph. Note that this graphical model formulation does not invoke the triangle inequality.

Each vertex is connected to the source (representing the label \(\alpha\)) and sink (representing the label \(\beta\)).

Source and Sink Edge Weights

The edges connected to the source and sink have unary costs \(U_{\cdot}(\alpha)\) and \(U_{\cdot}(\beta)\) respectively, where \(\cdot \in \{ a, b, c, d \}\).

Pixels \(i\) and \(j\) have labels \(\{ (\gamma, \gamma) \}\)

Since \(e\) has the label \(\gamma\) and the current swap is between \(\alpha\) and \(\beta\), no edges connect to pixel node \(e\). Consequently, nodes whose labels are neither \(\alpha\) nor \(\beta\) need not be included in the graph.

Pixels \(i\) and \(j\) have labels \(\{ (\alpha, \alpha), (\alpha, \beta), (\beta, \beta) \}\)

The possible final configurations are \(\alpha - \alpha\) with zero pairwise cost, \(\alpha - \beta\) with pairwise cost \(P_{ij}(\alpha, \beta)\), \(\beta - \alpha\) with pairwise cost \(P_{ij}(\beta, \alpha)\), and \(\beta - \beta\) with zero pairwise cost.

Pixels \(i\) and \(j\) have labels \(\{ (\alpha, \gamma), (\beta, \gamma) \}\)

The possible final configurations are \(\alpha - \gamma\) with pairwise cost \(P_{ij}(\alpha, \gamma)\) and \(\beta - \gamma\) with pairwise cost \(P_{ij}(\beta, \gamma)\). If these costs were added as edges to pixel node \(j\), they would never be part of the minimum cut. The only way to include these costs is to add them to the source and sink edges of pixel node \(i\), respectively.
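Putting the description together, the edge weights for one \(\alpha\)-\(\beta\) swap can be assembled as follows; this is a minimal sketch, and `pixels`, `neighbors`, `labels`, `U`, and `P` are hypothetical inputs rather than an implementation from the text:

```python
def build_swap_graph(pixels, neighbors, labels, U, P, alpha, beta):
    """Assemble edge capacities for one alpha-beta swap move.

    cut_alpha[i] / cut_beta[i] is the capacity of the terminal edge that is
    severed when pixel i is assigned alpha / beta; `pixels`, `neighbors`,
    `labels`, `U`, and `P` are hypothetical inputs.
    """
    active = [i for i in pixels if labels[i] in (alpha, beta)]
    cut_alpha = {i: U(i, alpha) for i in active}
    cut_beta = {i: U(i, beta) for i in active}
    pairwise = {}
    for i, j in neighbors:
        if i in cut_alpha and j in cut_alpha:
            # Both pixels participate: disagreeing assignments sever one of
            # the two directed pairwise edges.
            pairwise[i, j] = P(i, j, alpha, beta)
            pairwise[j, i] = P(i, j, beta, alpha)
        elif i in cut_alpha:
            # j keeps some other label gamma, so fold the pairwise costs into
            # i's terminal edges instead of adding edges to the absent node j.
            gamma = labels[j]
            cut_alpha[i] += P(i, j, alpha, gamma)
            cut_beta[i] += P(i, j, beta, gamma)
        elif j in cut_alpha:
            gamma = labels[i]
            cut_alpha[j] += P(i, j, gamma, alpha)
            cut_beta[j] += P(i, j, gamma, beta)
    return cut_alpha, cut_beta, pairwise
```

Running max flow on the resulting network then reassigns every active pixel to \(\alpha\) or \(\beta\) optimally for this move.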