Models for Grids
Markov Random Fields
A Markov random field is an undirected graphical model that describes the
joint probability of the variables as a normalized product of clique
potentials (12.2)
\[Pr(\mathbf{w}) =
Z^{-1} \prod_{j = 1}^J \phi_j\left[ \mathbf{w}_{\mathcal{C}_j} \right].\]
MAP inference in this model is typically performed with graph cuts via
maximum flow.
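As a concrete illustration, the factorization above can be evaluated by brute
force on a tiny model. The sketch below (the four-variable cycle topology and
the agreement potential are illustrative assumptions, not the model from the
text) computes the partition function, normalizes the joint, and finds the
most probable state:

```python
import itertools
import math

# Toy four-variable binary MRF on a cycle (topology and potential are
# assumptions for illustration only).
def phi(a, b):
    """Pairwise potential favouring agreement between neighbours."""
    return math.exp(-(a - b) ** 2)

cliques = [(0, 1), (1, 2), (2, 3), (3, 0)]

def unnormalized(w):
    """Product of clique potentials for one configuration w."""
    p = 1.0
    for i, j in cliques:
        p *= phi(w[i], w[j])
    return p

states = list(itertools.product([0, 1], repeat=4))
Z = sum(unnormalized(w) for w in states)          # partition function
pr = {w: unnormalized(w) / Z for w in states}     # normalized joint

assert abs(sum(pr.values()) - 1.0) < 1e-12
# The all-equal configurations maximize agreement and tie for the MAP state.
best = max(pr, key=pr.get)
assert best in {(0, 0, 0, 0), (1, 1, 1, 1)}
```

Brute-force normalization is only feasible for toy models; graph cuts avoid
this exponential enumeration.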
Conditional Random Fields
A conditional random field is a discriminative undirected graphical model for
the conditional probability distribution \(Pr(\mathbf{w} \mid \mathbf{x})\)
of the world states given the observed data (12.23). MAP inference is
likewise performed via graph cuts.
Higher Order Models
Instead of treating each variable \(w_n\) as a pixel, each variable can
represent a patch. The resulting cost function is less likely to be
submodular or even to obey the triangle inequality. Moreover, the number of
labels \(K\) (one per possible patch) may be very large, and optimizing the
resulting cost function is hard.
Directed Models for Grids
Instead of using the original undirected graphical model, one can use an
approximate directed graphical model. Unfortunately, there is no known
polynomial-time algorithm to optimize the resulting cost function, which
contains three-wise terms. Approximate inference methods instead draw samples
from the model.
Exercise 12.1
Let \(\phi[a, b] = \exp\left[ -(a - b)^2 \right]\) and
\(\mathbf{x} = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}^\top\).
\[\begin{split}Pr(x_1, x_2, x_3, x_4)
&= \frac{1}{Z}
\phi[x_1, x_2] \phi[x_2, x_3] \phi[x_3, x_4] \phi[x_4, x_1]
& \quad & \text{(10.9)}\\
&= \frac{1}{Z} \exp\left[
-(x_1 - x_2)^2 - (x_2 - x_3)^2 - (x_3 - x_4)^2 - (x_4 - x_1)^2
\right]\\
&= \frac{1}{Z} \exp\left[
-(x_1^2 - 2 x_1 x_2 + x_2^2) - (x_2^2 - 2 x_2 x_3 + x_3^2) -
(x_3^2 - 2 x_3 x_4 + x_4^2) - (x_4^2 - 2 x_4 x_1 + x_1^2)
\right]\\
&= \frac{1}{Z} \exp\left[
-\left(
2 x_1^2 - 2 x_1 x_2 - 2 x_1 x_4 + 2 x_2^2 - 2 x_2 x_3 + 2 x_3^2 -
2 x_3 x_4 + 2 x_4^2
\right)
\right]\\
&= \frac{1}{Z} \exp\left[
-\frac{1}{2} \mathbf{x}^\top \boldsymbol{\Sigma}^{-1} \mathbf{x}
\right]\end{split}\]
where the inverse of the covariance matrix is
\[\begin{split}\boldsymbol{\Sigma}^{-1} =
\begin{bmatrix}
4 & -2 & 0 & -2\\
-2 & 4 & -2 & 0\\
0 & -2 & 4 & -2\\
-2 & 0 & -2 & 4
\end{bmatrix},\end{split}\]
so the model is a zero-mean multivariate normal distribution.
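The expansion can be checked numerically: the quadratic form
\(2 x_1^2 - 2 x_1 x_2 - 2 x_1 x_4 + 2 x_2^2 - 2 x_2 x_3 + 2 x_3^2 -
2 x_3 x_4 + 2 x_4^2\) should equal the sum of squared differences around the
cycle for any \(\mathbf{x}\). A quick sketch:

```python
import random

def quad_form(x):
    """Expanded exponent: 2*sum(x_i^2) minus twice the cyclic cross terms."""
    return (2 * sum(v * v for v in x)
            - 2 * (x[0] * x[1] + x[1] * x[2] + x[2] * x[3] + x[3] * x[0]))

def sq_diff(x):
    """Sum of squared differences around the four-cycle."""
    return sum((x[i] - x[(i + 1) % 4]) ** 2 for i in range(4))

random.seed(0)
for _ in range(100):
    x = [random.uniform(-3.0, 3.0) for _ in range(4)]
    assert abs(quad_form(x) - sq_diff(x)) < 1e-9
```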
Exercise 12.2
Figure 12.9 shows the eight possible cuts for this model.
(i)
\[\begin{split}\text{(a)} &\colon U_a(0) + U_b(0) + U_c(0) = 12.7\\
\text{(b)} &\colon U_a(0) + U_b(0) + U_c(1) + P_{bc}(0,1) = 12.9\\
\text{(c)}
&\colon U_a(0) + U_b(1) + U_c(0) + P_{ab}(0,1) + P_{bc}(1,1) = 23.1\\
\text{(d)} &\colon U_a(0) + U_b(1) + U_c(1) + P_{ab}(0,1) = 15.2\\
\text{(e)} &\colon U_a(1) + U_b(0) + U_c(0) + P_{ab}(1,0) = 21\\
\text{(f)}
&\colon U_a(1) + U_b(0) + U_c(1) + P_{ab}(1,0) + P_{bc}(0,1) = 21.2\\
\text{(g)} &\colon U_a(1) + U_b(1) + U_c(0) + P_{bc}(1,0) = 26.2\\
\text{(h)} &\colon U_a(1) + U_b(1) + U_c(1) = 18.3\end{split}\]
The minimum cut is (a).
(ii)
Running Ford–Fulkerson on the graph yields a maximum flow of 12.7, the same
value as the capacity of the minimum cut: by the max-flow min-cut theorem,
the maximum flow of a graph equals the capacity of its minimum cut.
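This equality can be demonstrated in code. The sketch below runs the
Edmonds–Karp variant of Ford–Fulkerson on a small graph with made-up
capacities (the actual edge weights of Figure 12.9 are not reproduced here)
and compares the result against a brute-force enumeration of all
source/sink partitions:

```python
import itertools
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: Ford-Fulkerson with BFS augmenting paths."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:      # no augmenting path left: flow is maximal
            return total
        v, bottleneck = t, float('inf')
        while v != s:            # find the bottleneck residual capacity
            bottleneck = min(bottleneck, cap[parent[v]][v] - flow[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:            # push flow along the path
            flow[parent[v]][v] += bottleneck
            flow[v][parent[v]] -= bottleneck
            v = parent[v]
        total += bottleneck

# Nodes: 0 = source, 1 and 2 = pixels, 3 = sink (capacities are assumptions).
cap = [[0, 3, 5, 0],
       [0, 0, 2, 4],
       [0, 2, 0, 1],
       [0, 0, 0, 0]]
f = max_flow(cap, 0, 3)

# Brute-force minimum cut: try every assignment of pixels to the two sides
# and sum the capacities of edges crossing from the source to the sink side.
best = float('inf')
for side in itertools.product([0, 1], repeat=2):
    label = [0, side[0], side[1], 1]   # 0 = source side, 1 = sink side
    cut = sum(cap[u][v] for u in range(4) for v in range(4)
              if label[u] == 0 and label[v] == 1)
    best = min(best, cut)
assert f == best == 5
```

The brute-force check only works for tiny graphs; the point is that the flow
value and the cheapest cut always coincide.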
Exercise 12.3
The following are based on Figure 12.10.
\((a = 0, b = 0)\)
\[U_a(0) + U_b(0) + P_{ab}(0,0)\]
\((a = 0, b = 1)\)
\[U_a(0) + U_b(1) + P_{ab}(0,1)\]
\((a = 1, b = 0)\)
\[\left[ U_a(1) + P_{ab}(1,1) \right] + \left[ U_b(0) + P_{ab}(0,0) \right] +
\left[ P_{ab}(1,0) - P_{ab}(1,1) - P_{ab}(0,0) \right] =
U_a(1) + U_b(0) + P_{ab}(1,0)\]
\((a = 1, b = 1)\)
\[U_a(1) + U_b(1) + P_{ab}(1,1)\]
Exercise 12.4
The following are based on Figure 12.11c.
\((a = 0, b = 0)\)
\[U_a(0) + U_b(0) + \beta\]
\((a = 0, b = 1)\)
\[U_a(0) + U_b(1) + P_{ab}(0,1) + \beta\]
\((a = 1, b = 0)\)
\[\left[ U_a(1) + \beta \right] + \left[ U_b(0) + \beta \right] +
\left[ P_{ab}(1,0) - \beta \right] =
U_a(1) + U_b(0) + P_{ab}(1,0) + \beta\]
\((a = 1, b = 1)\)
\[U_a(1) + U_b(1) + \beta\]
Exercise 12.5
Let \(C[w_a, w_b]\) denote the minimum cut capacity where
\(w_a, w_b \in \{ 1, 2, 3, 4, 5 \}\).
\[\begin{split}C[1,1] &= U_a(1) + U_b(1)\\
C[1,2] &= U_a(1) + U_b(2) + P_{ab}(1,2)\\
C[1,3] &= U_a(1) + U_b(3) + P_{ab}(1,2) + P_{ab}(2,3) + \infty = \infty\\
C[1,4] &= U_a(1) + U_b(4) + P_{ab}(1,2) + P_{ab}(2,3) + P_{ab}(3,4) +
2 \cdot \infty = \infty\\
C[1,5] &= U_a(1) + U_b(5) + P_{ab}(1,2) + P_{ab}(2,3) + P_{ab}(3,4) +
P_{ab}(4,5) + 3 \cdot \infty = \infty\\\\\\
C[2,1] &= U_a(2) + U_b(1) + P_{ab}(2,1)\\
C[2,2] &= U_a(2) + U_b(2)\\
C[2,3] &= U_a(2) + U_b(3) + P_{ab}(2,3)\\
C[2,4] &= U_a(2) + U_b(4) + P_{ab}(2,3) + P_{ab}(3,4) + \infty = \infty\\
C[2,5] &= U_a(2) + U_b(5) + P_{ab}(2,3) + P_{ab}(3,4) + P_{ab}(4,5) + \infty
= \infty\\\\\\
C[3,1] &= U_a(3) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + \infty = \infty\\
C[3,2] &= U_a(3) + U_b(2) + P_{ab}(3,2)\\
C[3,3] &= U_a(3) + U_b(3)\\
C[3,4] &= U_a(3) + U_b(4) + P_{ab}(3,4)\\
C[3,5] &= U_a(3) + U_b(5) + P_{ab}(3,4) + P_{ab}(4,5) + \infty = \infty\\\\\\
C[4,1] &= U_a(4) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + P_{ab}(4,3) +
2 \cdot \infty = \infty\\
C[4,2] &= U_a(4) + U_b(2) + P_{ab}(3,2) + P_{ab}(4,3) + \infty = \infty\\
C[4,3] &= U_a(4) + U_b(3) + P_{ab}(4,3)\\
C[4,4] &= U_a(4) + U_b(4)\\
C[4,5] &= U_a(4) + U_b(5) + P_{ab}(4,5)\\\\\\
C[5,1] &= U_a(5) + U_b(1) + P_{ab}(2,1) + P_{ab}(3,2) + P_{ab}(4,3) +
P_{ab}(5,4) + 3 \cdot \infty = \infty\\
C[5,2] &= U_a(5) + U_b(2) + P_{ab}(3,2) + P_{ab}(4,3) + P_{ab}(5,4) +
2 \cdot \infty = \infty\\
C[5,3] &= U_a(5) + U_b(3) + P_{ab}(4,3) + P_{ab}(5,4) + \infty = \infty\\
C[5,4] &= U_a(5) + U_b(4) + P_{ab}(5,4)\\
C[5,5] &= U_a(5) + U_b(5)\end{split}\]
\(C[w_a, w_b]\) is finite when
\(\left\vert w_a - w_b \right\vert \leq 1\).
Exercise 12.6
Adding an additional pixel \(c\) can only increase the capacity of each cut.
This means the cuts from Exercise 12.5 that have infinite capacity remain
infinite after the additional edges are added. Hence the \(5^3 = 125\)
possible label configurations are reduced to
\(5 (2 + 3 + 3 + 3 + 2) = 65\).
The possible combinations are further reduced by the observation that the
edges with infinite capacity are fully connected to the previous layer, i.e.,
\(C[w_a, w_b, w_c]\) is finite when
\[\left\vert w_a - w_b \right\vert \leq 1
\qquad \land \qquad
\left\vert w_b - w_c \right\vert \leq 1
\qquad \land \qquad
\left\vert w_c - w_a \right\vert \leq 1.\]
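Both counts can be verified by brute-force enumeration. The sketch below
checks the 13 finite pairs and the 65 pairwise-constrained configurations,
and also counts the configurations satisfying all three constraints (the
final value, 29, is my own computation rather than a figure from the text):

```python
import itertools

labels = range(1, 6)   # five labels per pixel

# Exercise 12.5: C[w_a, w_b] is finite iff |w_a - w_b| <= 1.
finite_pairs = [(wa, wb) for wa, wb in itertools.product(labels, repeat=2)
                if abs(wa - wb) <= 1]
assert len(finite_pairs) == 2 + 3 + 3 + 3 + 2        # 13 finite pairs
assert len(finite_pairs) * 5 == 65                   # w_c unconstrained

# All three pairwise constraints applied simultaneously.
finite_triples = [(wa, wb, wc)
                  for wa, wb, wc in itertools.product(labels, repeat=3)
                  if abs(wa - wb) <= 1 and abs(wb - wc) <= 1
                  and abs(wc - wa) <= 1]
assert len(finite_triples) == 29
```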
Exercise 12.7
Example 12.14 consists of two pixels, each of which can take one of
\(K = 4\) labels, connected by a pairwise term.
\[\begin{split}U_a(4) + U_b(4) + \sum_{i = 1}^{4} C_{ab}(i, 5)
&= U_a(4) + U_b(4) +
\left( P_{ab}(1,4) + P_{ab}(0,5) - P_{ab}(1,5) - P_{ab}(0,4) \right) +\\
&\quad
\left( P_{ab}(2,4) + P_{ab}(1,5) - P_{ab}(2,5) - P_{ab}(1,4) \right) +\\
&\quad
\left( P_{ab}(3,4) + P_{ab}(2,5) - P_{ab}(3,5) - P_{ab}(2,4) \right) +\\
&\quad
\left( P_{ab}(4,4) + P_{ab}(3,5) - P_{ab}(4,5) - P_{ab}(3,4) \right)
& \quad & \text{(12.17)}\\
&= U_a(4) + U_b(4) +
\left( P_{ab}(1,4) \right) + \left( P_{ab}(2,4) - P_{ab}(1,4) \right) +\\
&\quad
\left( P_{ab}(3,4) - P_{ab}(2,4) \right) +
\left( P_{ab}(4,4) - P_{ab}(3,4) \right)
& \quad & \text{(12.18)}\\
&= U_a(4) + U_b(4) + P_{ab}(4,4)
& \quad & \text{(12.19)}\end{split}\]
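The telescoping can be checked numerically with arbitrary pairwise costs,
assuming the boundary convention behind (12.18) that terms involving the
pseudo-labels \(0\) and \(5\) vanish (the random values below are
illustrative, not from the text):

```python
import random

# Hypothetical pairwise costs; boundary terms involving the pseudo-labels
# 0 and 5 are zeroed, matching the convention used in (12.18).
random.seed(1)
P = {(i, j): random.random() for i in range(0, 5) for j in range(0, 6)}
for i in range(0, 5):
    P[(i, 5)] = 0.0
for j in range(0, 6):
    P[(0, j)] = 0.0

# Sum of the inter-node edge costs C_ab(i, 5) from (12.17).
total = sum(P[(i, 4)] + P[(i - 1, 5)] - P[(i, 5)] - P[(i - 1, 4)]
            for i in range(1, 5))
assert abs(total - P[(4, 4)]) < 1e-12   # the sum telescopes to P_ab(4, 4)
```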
Exercise 12.8
The Potts model is
\(P_{mn}(w_m, w_n) = \kappa (1 - \boldsymbol{\delta}(w_m, w_n))\). The
generalized multi-label submodularity condition for the Potts model is
\[\begin{split}& P_{ab}(\beta, \gamma) + P_{ab}(\alpha, \delta) -
P_{ab}(\beta, \delta) - P_{ab}(\alpha, \gamma)
& \quad & \text{(12.21)}\\
&= \kappa (1 - \boldsymbol{\delta}(\beta, \gamma)) +
\kappa (1 - \boldsymbol{\delta}(\alpha, \delta)) -
\kappa (1 - \boldsymbol{\delta}(\beta, \delta)) -
\kappa (1 - \boldsymbol{\delta}(\alpha, \gamma))\\
&= \kappa \left(
-\boldsymbol{\delta}(\beta, \gamma) -
\boldsymbol{\delta}(\alpha, \delta) +
\boldsymbol{\delta}(\beta, \delta) +
\boldsymbol{\delta}(\alpha, \gamma)
\right)\end{split}\]
where \(\beta > \alpha\) and \(\delta > \gamma\). This expression is no
longer non-negative when \(\delta = \alpha\) (so that
\(\beta > \delta > \gamma\)): every Kronecker delta then vanishes except
\(\boldsymbol{\delta}(\alpha, \delta) = 1\), and the expression evaluates to
\(-\kappa < 0\).
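The violation can be confirmed by brute force. The sketch below enumerates
all label quadruples for \(K = 4\) with \(\kappa = 1\) (both values chosen
for illustration) and locates every case where the submodularity expression
is negative:

```python
import itertools

KAPPA = 1.0
def potts(wm, wn):
    """Potts pairwise cost: kappa when labels differ, zero otherwise."""
    return KAPPA * (0.0 if wm == wn else 1.0)

K = 4
violations = []
for alpha, beta, gamma, delta in itertools.product(range(1, K + 1), repeat=4):
    if beta > alpha and delta > gamma:
        expr = (potts(beta, gamma) + potts(alpha, delta)
                - potts(beta, delta) - potts(alpha, gamma))
        if expr < 0:
            violations.append((alpha, beta, gamma, delta))

assert violations   # the condition is indeed violated for some label pairs
# Every violating case interleaves the labels: delta == alpha (the case
# discussed above) or, symmetrically, beta == gamma.
assert all(d == a or b == g for (a, b, g, d) in violations)
```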
Exercise 12.9
The following explains the relationships between the nodes in Figure 12.3's
depiction of the \(\alpha\)-\(\beta\) swap graph. Note that this graphical
model formulation does not invoke the triangle inequality.
Each vertex is connected to the source (representing the label \(\alpha\))
and sink (representing the label \(\beta\)).
Source and Sink Edge Weights
The edges connected to the source and sink have a unary cost of
\(U_{\cdot}(\alpha)\) and \(U_{\cdot}(\beta)\) respectively where
\(\cdot \in \{ a, b, c, d \}\).
Pixels \(i\) and \(j\) have labels \(\{ (\gamma, \gamma) \}\)
Since \(e\) has a label of \(\gamma\) and the current swap is for
\((\alpha, \beta)\), there are no edges connecting to pixel node \(e\).
Consequently, nodes which are neither \(\alpha\) nor \(\beta\) do not
need to be included in the graph.
Pixels \(i\) and \(j\) have labels \(\{ (\alpha, \alpha), (\alpha, \beta), (\beta, \beta) \}\)
The final configuration consists of \(\alpha - \alpha\) with zero pairwise
cost, \(\alpha - \beta\) with pairwise cost \(P_{ij}(\alpha, \beta)\),
\(\beta - \alpha\) with pairwise cost \(P_{ij}(\beta, \alpha)\), and
\(\beta - \beta\) with zero pairwise cost.
Pixels \(i\) and \(j\) have labels \(\{ (\alpha, \gamma), (\beta, \gamma) \}\)
The final configuration consists of \(\alpha - \gamma\) with pairwise cost
\(P_{ij}(\alpha, \gamma)\), and \(\beta - \gamma\) with pairwise cost
\(P_{ij}(\beta, \gamma)\). If these costs were added as edges to pixel node
\(j\), they would never be part of the minimum cut. The only way to include
these costs is to add them to the source and sink edges of pixel node
\(i\), respectively.