Saturday, August 25, 2012

On convergence of Gaussian BP

I got the following question from Gregor Gorjanc, our man in Lubliyana:


Dear Danny,

We have spend quite some time trying out several things to use GaBP with the models we work with. Unfortunatelly, we can never make it work without regularization. It seems that GaBP is in particular sensitive to models where graphs have V like DAG structures, i.e., some nodes having two parental nodes. For example we work with the model describing genetic inheritance and in essence we model this process between a trio of two parents and a child. In what follows I will describe the model (just a tiny example) and show you some examples that appear to give some insight with which types of model GaBP is having problems.

Say we have two parents (1 and 2) and their children (3 and 4). For simplicity assume we observe phenotypic value (y) on all four individuals. The aim of the model is to infer genetic value (g) for each individual based on phenotypic values and pedigree links. The generic model (I have skipped some parts) I work with is:

y_i = g_i + e_i
g_i = 1/2 * g_f(i) + 1/2 * g_m(i) + w_i

e_i ~ N(0, 4)
w_i ~ N(0, 2)

where:

y_i = phenotypic value on individual i
g_i = genetic    value of individual i
e_i = phenotypic residual

g_f(i) = genetic    value of father of individual i
g_m(i) = genetic    value of mother of individual i
w_i    = genetic residual

The full model graph for this family (see attached files namedfamily_full*) would be:

g_1 = w_1
g_2 = w_2
g_3 = 1/2 * g_1 + 1/2 * g_2 + w_3
g_4 = 1/2 * g_1 + 1/2 * g_2 + w_4

y_1 = g_1 + e_1
y_2 = g_2 + e_2
y_3 = g_3 + e_3
y_4 = g_4 + e_4

With the LHS of a system of equations equal to:

## [1,]  5  2 -2 -2
## [2,]  2  5 -2 -2
## [3,] -2 -2  5  .
## [4,] -2 -2  .  5
isDiagDom(x=LHS)
## $isDiagDomMatrix
## [1] FALSE
spectralRadius(x=LHS)
## $spectralRadius
## [1] 1.02
## $spectralRadiusSmallerThan1
## [1] FALSE

This very simple model does not converge. So I attempted to build smaller models to understand what is going on. The first attempt was to take one parent and one child. This model (see attached files named family_one-one*) is now:

g_1 = w_1
g_3 = 1/2 * g_1 + w_3

y_1 = g_1 + e_1
y_3 = g_3 + e_3

With the LHS of a system of equations equal to:

## [1,]  3.67 -1.33
## [2,] -1.33 3.67
isDiagDom(x=LHS)
## $isDiagDomMatrix
## [1] TRUE
spectralRadius(x=LHS)
## $spectralRadius
## [1] 0.36
## $spectralRadiusSmallerThan1
## [1] TRUE

Since spectral radius is smaller than 1 this model converges to exact marginal means as well as variances! Now I added another child node (note we have now A like structure of a DAG) so the model (see attached files named family_one-two*) is now

g_1 = w_1
g_3 = 1/2 * g_1 + w_3
g_4 = 1/2 * g_1 + w_4

y_1 = g_1 + e_1
y_3 = g_3 + e_3
y_4 = g_4 + e_4

With the LHS of a system of equations equal to:

## [1,]  4.333333 -1.333333 -1.333333
## [2,] -1.333333  3.666667  .       
## [3,] -1.333333  .         3.666667
isDiagDom(x=LHS)
## $isDiagDomMatrix
## [1] TRUE
spectralRadius(x=LHS)
## $spectralRadius
## [1] 0.47
## $spectralRadiusSmallerThan1
## [1] TRUE

Which again converges to exact marginal means as well as variances. Now I created a model with two parents and one child (note we have now V like structure of a DAG) so the model (see attached files named family_two-one*) is: 

g_1 = w_1
g_2 = w_2
g_3 = 1/2 * g_1 + 1/2 * g_2 + w_3

y_1 = g_1 + e_1
y_2 = g_2 + e_2
y_3 = g_3 + e_3

With the LHS of a system of equations equal to:

## [1,]  4  1 -2
## [2,]  1  4 -2
## [3,] -2 -2  5
isDiagDom(x=LHS)
## $isDiagDomMatrix
## [1] TRUE
spectralRadius(x=LHS)
## $spectralRadius
## [1] 0.77
## $spectralRadiusSmallerThan1
## [1] TRUE

This also converges to exact marginal means but not for variances.

From the above tests it seems that GaBP:
- underestimates variances when V like structures appear in the model (example family_two-one) --> in the Markov network this structure is a small loop between the three involved nodes

- fails to converge when there are overlapping loops in Markov network, i.e., due to two V structures sharing parental nodes (example family_full)

Is seems that this shows why GaBP does not work with our applications as above structures are highly abundant in our models. Do you see any value in these observations?

Attached is a set of files for each example and a small shell script that show how GaBP from GraphLab was being used - only basic options were used deliebrately to understand what was going on.

./GaBP.sh family_one-one 0 0 > family_one-one.log 2>family_one-one.screen
./GaBP.sh family_one-two 0 0 > family_one-two.log 2>family_one-two.screen
./GaBP.sh family_two-one 0 0 > family_two-one.log 2>family_two-one.screen
./GaBP.sh family_full    0 0 > family_full.log    2>family_full.screen

Regards, Gregor

My Answer:
Hi Gregor, 
It seems you reverse engineered the known convergence conditions of Gaussian BP (and generally BP). 
1) For a graph which is a tree, BP converges to the exact result, namely both the means and the variances are exact.
2) A known result by Yair Weiss (Y. Weiss. Correctness of local probability propagation in graphical models with loops. Neural Computation, 12(1), 2000.) is that for graphs with a single cycle, BP always converges to a unique fixed point. In this case the mean is exact.
3) When the algorithm converges (for a graph with cycles), the mean is exact, but the variance is an underestimation of the true variance. (Weiss, Yair (2000). "Correctness of Local Probability Propagation in Graphical Models with Loops". Neural Computation 12 (1): 1–41.) 
See also the great explanation by Jason K. Johnson:  Malioutov, Johnson, Willsky. Walk-sums and belief propagation in Gaussian graphical models, Journal of Machine Learning Research, vol. 7, pp. 2031-2064, October 2006.

I believe all your conclusions are correct, maybe except for the two V structures, where the algorithm may converge depends on the matrix values...

No comments:

Post a Comment