Cookbook model averaging question

Hello.

I've got a question about the following example from the cookbook:
https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#id11

I work on my own network in exactly the same way, trying to perform
sampling to estimate some metrics. The results replicate the behaviour of
the cookbook example: for both cases (simple and nested SBM), the marginal
distributions of most vertices have non-zero values for many different
clusters, which is why the colouring is so fine-grained. Only a few (1-2)
clusters show a clear dominant group membership, while the rest exhibit
very spread-out marginals.
Do you have any explanation for this?
In the case of my network I likewise have only 1-3 groups of nodes with a
clear dominant group membership, while the rest of the vertices have many
non-zero, almost uniformly distributed marginals. It seems unnatural to me
that, even in the simple cookbook example, some vertices have more than 10
non-zero marginal values.
Maybe it is just the result of independent runs of the MCMC algorithm and
the arbitrary nature of the group labelling? Or is there some intuition
behind this high variance in the group-membership marginals?
I ran the optimisation several times and drew the results. Topologically
the outputs were very close to each other, although the colouring was
always different, except for a few "stable" vertices. Hence, I guess, the
resulting marginals for those vertices have the same properties. But the
labels themselves are not informative. Is there some trick to enforce a
deterministic labelling policy in order to stabilise them?

Thank you
Valery.

> I work on my own network in exactly the same way, trying to perform
> sampling to estimate some metrics. The results replicate the behaviour of
> the cookbook example: for both cases (simple and nested SBM), the marginal
> distributions of most vertices have non-zero values for many different
> clusters, which is why the colouring is so fine-grained. Only a few (1-2)
> clusters show a clear dominant group membership, while the rest exhibit
> very spread-out marginals.
> Do you have any explanation for this?

This means that the posterior distribution is broad, i.e. not concentrated
on any particular partition. This implies either that the model is
misspecified, i.e. your network does not have well-defined groups, or that
your network is very noisy.
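
One way to see this quantitatively is to compute the entropy of each
vertex's marginal distribution. Below is a minimal sketch, assuming g is
your graph and pv is the vector-valued vertex property map of marginal
histograms collected during sampling via collect_vertex_marginals(), as in
the cookbook:

    import numpy as np

    # Entropy (in nats) of each vertex's group-membership marginal.
    H = np.zeros(g.num_vertices())
    for v in g.vertices():
        counts = np.array(pv[v], dtype=float)
        p = counts / counts.sum()    # normalize the histogram
        p = p[p > 0]                 # drop zeros to avoid log(0)
        H[int(v)] = -(p * np.log(p)).sum()

    # H[v] close to 0 means the posterior is concentrated on a single group
    # for v; values close to log(B) mean an almost uniform (broad) marginal.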

> In the case of my network I likewise have only 1-3 groups of nodes with a
> clear dominant group membership, while the rest of the vertices have many
> non-zero, almost uniformly distributed marginals. It seems unnatural to me
> that, even in the simple cookbook example, some vertices have more than 10
> non-zero marginal values.
> Maybe it is just the result of independent runs of the MCMC algorithm and
> the arbitrary nature of the group labelling? Or is there some intuition
> behind this high variance in the group-membership marginals?
> I ran the optimisation several times and drew the results. Topologically
> the outputs were very close to each other, although the colouring was
> always different, except for a few "stable" vertices. Hence, I guess, the
> resulting marginals for those vertices have the same properties. But the
> labels themselves are not informative. Is there some trick to enforce a
> deterministic labelling policy in order to stabilise them?

There is no trick; this variance in the posterior reflects the nature of
your data. If you want a single partition to represent it, you have to
choose between two extremes of the bias-variance trade-off:

   1. Choose the most likely partition, i.e. the one that minimizes the
      description length. (more bias, less variance)

   2. Choose the maximum a posteriori estimate for each node, i.e., the
      most likely node label according to the node marginals. (less bias,
      more variance)

Option 2 averages over the noise, but might not be representative of any
particular fit (especially if the number of groups is fluctuating). Option 1
usually underfits, but may also overfit, depending on your data.
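
To make the two options concrete, here is a minimal sketch using
graph-tool, following the cookbook's usage (the exact arguments of
mcmc_equilibrate() may differ between versions):

    import graph_tool.all as gt
    import numpy as np

    g = gt.collection.data["lesmis"]

    # Option 1: the single most likely partition, minimizing the
    # description length (more bias, less variance).
    state_mdl = gt.minimize_blockmodel_dl(g)
    b_mdl = state_mdl.get_blocks()

    # Option 2: the marginal MAP estimate per node (less bias, more
    # variance), using marginals collected during sampling.
    state = gt.minimize_blockmodel_dl(g)
    state = state.copy(B=g.num_vertices())   # leave room for B to fluctuate

    pv = None
    def collect_marginals(s):
        global pv
        pv = s.collect_vertex_marginals(pv)

    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
                        callback=collect_marginals)

    # For each vertex, pick the label with the largest marginal probability.
    b_map = g.new_vertex_property("int")
    for v in g.vertices():
        b_map[v] = int(np.argmax(np.array(pv[v])))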

There is a discussion of this here: https://arxiv.org/abs/1705.10225
("Bayesian stochastic blockmodeling")

Best,
Tiago

Good day,
Thank you for the reply.

I want to demonstrate the confusing observation in question. I ran the
example from the cookbook.

I attached the plots. As you can see, the model always uses only a few
(non-empty) blocks, from 6 to 9. But at the same time the number of
distinct marginal states (with positive probability) for some vertices is
around 70 (almost all of the potential 77 = g.num_vertices()). This means
that in independent runs the model can arrive at a new set of 6 to 9
blocks that simply carries different labels. This is what I meant by:
"Maybe it is just the result of independent runs of the MCMC algorithm and
the arbitrary nature of the group labelling?"

Is there any way to do the sampling without specifying an exact B, but
rather with B itself being sampled, as described in
https://arxiv.org/pdf/1705.10225.pdf, Ch. IV?

http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4027334/lesmis-1.png
http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4027334/lesmis-2.png

> I attached the plots. As you can see, the model always uses only a few
> (non-empty) blocks, from 6 to 9. But at the same time the number of
> distinct marginal states (with positive probability) for some vertices is
> around 70 (almost all of the potential 77 = g.num_vertices()). This means
> that in independent runs the model can arrive at a new set of 6 to 9
> blocks that simply carries different labels. This is what I meant by:
> "Maybe it is just the result of independent runs of the MCMC algorithm and
> the arbitrary nature of the group labelling?"

Oh, the actual vertex labels are not meaningful. You can just re-label them
into a contiguous range before computing the histogram.
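
For instance, with plain numpy (a sketch, assuming state is the fitted
block state; np.unique() with return_inverse does the contiguous
relabelling):

    import numpy as np

    b = state.get_blocks().a.copy()   # raw labels, not necessarily contiguous
    labels, b_contig = np.unique(b, return_inverse=True)
    B = len(labels)                   # number of non-empty groups
    # b_contig now takes values in the contiguous range [0, B), which keeps
    # the marginal histograms from spreading over arbitrary label values.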

> Is there any way to do the sampling without specifying an exact B, but
> rather with B itself being sampled, as described in
> https://arxiv.org/pdf/1705.10225.pdf, Ch. IV?

This is exactly what happens; this is why your histogram shows many
different values for the number of non-empty groups.

(The total number of groups, including empty ones, will always grow as
necessary.)
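
This is the same pattern the cookbook uses to collect the posterior
distribution of the number of groups; a minimal sketch (API as in the
cookbook, which may differ in other versions):

    import graph_tool.all as gt
    import numpy as np

    g = gt.collection.data["lesmis"]

    state = gt.minimize_blockmodel_dl(g)
    state = state.copy(B=g.num_vertices())   # reserve room so B can grow

    h = np.zeros(g.num_vertices() + 1)   # histogram over the number of groups

    def collect_num_groups(s):
        B = s.get_nonempty_B()           # count only occupied groups
        h[B] += 1

    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
                        callback=collect_num_groups)

    # h / h.sum() approximates the posterior distribution P(B | data), with
    # B being sampled rather than fixed in advance.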

Best,
Tiago