Discrepancy between histograms and final number of groups

dawe · February 6, 2020, 11:09am

Hi, I'm running the nested SBM and collecting the group marginals using
the code from the graph-tool documentation, which basically counts the
number of non-empty blocks at each hierarchy level at each iteration:

import numpy as np

# One histogram per hierarchy level, indexed by the number of nonempty blocks
group_marginals = [np.zeros(g.num_vertices() + 1) for s in state.get_levels()]

def _collect_marginals(s):
    levels = s.get_levels()
    for l, sl in enumerate(levels):
        group_marginals[l][sl.get_nonempty_B()] += 1
    […]

At the end of the equilibration I look at the distributions and, in general,
the most probable number of blocks at each level is not the one that is
stored in the final state, although the final number of blocks is typically
the second most probable. I may be naive, but I expected the two to be the
same.

d

tiago · February 6, 2020, 1:13pm

There is no guarantee that the mode of a distribution needs to be equal
to the mean.

Indeed, posterior averages often diverge from point estimates with the
maximum likelihood. I talk about this in this paper (look at Fig 6 which
shows exactly what you see): https://arxiv.org/abs/1610.02703

Best,
Tiago
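
To illustrate the point above with a toy example: the final state of the chain is a single sample, so it need not sit at the mode of the marginal histogram, and neither does the mean. This is only a sketch that does not use graph-tool. The sequence `B_samples` and the vertex count `N` are made-up values standing in for what `sl.get_nonempty_B()` might return at one hierarchy level.

```python
import numpy as np

# Hypothetical block counts visited by the sampler at one hierarchy level
# (made-up numbers, standing in for successive sl.get_nonempty_B() values).
B_samples = [12, 11, 12, 13, 12, 11, 12, 13, 12, 11]
N = 20  # hypothetical number of vertices

# Same counting scheme as in _collect_marginals: one bin per possible B.
hist = np.zeros(N + 1)
for B in B_samples:
    hist[B] += 1

marginal = hist / hist.sum()                         # normalized marginal P(B)
B_mode = int(np.argmax(marginal))                    # most probable B
B_mean = float(np.dot(np.arange(N + 1), marginal))   # posterior mean of B
B_final = B_samples[-1]                              # B held by the last state

print("mode:", B_mode, "mean:", B_mean, "final:", B_final)
```

With these made-up samples the mode is 12 while the last state has 11 blocks, which is the second most probable value, so the histogram and the final state disagree even though nothing is wrong with the sampling.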

dawe · February 6, 2020, 1:55pm

Hello,

> There is no guarantee that the mode of a distribution needs to be equal
> to the mean.

Ok, now I see

> Indeed, posterior averages often diverge from point estimates with the
> maximum likelihood. I talk about this in this paper (look at Fig 6 which
> shows exactly what you see): https://arxiv.org/abs/1610.02703

I'm going to read it in more detail (can I jump directly to section VII?).
Thank you,

d