Understanding the entropy of edge counts at each hSBM layer

Dear Tiago,

I am trying to understand each value of NestedBlockState.level_entropy that
contributes to the entropy of a hierarchical SBM (hSBM). In particular, I am
focusing on the “prior for the edge counts”. In PRE 95, 012317 (2017), the
multigraph entropy is used for _all levels_ of the hSBM, because parallel
edges dominate at the higher levels. Nevertheless, for each level l = k
(1 <= k < L), we assume its connectivity pattern is generated by a
non-degree-corrected SBM (NDC-SBM) at the level above (l = k + 1), which
renders the sum of the entropy terms across all levels smaller than that of
a flat prior, thus avoiding underfitting.

There will be two questions in this thread.

To see which arguments BlockState.entropy() uses, I added

print("eargs.###: {}".format(eargs.###))

before the return line of the entropy() function in the
graph_tool/inference/blockmodel.py file, where ### stands for `degree_dl`,
`edges_dl`, `multigraph`, etc.
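
For completeness, here is a non-invasive sketch of the same inspection,
passing the flags as keyword arguments to entropy() instead of patching the
source (it uses the “lesmis” example discussed below, and assumes the
remaining defaults match the printed eargs):

import graph_tool.all as gt

g = gt.collection.data["lesmis"]
nested_state = gt.minimize_nested_blockmodel_dl(g)

# Toggle the flags explicitly; with the eargs shown below, this should
# reproduce nested_state.level_entropy(0):
S0 = nested_state.levels[0]
print(S0.entropy(dense=False, multigraph=True, edges_dl=False))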

Taking the “lesmis” dataset as an example, if we run
minimize_nested_blockmodel_dl() on it, we (may) obtain two levels, i.e.
[(77, 6), (6, 1)]. Now, running this command,

nested_state.level_entropy(0)

which outputs:
eargs.dense: False
eargs.edges_dl: False
eargs.multigraph: True
Out[•]: 630.133156768878

And with this command,

nested_state.level_entropy(1)

which outputs:
eargs.dense: True
eargs.edges_dl: True
eargs.multigraph: True
Out[•]: 71.0082133080805

These constitute the two level entropies that sum to
nested_state.entropy().
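
As a quick sanity check, a one-line sketch (reusing the state from above):

total = sum(nested_state.level_entropy(l) for l in range(len(nested_state.levels)))
print(total, nested_state.entropy())  # should agree up to floating-point error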

Here is my 1st question:

:face_with_monocle: Why is `edges_dl` excluded at all but the highest level?

I expected `edges_dl` at the lowest level to be nonzero, but _less_ than
`69.21645383885243`, i.e. the exact negative logarithm of Eq. (40) of the
PRE paper, using B=6 and E=254 (the number of edges of the “lesmis”
dataset). Am I thinking about it the right way? :face_with_monocle: <— this is the 2nd question.

Sincere thanks,
Tzu-Chi

:face_with_monocle: Why is `edges_dl` excluded at all but the highest level?

The `edges_dl` parameter corresponds to the flat prior for the edge counts,
i.e. a uniform distribution over the ((B(B+1)/2 E)) ways of distributing E
edges among the B(B+1)/2 pairs of groups (a multiset coefficient). We need
to turn this off for the intermediate layers, since it is replaced by the
SBM at the layer above.
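
For a concrete check, a short sketch evaluating this description length
directly, where ((n m)) denotes the multiset coefficient, equal to
binomial(n + m - 1, m); with B=6 and E=254 it reproduces the 69.2164...
value quoted in the question:

from scipy.special import gammaln

def log_multiset(n, m):
    # ln ((n m)) = ln binomial(n + m - 1, m)
    return gammaln(n + m) - gammaln(m + 1) - gammaln(n)

B, E = 6, 254
print(log_multiset(B * (B + 1) // 2, E))  # ~69.2164..., the -ln of Eq. (40)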

I expected `edges_dl` at the lowest level to be nonzero, but _less_ than
`69.21645383885243`, i.e. the exact negative logarithm of Eq. (40) of the
PRE paper, using B=6 and E=254 (the number of edges of the “lesmis”
dataset). Am I thinking about it the right way? :face_with_monocle: <— this is the 2nd question.

No, `edges_dl` corresponds precisely to Eq. (40); this is why it needs to be
disabled in the intermediate layers.

Best,
Tiago

Now I understand that `edges_dl` specifically encodes the flat prior. I have
two follow-up questions:

:thinking: How can I access the terms in Eq. (41) of the PRE paper, i.e.
each term being the level-wise entropy of the edge counts, as Eq. (42)
describes?

For the "lesmis" dataset, the bottom-most layer has the entropy:

nested_state.level_entropy(0)

Out[•]: 630.133156768878

This is exactly the sum of three entropic terms: "adjacency" (332.24632),
"degree_dl" (170.10951), and "partition_dl" (127.77732). However, I could
not find where the entropy of the edge counts has gone.

:thinking: I found that `nested_state.levels[0].entropy(deg_entropy=True) -
nested_state.levels[0].entropy(deg_entropy=False) < 0`. I expected this
difference to be the negative logarithm of Eq. (28) of the paper, which is
positive. I am not sure what went wrong.

Thanks,
Tzu-Chi

Now I understand that `edges_dl` specifically encodes the flat prior. I have
two follow-up questions:

:thinking: How can I access the terms in Eq. (41) of the PRE paper, i.e.
each term being the level-wise entropy of the edge counts, as Eq. (42)
describes?

These are given by the different hierarchy levels, level_entropy(1),
level_entropy(2), etc.
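
A short sketch of collecting those terms (for the two-level “lesmis” fit
above there is only level 1):

for l in range(1, len(nested_state.levels)):
    print("level", l, ":", nested_state.level_entropy(l))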

For the "lesmis" dataset, the bottom-most layer has the entropy:

nested_state.level_entropy(0)

Out[•]: 630.133156768878

This is exactly the sum of three entropic terms: "adjacency" (332.24632),
"degree_dl" (170.10951), and "partition_dl" (127.77732). However, I could
not find where the entropy of the edge counts has gone.

This is given by the upper layers, as answered above.

:thinking: I found that `nested_state.levels[0].entropy(deg_entropy=True) -
nested_state.levels[0].entropy(deg_entropy=False) < 0`. I expected this
difference to be the negative logarithm of Eq. (28) of the paper, which is
positive. I am not sure what went wrong.

No, `deg_entropy` controls the degree part of the likelihood, not the prior.
The parameter you want is `degree_dl`.
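
A sketch of the suggested comparison, isolating the degree prior instead:

S0 = nested_state.levels[0]
print(S0.entropy(degree_dl=True) - S0.entropy(degree_dl=False))  # should be positive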

Best,
Tiago

Thank you, now I can rationalize how these level entropies work.

Perhaps lastly, I would like to make sure what `deg_entropy` returns. If it
means the degree part of the likelihood, does it compute the negative log of
the numerator of Eq. (2) of the PRE paper?

However, I could not reproduce its value by a term-by-term summation of the
corresponding gammaln(k) terms.

Warmest thanks,
Tzu-Chi

Yes, that's right.
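
For reference, a minimal sketch of that degree part,
-ln(prod_i k_i!) = -sum_i ln(k_i!); note that ln(k!) = gammaln(k + 1), not
gammaln(k), which may account for the discrepancy in the term-by-term
summation mentioned above:

import numpy as np
from scipy.special import gammaln

# Degree part of the numerator of Eq. (2); it is negative, matching the
# observation that enabling deg_entropy lowers the entropy:
k = g.get_out_degrees(g.get_vertices())
deg_term = -np.sum(gammaln(k + 1))

S0 = nested_state.levels[0]
print(deg_term, S0.entropy(deg_entropy=True) - S0.entropy(deg_entropy=False))
# the two numbers should coincide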