Hi Tiago,

If I understood minimize_blockmodel_dl() correctly, the success of its

inference comes from the following KEYs:

KEY-1. Agglomerative membership initialization

KEY-2. Smarter MCMC proposal moves + successful MCMC

equilibration/optimization

KEY-3. Correct expression of the SBM entropy

For (1), the tunable parameters are:

a) shrink_args[“niter”]: attempted block merges for each block node

b) mcmc_multilevel_args[“r”]: the greediness of the agglomeration.

For (2), the tunable parameters are:

c) mcmc_args[“c”]: controls how the proposed move respects the current block

configuration

d) mcmc_args[“d”]: the probability to move to a new and vacant group

e) mcmc_args[“beta”]: the inverse temperature

f) mcmc_args[“niter”]: the number of sweeps to perform. During each sweep, a

move attempt is made for each node

g) mcmc_equilibrate_args[“wait”]: the number of iterations to wait for a

record-breaking event

h) mcmc_equilibrate_args[“nbreaks”]: the number of iteration intervals (of

size `wait`) without record-breaking events necessary to stop the algorithm.

i) anneal_args[“niter”]: the number of steps (in logspace) from the starting

temperature to the final one.

And for (3), the tunable parameters are hidden in the

mcmc_args.entropy_args, whose default settings are clear to me. They

implement the flat prior as detailed in Phys. Rev. E 95, 012317.

I have the following 4 questions.

FIRST - If I just run minimize_blockmodel_dl(g), where g is the graph, what

are the default numbers used by (a)-(i)? I found that many parameters do not

match the default numbers as stated in the documentation. For example, b) is

set to 1.3, where the documentation says 2. And g) is set to 1, where the

documentation says 1000. With `verbose` turned on, reverse engineering

helps. But I would like to hear your response.

SECOND - To prevent bad merges in (KEY-1.), we also allow individual node

moves between each sweep. This means (KEY-1) + (KEY-2) is applied

alternatively. Between each merge (from B’ to B), what is the annealing

scheme used? Is it abrupt cooling all the way through? Or does it let the

MCMC equilibrates first, followed by the abrupt cooling?

THIRD - Followed from the SECOND, It seems that the latter implementation is

best but computationally expensive. However, for the last stage of the merge

algorithm (when we reached the desired B), the latter [equilibration +

cooling] algorithm must be applied. How can we customize the arguments in

minimize_blockmodel_dl() such that we only do abrupt merges between stages

except for the last?

FOURTH - Bisection is an important ingredient to the efficiency of

successful optimization. What is the default formula to determine the

desired block number B to be merged from B0 = N (number of nodes)? In other

words, how do we determine the first rightmost state to bracket the search?

It seems that either N^0.5 or E^0.5 (the resolution limit via the flat prior

for edge count) does not match what I observed.

I hope my questions were not annoying. Thank you again for designing the

many detailed auxiliary functions. They helped me a lot to catch bugs by

these “reverse engineerings”. You are so sweet.

Sincerely,

Tzu-Chi