Which state should I copy when doing merge-split on a model with a weight transformation?

Hi all,

In the following section:
https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#edge-weights-and-covariates

Tiago shows how to select the best model for `foodweb_baywet` between the
`real-exponential` model and the `log-normal` model, each refined with the
merge-split algorithm. My question is about which *state* one should copy
when applying the merge-split algorithm. In the `log-normal` example, we
have:

import numpy as np
import graph_tool.all as gt

# g is the foodweb_baywet graph loaded earlier in the howto
y = g.ep.weight.copy()
y.a = np.log(y.a)

state_ln = gt.minimize_nested_blockmodel_dl(
    g, state_args=dict(recs=[y], rec_types=["real-normal"]))

# note: this copies `state`, not `state_ln`
state_ln = state.copy(bs=state_ln.get_bs() + [np.zeros(1)] * 4,
                      sampling=True)

for i in range(100):
    ret = state_ln.multiflip_mcmc_sweep(niter=10, beta=np.inf)

-state_ln.entropy() # ~7231
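
For context: `state` here is, if I read the howto correctly, the
`real-exponential` fit from the preceding step of the tutorial, obtained
roughly as

state = gt.minimize_nested_blockmodel_dl(
    g, state_args=dict(recs=[g.ep.weight],
                       rec_types=["real-exponential"]))

so the copy above starts from the exponential model's state, not the
log-normal one's.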

But if I copy the state_ln object instead:

state_ln = gt.minimize_nested_blockmodel_dl(
    g, state_args=dict(recs=[y], rec_types=["real-normal"]))

state_ln = *state_ln*.copy(bs=state_ln.get_bs() + [np.zeros(1)] * 4,
                           sampling=True)

for i in range(100):
    ret = state_ln.multiflip_mcmc_sweep(niter=10, beta=np.inf)

-state_ln.entropy() # ~4690

There is a big difference between the description lengths of the two
variants. My understanding is that the *state* in the first example comes
from the previous `real-exponential` model, which means we copy that state
and then pass it the hierarchy levels of the `state_ln` model. Is it
supposed to be so? Shouldn't we always copy the state of the model we
actually fitted before running the merge-split algorithm?

Jonathan

Yes, this is an error in the howto! Thanks for noticing. It will be
fixed in the next version.

Best,
Tiago

Thanks for the quick reply! Great, happy to be of some help. This means that
we now get the following:

L1 = -state.entropy()
L2 = -state_ln.entropy() - np.log(g.ep.weight.a).sum()

print("Exponential model:\t", L1)    # Exponential model:  7201.52
print("Log-normal model:\t", L2)     # Log-normal model:   7230.95
print(u"ln \u039b:\t\t\t", L2 - L1)  # ln Λ:               29.42

So it is still true that the exponential model does not provide a better
fit for the data, although now only by a small margin!

Cheers,
Jonathan