I am going to compare NestedBlockState.entropy() of the two runs, but I am not sure this is correct.
How should I take into account the fact that the networks are slightly different?

Would normalization make the two entropies comparable? I'd be interested to hear opinions about using, for normalization, the entropy of a NestedBlockState where each node is in its own group.

The description length (DL) tells you how much information is needed to
encode both the network and the model parameters. If we compare the DL
for the same network but different models, this tells us which model
compresses the data most. But if we compare two different networks with two
different models, this tells us very little, because it mixes a
comparison of which network is more regular with the quality of fit of
each model.

The result of this kind of comparison is often trivial: the more nodes
and edges, the higher the DL will be.

You *could* compute something like the DL per edge in order to compare
two networks, but since the DL is not a linear function of the number of
nodes or edges, it is difficult to put this evaluation on solid
statistical grounds.
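
To illustrate the non-linearity, here is a minimal sketch using a deliberately simplified baseline, not the DL that graph-tool computes: the number of bits needed to single out one simple undirected graph among all graphs with N nodes and E edges, log2 C(N(N-1)/2, E). The function names and the choice of a fixed mean degree of 8 are assumptions made for the example.

```python
from math import comb, log2


def er_description_length_bits(n_nodes: int, n_edges: int) -> float:
    """Bits needed to pick one simple undirected graph uniformly among all
    graphs with n_nodes nodes and n_edges edges: log2 C(N(N-1)/2, E).

    This is a simplified fully-random baseline (an assumption for this
    sketch), not the SBM description length reported by graph-tool.
    """
    n_pairs = n_nodes * (n_nodes - 1) // 2
    return log2(comb(n_pairs, n_edges))


def bits_per_edge(n_nodes: int, n_edges: int) -> float:
    """Baseline description length divided by the number of edges."""
    return er_description_length_bits(n_nodes, n_edges) / n_edges


# Keep the mean degree fixed at 8 (E = 4N) and let the network grow.
for n in (100, 1000, 10000):
    e = 4 * n
    print(f"N={n:6d}  E={e:6d}  bits/edge={bits_per_edge(n, e):.2f}")
```

Even for this fully random baseline, the bits per edge keep growing with N at fixed mean degree, so two networks of different sizes get different per-edge values without either one being more structured than the other.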

Thanks Tiago,

I see that this could be an option. But how about my proposal?

The 'polbooks' dataset has 105 nodes. An SBM with one block (B=1) has a DL of about 1550 bits. The DL is minimized (DL_min=1300) for B=5. When each node is in its own block (B=105), DL is maximized (DL_max=1950). Can't I make states of different graphs comparable by taking DL_min/DL_max? It seems like a straightforward application of normalized entropy (https://en.wikipedia.org/wiki/Entropy_(information_theory)#Efficiency_(normalized_entropy)) to me.

All, Tiago fixed a bug in the mailing list backend. It caused my email to arrive four times. I'm sorry for flooding your mailbox.

It's difficult to comment, because I don't know what the objective of
the comparison is.

If you compute the ratio of the minimum DL with the DL for B=1, this
would give you the compression ratio when compared to a baseline random
graph model.

If you compare this ratio between two networks of two different sizes,
this gives you an idea of how much more random one is than the other,
when compared to a fully random graph with the same density, but no
deeper insight.
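
As a worked example of that ratio, here is a small sketch using the approximate polbooks figures quoted earlier in the thread (DL of about 1550 bits for B=1, about 1300 bits at the minimum). The helper name compression_ratio is hypothetical; in practice both DL values would come from fits to the same network.

```python
def compression_ratio(dl_min: float, dl_baseline: float) -> float:
    """Ratio of the minimum DL to the DL of the single-block (B=1) fit.

    Values below 1 mean the best model compresses the network better than
    the baseline random graph model; values near 1 mean almost no gain.
    """
    if dl_baseline <= 0:
        raise ValueError("baseline description length must be positive")
    return dl_min / dl_baseline


# Approximate values for 'polbooks' as quoted in this thread (in bits).
polbooks_dl_b1 = 1550.0
polbooks_dl_min = 1300.0

ratio = compression_ratio(polbooks_dl_min, polbooks_dl_b1)
print(f"polbooks compression ratio: {ratio:.3f}")  # prints 0.839
```

The ratio is dimensionless, so it does not matter whether the two DL values are in bits or nats, as long as both use the same unit.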

My objective is to compare the extent to which given networks are in the ordered regime. In this sense, the ratio DL_min/DL(B=1) works because it measures the distance to disorder.