Question about entropy

haiko.lietz · April 27, 2021, 9:50am

I am going to compare NestedBlockState.entropy() of the two runs, but I am not sure this is correct.
How should I take into account the fact that the networks are slightly different?

Would normalization make the two entropies comparable? I'd be interested to hear opinions about using, for normalization, the entropy of a NestedBlockState where each node is in its own group.

Best

Haiko

tiago · April 27, 2021, 11:07am

The description length (DL) tells you how much information is needed to
encode both the network and the model parameters. If we compare the DL
for the same network but different models, this tells which model most
compresses the data. But if we compare two different networks with two
different models, this tells us very little, because it mixes a
comparison of which network is more regular with the quality of fit of
each model.
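The same-network case can be sketched in a few lines of Python. This is only an illustration: it assumes the DL of each candidate model has already been computed (for example with NestedBlockState.entropy(), as in the question), it borrows the approximate 'polbooks' figures quoted later in this thread, and the function name is hypothetical.

```python
def most_compressive(dl_by_model):
    """Return the label of the model with the smallest description length.

    Only meaningful when every DL was computed for the SAME network.
    """
    return min(dl_by_model, key=dl_by_model.get)


# Approximate 'polbooks' values quoted later in the thread.
polbooks_dl = {"B=1": 1550.0, "B=5": 1300.0, "B=105": 1950.0}

best = most_compressive(polbooks_dl)
print(best)  # B=5
```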

The result of this kind of comparison is often trivial: the more nodes
and edges, the higher the DL will be.

You *could* compute something like the DL per edge in order to compare
two networks, but since the DL is not a linear function of the number of
nodes or edges, it is difficult to put this evaluation on solid
statistical grounds.
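As a rough sketch of the per-edge idea, the Python snippet below divides a DL by the number of edges. The function name is hypothetical, the DL of about 1300 is the 'polbooks' figure quoted later in this thread, and the edge count of 441 is an assumption that the thread itself does not state. The caveat above still applies: because the DL is not linear in the number of edges, this number is a heuristic, not a calibrated statistic.

```python
def dl_per_edge(dl, num_edges):
    """Description length divided by the number of edges (heuristic only)."""
    if num_edges <= 0:
        raise ValueError("num_edges must be positive")
    return dl / num_edges


# DL from the thread; the edge count (441) is an assumption, not from the thread.
polbooks_rate = dl_per_edge(1300.0, 441)
print(round(polbooks_rate, 3))
```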

Best,
Tiago

haiko.lietz · April 27, 2021, 12:02pm

> > I am going to compare NestedBlockState.entropy() of the two run,
> > but I am not sure this is correct.
>
> > How should I take into account the fact that the networks are
> > slightly different?
>
> Would normalization make the two entropies comparable? I'd be
> interested to hear opinions about using, for normalization, the
> entropy of a NestedBlockState where each node is in its own group.

> The description length (DL) tells you how much information is needed
> to encode both the network and the model parameters. If we compare
> the DL for the same network but different models, this tells which
> model most compresses the data. But if we compare two different
> networks with two different models, this tells us very little, because it
> mixes a comparison of which network is more regular with the quality
> of fit of each model.
>
> The result of this kind of comparison is often trivial: the more nodes
> and edges, the higher the DL will be.
>
> You *could* compute something like the DL per edge in order to
> compare two networks, but since the DL is not a linear function of the
> number of nodes or edges, it is difficult to put this evaluation on solid
> statistical grounds.

Thanks Tiago,

I see that this could be an option. But how about my proposal?

The 'polbooks' dataset has 105 nodes. An SBM with one block (B=1) has a DL of about 1550 bits. The DL is minimized (DL_min=1300) for B=5. When each node is in its own block (B=105), DL is maximized (DL_max=1950). Can't I make states of different graphs comparable by taking DL_min/DL_max? It seems like a straightforward application of normalized entropy (see the Wikipedia article "Entropy (information theory)") to me.
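With the approximate numbers above, the proposed ratio works out as in this small Python sketch. The function name is hypothetical, and the code only does the arithmetic; it says nothing about whether the ratio is statistically well founded.

```python
def normalized_dl(dl, dl_max):
    """Ratio of a description length to the DL of the one-block-per-node state."""
    if dl_max <= 0:
        raise ValueError("dl_max must be positive")
    return dl / dl_max


DL_B1, DL_MIN, DL_MAX = 1550.0, 1300.0, 1950.0  # approximate 'polbooks' values

ratio_min = normalized_dl(DL_MIN, DL_MAX)  # best fit (B=5), about 0.667
ratio_b1 = normalized_dl(DL_B1, DL_MAX)    # single block (B=1), about 0.795
print(round(ratio_min, 3), round(ratio_b1, 3))
```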

All, Tiago fixed a bug in the mailing list backend. It caused my email to arrive four times. I'm sorry for flooding your mailbox.

Best wishes

Haiko

tiago · April 27, 2021, 1:15pm

It's difficult to comment, because I don't know what the objective of
the comparison is.

If you compute the ratio of the minimum DL with the DL for B=1, this
would give you the compression ratio when compared to a baseline random
graph model.

If you compare this ratio between two networks of two different sizes,
this gives you an idea of how much more random one is than the other,
when each is compared to a fully random graph with the same density,
but no deeper insight.
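A minimal Python sketch of this ratio, using the approximate 'polbooks' figures from earlier in the thread. The second network and its DL values are purely hypothetical and serve only to show the comparison; the function name is also made up for the illustration.

```python
def compression_ratio(dl_min, dl_b1):
    """DL of the best-fitting model relative to the single-block (B=1) baseline.

    Values near 1 mean the best model barely improves on the baseline;
    smaller values mean the network is more compressible.
    """
    if dl_b1 <= 0:
        raise ValueError("dl_b1 must be positive")
    return dl_min / dl_b1


polbooks = compression_ratio(1300.0, 1550.0)  # values quoted in the thread
other = compression_ratio(5200.0, 5400.0)     # hypothetical second network

# The network with the lower ratio sits further from its random baseline.
more_structured = "polbooks" if polbooks < other else "other"
print(round(polbooks, 3), round(other, 3), more_structured)
```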

Best,
Tiago

haiko.lietz · April 27, 2021, 3:15pm

> The 'polbooks' dataset has 105 nodes. An SBM with one block (B=1) has a DL of about 1550 bits. The DL is minimized (DL_min=1300) for B=5. When each node is in its own block (B=105), DL is maximized (DL_max=1950). Can't I make states of different graphs comparable by taking DL_min/DL_max? It seems like a straightforward application of normalized entropy (see the Wikipedia article "Entropy (information theory)") to me.

> It's difficult to comment, because I don't know what the objective of the comparison is.
>
> If you compute the ratio of the minimum DL with the DL for B=1, this would give you the compression ratio when compared to a baseline random graph model.
>
> If you compare this ratio between two networks of two different sizes, this gives you an idea of how much more random one is than the other, when each is compared to a fully random graph with the same density, but no deeper insight.

My objective is to compare the extent to which given networks are in the ordered regime. In this sense, the ratio DL_min/DL(B=1) works because it measures the distance to disorder.

Thx for the input

Haiko