Multilayer graph with select aggregation over layers

fitzgeraldj · August 17, 2020, 11:41pm

Hi Tiago,

Thanks for an amazing library! I'm trying to use the multilayer aspect of
graph-tool and I have a few questions that I haven't found answers to
online:

1) How do you access parameters inferred for independent models on separate
layers, say for reproducing Figure 5(b) from the paper? Unless you're just
masking the graph and only drawing edges applicable to each layer?

2) I've seen you say several times that it's simple to incorporate binning
of layers into the model, but I haven't been able to figure out how to do so
- a brief example of how to do so would be hugely appreciated! (or some
initial pointers if you're pressed for time)

3) In this binning procedure, is it possible to keep subsets of layers
separate? Each of my layers is actually defined by a (property X, property
Y) combo, and I would like to investigate how property X affects the
grouping of property Y, as well as whether it noticeably impacts the
network's structure.

Apologies if I've missed the obvious! I've been trying to familiarise myself
quickly with the package, and have hugely appreciated that you've coded this
all yourself in the first place!

Thanks,
John

fitzgeraldj · August 20, 2020, 8:50pm

Hi again,

I believe I've answered my first question (clumsily I suppose) by taking the
state inferred from the layered model, then taking e.g.

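import graph_tool.all as gt

submodels = []
lgs = state.get_levels()[0].gs  # per-layer graphs at the lowest level
for l, lg in enumerate(lgs):
    # layer l's partition at every level of the hierarchy
    bs = [lvl.layer_states[l].b for lvl in state.get_levels()]
    submodels.append(gt.NestedBlockState(lg, bs=bs))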

but presumably there's a cleaner way of accessing these values.

For my second question, when you say in the paper to apply agglomerative
hierarchical clustering, is there a cleaner way of doing this through
graph-tool than just brute force? I.e. fitting the model for different bins
then comparing description lengths (including subtracting log eqn 18 or 19)
for each possible combination? This seems like it would be quite slow for
non-small networks, especially for a reasonable initial number of layers, so
again I assume there's an alternative.

If not then my third question is answered, as I can just suitably modify the
prior then only trial merges for property Y.
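
To make the brute-force procedure I have in mind concrete, here is a rough sketch in plain Python. It is not graph-tool API: greedy_layer_binning, description_length and can_merge are all hypothetical names of my own. In practice description_length would refit the layered model for a given binning and include the binning prior; here it is a toy stand-in so the sketch runs. The can_merge predicate is where merges would be restricted to layers sharing the same property X.

```python
from itertools import combinations


def greedy_layer_binning(layers, description_length, can_merge=lambda a, b: True):
    """Greedily merge layer bins while the description length decreases.

    layers: iterable of hashable layer labels.
    description_length: hypothetical callable mapping a list of frozensets
        (the bins) to a float; a real version would refit the model.
    can_merge: predicate restricting which pairs of bins may be merged.
    """
    bins = [frozenset([l]) for l in layers]
    best = description_length(bins)
    while len(bins) > 1:
        candidates = []
        for a, b in combinations(bins, 2):
            if not can_merge(a, b):
                continue
            # Replace bins a and b by their union, keep the others unchanged.
            trial = [c for c in bins if c not in (a, b)] + [a | b]
            candidates.append((description_length(trial), trial))
        if not candidates:
            break  # no admissible merge left
        dl, trial = min(candidates, key=lambda t: t[0])
        if dl >= best:
            break  # the best merge no longer improves the fit
        best, bins = dl, trial
    return bins, best


# Toy usage: layers are (property X, property Y) pairs, and only bins
# that share a single value of X may be merged.
layers = [("a", 1), ("a", 2), ("b", 1), ("b", 2)]


def same_x(a, b):
    return len({x for x, _ in a | b}) == 1


def toy_dl(bins):
    return len(bins)  # stand-in only: fewer bins is always "better"


bins, best = greedy_layer_binning(layers, toy_dl, can_merge=same_x)
```

Each round tries every admissible pair, so the cost is quadratic in the number of bins per round, with one model fit per trial. That is exactly the slowness I am worried about.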

Thanks,
John

tiago · August 25, 2020, 1:47pm

> 1) How do you access parameters inferred for independent models on separate
> layers, say for reproducing Figure 5(b) from the paper? Unless you're just
> masking the graph and only drawing edges applicable to each layer?

I don't understand the difference between these two things; they seem
the same to me.

But in short: the second option (masking) seems the most straightforward.

(The LayeredBlockState object keeps this information in a manner that is
most efficient for the inference algorithm, but not for visualization.)
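
As a minimal sketch of what masking by layer amounts to, the snippet below groups an edge list by layer label in plain Python. The names split_edges_by_layer, edges and layer_of are made up for illustration and are not part of graph-tool; with an actual graph, the same effect would normally come from an edge filter (for example a GraphView with an efilt) built from the edge property map that holds the layer labels.

```python
from collections import defaultdict


def split_edges_by_layer(edges, layer_of):
    """Group edges by layer label.

    edges: list of (source, target) pairs.
    layer_of: list of layer labels, one per edge, in the same order.
    """
    per_layer = defaultdict(list)
    for edge, layer in zip(edges, layer_of):
        per_layer[layer].append(edge)
    return dict(per_layer)


# Toy data: four edges spread over two layers.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
layer_of = [0, 1, 0, 1]
per_layer = split_edges_by_layer(edges, layer_of)
```

Each entry of per_layer can then be drawn on its own, with vertices coloured by the partition inferred from the full layered model.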

> 2) I've seen you say several times that it's simple to incorporate binning
> of layers into the model, but I haven't been able to figure out how to do so
> - a brief example of how to do so would be hugely appreciated! (or some
> initial pointers if you're pressed for time)

I assume you mean automatic binning; unfortunately this is not yet
implemented in the library. It's in my TODO list, but you know how that
goes.

Best,
Tiago

fitzgeraldj · August 25, 2020, 7:21pm

OK thanks for getting back to me! Looking forward to the update whenever it
may come.

Best,
John

fitzgeraldj · August 26, 2020, 10:59am

Hi again,

On a similar note, can I quickly clarify that if I am comparing a binned
layer model for multigraphs to a suitable null model, the null model of eqn
(A3) corresponds to that of eqn (A2) with \ell in place of l, and then the
further terms displayed?

That is what would make sense to me (i.e. account for the distribution of
edges first to binned layers, then between binned layers and sublayers), but
it would clearly make the additional binned-layer terms cancel, reducing (A3)
back to (A2) in terms of the extra factors (though of course with different
priors over parameters, and with the binning prior also taken into account).
Is that correct?

Thanks,
John

fitzgeraldj · August 26, 2020, 8:17pm

Actually, looking back, is it instead that the null model in (A2) tells you
whether the full set of layers provides useful information, and that to
compare this with models using different bins you need to incorporate the
additional terms of (A3) (rather than only the output likelihood)?

Thanks,
John