Question regarding layered SBM

treinz · February 23, 2017, 2:01am

Hi all,

I'm new to the graph theory field and the graph-tool package. Can anyone help me with the following questions about the SBM for layered graphs:

1) In the example shown in https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#edge-layers-and-covariates, the edge covariates for the Les Misérables network are passed via g.ep.value:

state = gt.minimize_blockmodel_dl(g, deg_corr=False, layers=True,
                                  state_args=dict(ec=g.ep.value, layers=False))

In this case, does the constructed layered model automatically detect how many layers there should be in order to obtain a best fit SBM? If so, how can one retrieve the layer membership of each edge? If not, is there a way to do so in graph-tool via other function calls?

2) There's a so-called 'independent layers' model discussed in the reference: Peixoto, T. P., Phys. Rev. E, 2015, 92, 042807, and it seems that setting state_args=dict(ec=g.ep.value, layers=True) in the example should use this model instead of the edge covariate model. But it seems from the paper that one is required to input the number of layers ('C' as in Fig. 3 of the reference). So how exactly should I use graph-tool to use the 'independent layers' model? Or is the algorithm capable of automatically detecting 'C', the number of layers, from the data?

Thanks,
Tim

tiago · February 23, 2017, 6:39pm

> Hi all,
>
> I'm new to the graph theory field and the graph-tool package. Can anyone help me
> with the following questions about the SBM for layered graphs:
>
> 1) In the example shown in
> Inferring modular network structure — graph-tool 2.58 documentation,
> the edge covariates for the Les Misérables network are passed via g.ep.value:
>
> state = gt.minimize_blockmodel_dl(g, deg_corr=False, layers=True,
>                                   state_args=dict(ec=g.ep.value, layers=False))
>
> In this case, does the constructed layered model automatically detect how
> many layers there should be in order to obtain a best fit SBM? If so, how
> can one retrieve the layer membership of each edge? If not, is there a way
> to do so in graph-tool via other function calls?

Each layer corresponds to a particular value of the g.ep.value property map,
which was passed as the `ec` parameter. There is no need to extract
anything, since this information was provided to the function in the first
place.

> 2) There's a so-called 'independent layers' model discussed in the
> reference: Peixoto, T. P., Phys. Rev. E, 2015, 92, 042807, and it seems that
> setting state_args=dict(ec=g.ep.value, layers=True) in the example should
> use this model instead of the edge covariate model. But it seems from the
> paper that one is required to input the number of layers ('C' as in Fig. 3 of
> the reference). So how exactly should I use graph-tool to use the
> 'independent layers' model? Or is the algorithm capable of automatically
> detecting 'C', the number of layers, from the data?

The number of layers is determined automatically from the supplied `ec`
parameter.
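
To make this concrete, here is a minimal sketch in plain Python of how layers follow from the covariate values: every distinct value is one layer. The `edges` and `edge_values` lists are made-up stand-ins for the edges of `g` and the values in `g.ep.value` (with graph-tool one could collect the latter as `[g.ep.value[e] for e in g.edges()]`). This only illustrates the bookkeeping, not how graph-tool stores layers internally.

```python
from collections import defaultdict

# Hypothetical edge list with one integer covariate per edge; this stands in
# for the edges of g and the values of g.ep.value.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
edge_values = [1, 3, 1, 2, 3]

# Each distinct covariate value defines one layer, so the layer membership
# of an edge is simply its covariate value.
layer_edges = defaultdict(list)
for edge, value in zip(edges, edge_values):
    layer_edges[value].append(edge)

layer_ids = sorted(layer_edges)
num_layers = len(layer_ids)

for layer in layer_ids:
    print(f"layer {layer}: {layer_edges[layer]}")
print("number of occupied layers:", num_layers)
```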

Best,
Tiago

treinz · March 24, 2017, 12:28am

Hi Tiago,

Thank you for the info. Here's a follow-up question. I have a series of networks and I expect them to form clusters in terms of their stochastic block structure, i.e., some networks are similar to each other when their block models are compared. I'm trying to compare them and identify these clusters using the SBM. Is the layered SBM the appropriate way of doing this, and if so, how should I use it? I don't have enough background to fully appreciate what's in the paper even after reading it thoroughly, and I hope you can give me some idea.

Thanks,
Tim

Peter_Straka · March 26, 2017, 10:22pm

Do the networks have the same number of nodes? If so, you could

   - define a variable which has a distinct value for each network in your
   series,
   - use this variable as a layer variable,
   - check whether this formulation reduces the overall description length,
   compared to modelling each network individually.

If description length is reduced, then the layer variable is informative in
forming the blocks. This might not be what you want if you have a time
series, though...
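
A rough sketch of the first two steps, in plain Python and assuming all networks share the same node labels: the series is stacked into a single edge list, and each edge carries the index of the network it came from as its layer label. The `networks` data and the helper name `stack_as_layers` are invented for illustration.

```python
# Hypothetical series of three networks on the same five nodes (0..4),
# each given as a list of undirected edges.
networks = [
    [(0, 1), (1, 2), (3, 4)],
    [(0, 1), (0, 2), (3, 4)],
    [(0, 3), (1, 4), (2, 3)],
]


def stack_as_layers(networks):
    """Merge a series of edge lists into one edge list plus a layer label per edge."""
    edges, layer_of_edge = [], []
    for layer, net in enumerate(networks):
        for u, v in net:
            edges.append((u, v))
            layer_of_edge.append(layer)  # layer label = index of the source network
    return edges, layer_of_edge


edges, layer_of_edge = stack_as_layers(networks)
print(list(zip(edges, layer_of_edge)))
```

The stacked edges and their labels could then be loaded into one graph, with the labels stored in an edge property map and passed as `ec`. The description length of that layered fit is what would be compared against the sum of the description lengths of the separately fitted networks.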
Peter

treinz · March 27, 2017, 5:17pm

Hi Peter,

Thanks for your reply. If I understand you correctly, what you suggest is basically defining a similarity score, clustering the networks into layers, running an SBM on each layer, and then comparing the results?

Thanks,
Tim

Peter_Straka · March 27, 2017, 10:13pm

I don't know what you mean by "similarity score", and I'm not really sure
what you're trying to do, but I assume you're looking for patterns and
clusters. Blockmodels take the
philosophy that if your network data can be compressed effectively by
fitting a blockmodel, then a blockmodel is likely to be a good model for
how your data were generated. In this paper
<https://arxiv.org/pdf/1504.02381> Tiago explains how you can check if the
time/index/sequence variable for your series of networks contains useful
information. You compare the description lengths without and with that
variable (Section IV). That way you could e.g. give evidence for a change
point in the series.
Hope this helps,
Peter

treinz · March 27, 2017, 11:02pm

Hi Peter,

I think I'm confused about how the input and output are related to each other in the layered model. Let's say each network in my data is one of the top three layers of Fig. 1 of the paper you mentioned. I don't have a well-defined sequence variable for the networks; I only know that they're related to each other but not exactly the same. You can think of them as realizations of different perturbed states of the same underlying network, each with some experimental noise. I'm expecting the algorithm to tell me how many of these perturbed states there are in my data and what the SBM for each of these states is. I'm thinking the layered SBM may help me with that. But it seems that in order to use the layered model, I have to first collapse all the networks, which I think would discard a lot of the information in my data, and I don't know how to interpret the output.

Thanks,
Tim

tiago · March 27, 2017, 11:18pm

I don't really understand what you want, exactly, and what you mean by
"perturbed states". Forget about the layered SBM for a moment, and try to
explain clearly and succinctly what your data looks like, and what you want
to obtain in the end.

Best,
Tiago

treinz · March 28, 2017, 12:17am

OK. Fig. 1 and its legend in this paper (http://ieeexplore.ieee.org/document/7442167/) summarize what I stated above.

-Tim

tiago · March 28, 2017, 8:03am

They describe a multilayer SBM, much like the one which is implemented in
graph-tool (the main difference is that in graph-tool the model is
nonparametric).

If you want to find what they call "strata", i.e. groups of layers, all you
need to do is group layers together such that the description length is
minimized, with the addition of a term as described in Eq. 17 of
http://dx.doi.org/10.1103/PhysRevE.92.042807.
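
The mechanics of such a search could look roughly like the plain-Python sketch below: relabel each edge by the group its layer belongs to, score each candidate grouping, and keep the one with the smallest score. Everything here is illustrative. `merge_layers`, `best_grouping` and the candidate groupings are hypothetical names and data, and `toy_description_length` is only a placeholder so the example runs; a real score would fit the layered SBM on the merged labels and add the extra term from the paper, neither of which is implemented here.

```python
def merge_layers(layer_of_edge, group_of_layer):
    """Relabel each edge by the group (stratum) its layer belongs to."""
    return [group_of_layer[layer] for layer in layer_of_edge]


def best_grouping(layer_of_edge, candidates, description_length):
    """Return the candidate grouping with the smallest description length."""
    return min(
        candidates,
        key=lambda grouping: description_length(merge_layers(layer_of_edge, grouping)),
    )


def toy_description_length(group_of_edge):
    # PLACEHOLDER score, not a real description length: it just counts the
    # number of groups so that the selection step can be demonstrated.
    return len(set(group_of_edge))


# Hypothetical layer label per edge (three layers, two edges each).
layer_of_edge = [0, 0, 1, 1, 2, 2]

# Hypothetical candidate groupings, each mapping layer -> group.
candidates = [
    {0: 0, 1: 1, 2: 2},  # every layer on its own
    {0: 0, 1: 0, 2: 1},  # layers 0 and 1 merged
    {0: 0, 1: 0, 2: 0},  # all layers merged
]

best = best_grouping(layer_of_edge, candidates, toy_description_length)
print(best)
```

With many layers the number of possible groupings grows quickly, so in practice the candidates would come from some search heuristic rather than an explicit list.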

Best,
Tiago

treinz · March 28, 2017, 10:48pm

I don't have the prior for how the layers should be grouped together. I suppose I need equation 18 also? Are all of these implemented in graph-tool? If so, can you specify the functions in graph-tool for this? It doesn't seem the LayeredBlockState constructor takes multiple graphs or multilayer graphs as input.

Thanks,
Tim

tiago · April 4, 2017, 1:45pm

> I don't have the prior for how the layers should be grouped together. I
> suppose I need equation 18 also? Are all of these implemented in graph-tool?

You don't need a prior; the complete procedure is described in the paper.

Eq. 18 is not implemented in graph-tool yet, but it is easy to compute.

> If so, can you specify the functions in graph-tool for this? It doesn't seem
> the LayeredBlockState constructor takes multiple graphs or multilayer graphs
> as input.

This is covered in the HOWTO:

Inferring modular network structure — graph-tool 2.58 documentation

Best,
Tiago