Using an edge covariate as a hyper parameter for the SBM

Hi there,

I am trying to include the edge weights by taking into account an edge
covariate matrix for the nested block model inference. Each time I run
the code on my data set I get slightly different results, both in the
number of blocks and in the nodes assigned to each block.

This is my code:
state = minimize_nested_blockmodel_dl(g,
                                      state_args=dict(recs=[g.edge_properties["weight"]],
                                                      rec_types=["discrete-geometric"]))
state.draw(edge_color=prop_to_size(g.edge_properties["weight"],
                                   power=1, log=True),
           ecmap=(matplotlib.cm.gist_heat, .6),
           eorder=g.edge_properties["weight"],
           edge_pen_width=prop_to_size(g.edge_properties["weight"],
                                       1, 4, power=1, log=True),
           edge_gradient=[],
           vertex_text=g.vertex_properties["attribute"],
           vertex_text_position="centered",
           vertex_text_rotation=g.vertex_properties['text_rotation'],
           vertex_font_size=10,
           vertex_font_family='mono',
           vertex_anchor=0,
           output_size=[1024*2, 1024*2],
           output="DiscreteGeometric_%s.pdf" % (eventName))

I would appreciate it if you could explain what your approach would be and
how I can run graph-tool using the covariance matrix of edges in order to
get statistically reliable results.

Is there also any way to get the full posterior of each node belonging to
each block?

Thanks in advance.

Hi there,

I am trying to include the edge weights by taking into account an edge covariate matrix for the nested block model inference. Each time I run the code on my data set I get slightly different results, both in the number of blocks and in the nodes assigned to each block.

This is because the inference is done using MCMC, which is a stochastic
algorithm. You have to run it multiple times and select the result with the
largest posterior probability (if you only want a point estimate).

This is my code:
state = minimize_nested_blockmodel_dl(g,
                                      state_args=dict(recs=[g.edge_properties["weight"]],
                                                      rec_types=["discrete-geometric"]))
state.draw(edge_color=prop_to_size(g.edge_properties["weight"],
                                   power=1, log=True),
           ecmap=(matplotlib.cm.gist_heat, .6),
           eorder=g.edge_properties["weight"],
           edge_pen_width=prop_to_size(g.edge_properties["weight"],
                                       1, 4, power=1, log=True),
           edge_gradient=[],
           vertex_text=g.vertex_properties["attribute"],
           vertex_text_position="centered",
           vertex_text_rotation=g.vertex_properties['text_rotation'],
           vertex_font_size=10,
           vertex_font_family='mono',
           vertex_anchor=0,
           output_size=[1024*2, 1024*2],
           output="DiscreteGeometric_%s.pdf" % (eventName))

Although it is not important for the questions you have raised, it is not very
useful to post incomplete code. Normally, for troubleshooting purposes, it
is necessary for you to provide a _minimal_ and _self-contained_ program
that anyone could execute and verify the problem you are reporting.

I would appreciate it if you could explain what your approach would be and
how I can run graph-tool using the covariance matrix of edges in order to
get statistically reliable results.

This is covered in detail in the HOWTO:

   https://graph-tool.skewed.de/static/doc/demos/inference/inference.html

and also in many papers, e.g.

   https://arxiv.org/abs/1705.10225
   https://arxiv.org/abs/1708.01432

However, I'm not sure what you mean by "covariance matrix of edges". The
approach in question deals with graphs with edge covariates (a.k.a.
weights). A covariance matrix usually refers to something else.

Is there also any way to get the full posterior of each node belonging to
each block?

This is also explained in detail in the HOWTO:

https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#sampling-from-the-posterior-distribution
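
As a rough illustration of what "the posterior of each node belonging to each block" means in practice, here is a minimal sketch of the counting step only. It assumes you have already collected a list of sampled partitions from the MCMC, and that a given block label refers to the same group in every sample; the HOWTO linked above describes graph-tool's own facilities for collecting these, which should be preferred. The function name vertex_marginals is made up for this example and is not part of graph-tool.

```python
from collections import Counter


def vertex_marginals(samples):
    """Estimate P(node i is in block r) as the fraction of samples with b[i] == r.

    `samples` is a list of partitions; each partition is a sequence of block
    labels, one per node. ASSUMPTION: labels are comparable across samples.
    """
    if not samples:
        raise ValueError("need at least one sampled partition")
    n = len(samples[0])
    counts = [Counter() for _ in range(n)]
    for b in samples:
        if len(b) != n:
            raise ValueError("all partitions must cover the same nodes")
        for i, r in enumerate(b):
            counts[i][r] += 1
    total = len(samples)
    # One dict per node, mapping block label -> estimated probability.
    return [{r: c / total for r, c in node.items()} for node in counts]


# Four hypothetical sampled partitions of a three-node graph.
samples = [[0, 0, 1], [0, 1, 1], [0, 0, 1], [0, 0, 1]]
marginals = vertex_marginals(samples)
```

In this toy input, node 1 is placed in block 0 in three of the four samples, so its estimated marginal for block 0 is 0.75.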

Best,
Tiago

In my network, besides the information about which two nodes form an edge,
I also have the duration for which each edge lasted. I included this
information as a weight and used it as the covariate of the SBM. The
results seem more reasonable than when no weights are considered. However,
the number of blocks changes slightly each time I run my script with the
piece of code given before. So I was wondering whether I should run the
minimize_nested_blockmodel_dl function with a higher number of MCMC
iterations as an argument, to get more accurate results with higher
confidence, or whether I just need to call this function repeatedly in a
loop and then compute the mean number of blocks. I hope my question makes
sense.

Thanks again.
Zahra

You should run the algorithm multiple times, and choose the result with the
smallest description length. You get this value via the method state.entropy().
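
A minimal sketch of this "run several times, keep the best" loop follows. The helper names best_of and fit_weighted_nested_sbm are invented for this example; the only graph-tool calls used are minimize_nested_blockmodel_dl (with the same state_args as in the original post) and state.entropy(). The number of runs is an arbitrary choice, and the graph-tool part assumes a graph g with an edge property named "weight".

```python
def best_of(fit, n_runs=10):
    """Call fit() n_runs times and return the state with the smallest entropy()."""
    best_state, best_value = None, float("inf")
    for _ in range(n_runs):
        state = fit()
        value = state.entropy()
        if value < best_value:
            best_state, best_value = state, value
    return best_state


def fit_weighted_nested_sbm(g):
    # Imported here so that best_of() can be used without graph-tool installed.
    from graph_tool.inference import minimize_nested_blockmodel_dl
    return minimize_nested_blockmodel_dl(
        g,
        state_args=dict(recs=[g.edge_properties["weight"]],
                        rec_types=["discrete-geometric"]))


# Usage, given a graph-tool graph g with an edge property "weight":
#     state = best_of(lambda: fit_weighted_nested_sbm(g), n_runs=10)
```

Note that this selects a single fit rather than averaging the number of blocks over runs.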

Best,
Tiago

Thanks a lot for the advice!

Hi Tiago,

For the non-parametric weighted SBMs, how can I extract the "description
length" from the state.entropy() method? Is it equivalent to taking the
maximum entropy value after running the algorithm multiple times?

I also have a theoretical question. I have read most of your recent papers
and I see this statement, but I could not find an explanation of why it is
the case: why do you use the "micro-canonical formulation"? You stated that
"it approaches to the canonical distributions asymptotically". If you have
explained it in one of your papers, would you kindly refer me to the right
paper?

Thanks in advance.

Best,
Zahra

For the non-parametric weighted SBMs, how can I extract the "description
length" from the state.entropy() method? Is it equivalent to taking the
maximum entropy value after running the algorithm multiple times?

The entropy() method returns the negative joint log-likelihood of the data
and model parameters. For discrete data and model parameters, this equals
the description length.

For the weighted SBM with continuous covariates, the data and model are no
longer discrete, so this value can no longer be called a description length,
although it plays the same role. However, for discrete covariates, it is the
description length.
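
Since the value is a negative joint log-likelihood, the difference between two fits of the same data translates directly into a posterior odds ratio, because the normalization constant cancels. The sketch below illustrates this; it assumes the entropy values are in nats (natural logarithm), and the function name posterior_odds is made up for this example.

```python
import math


def posterior_odds(entropy_a, entropy_b):
    """Return P(fit a | data) / P(fit b | data) for two fits of the same data.

    Both arguments are values of state.entropy(), i.e. -ln P(data, parameters).
    ASSUMPTION: the values are in nats.
    """
    return math.exp(-(entropy_a - entropy_b))


# A fit whose entropy is 2 nats smaller is exp(2) (about 7.4) times more probable.
odds = posterior_odds(100.0, 102.0)
```

This is why the smallest value, not the largest, identifies the preferred fit.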

I also have a theoretical question. I have read most of your recent papers
and I see this statement, but I could not find an explanation of why it is
the case: why do you use the "micro-canonical formulation"? You stated that
"it approaches to the canonical distributions asymptotically". If you have
explained it in one of your papers, would you kindly refer me to the right
paper?

The microcanonical model is identical to the canonical model, if the latter
is integrated over its continuous parameters using uninformative priors, as
explained in detail here:

    https://arxiv.org/abs/1705.10225

Therefore, in a Bayesian setting, it makes no difference which one is used,
as they yield the same posterior distribution.

The main reason to use the microcanonical formulation is that it makes it
easier to extend the Bayesian hierarchy, i.e. include deeper priors and
hyperpriors, thus achieving more robust models without a resolution limit,
accepting of arbitrary group sizes and degree distributions, etc. Within the
canonical formulation, this is technically more difficult.

Best,
Tiago

Hi Tiago,

Thanks for the explanation. I have another question:

In "Inferring the mesoscale structure of layered, edge-valued and
time-varying networks", you compared two ways of constructing layered
structures. In the first approach, you assumed an adjacency matrix in each
independent layer. In the second, the collapsed graph is considered to be
the result of merging all the adjacency matrices together.

I am wondering how I can use graph_tool for the first method? Which method
or class should I use? If there is a class, is it still possible to
consider a graph with weighted edges?

Thanks again.

Regards,
Zahra

Hi Tiago,

Thanks for the explanation. I have another question:

In "Inferring the mesoscale structure of layered, edge-valued and
time-varying networks", you compared two ways of constructing layered
structures. In the first approach, you assumed an adjacency matrix in each
independent layer. In the second, the collapsed graph is considered to be
the result of merging all the adjacency matrices together.

I am wondering how I can use graph_tool for the first method? Which method
or class should I use?

You have to pass the option "layers=True" to the LayeredBlockState constructor:

https://graph-tool.skewed.de/static/doc/inference.html#graph_tool.inference.layered_blockmodel.LayeredBlockState

If there is a class, is it still possible to consider
a graph with weighted edges?

Yes, it accepts 'recs/rec_types/rec_params' just like the regular BlockState.

Best,
Tiago

Hi Tiago,

Thanks for the reply. In section VI of your paper "Inferring the
mesoscale structure of layered, edge-valued and time-varying networks", you
used the layered stochastic block model for a temporal network. I have a
similar data set, but I do not want to fix each node's membership to the
same block across all layers (nodes can change their block memberships over
time). How can I use graph-tool for this case? Which method or constructor
should I use?

Regards,
Zahra

It's always the same constructor, LayeredBlockState. To allow the membership
to change across layers, you need to set overlap=True.

Hi Tiago,

I have another naive question: it is still not clear to me how I can pass
the different weighted graphs of each timestamp to the LayeredBlockState
constructor so that each of these weighted graphs is treated as one layer
of the multilayer network, because the input is just a single graph.

Regards,
Zahra

Right, you have to collapse them into a single graph with multiple edges. The
time-stamp on the edges (i.e. the "layers") should be stored as a property
map that you pass as the 'ec' parameter to LayeredBlockState.
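
A minimal sketch of this collapsing step is given below, under several assumptions: the per-timestamp graphs are available as plain edge lists of (u, v, weight) tuples, nodes are integer indices shared by all timestamps, and weights are integers. The helper names collapse_layers and build_layered_state are invented for this example. The graph-tool part uses only the 'ec', 'layers', 'recs' and 'rec_types' arguments mentioned in this thread; it is an illustration, and the exact behaviour may differ between graph-tool versions.

```python
def collapse_layers(snapshots):
    """Merge per-timestamp weighted edge lists into rows (u, v, layer, weight).

    `snapshots` maps a timestamp to a list of (u, v, weight) tuples.
    Timestamps are mapped to consecutive layer indices 0, 1, 2, ...
    """
    rows = []
    for layer, t in enumerate(sorted(snapshots)):
        for u, v, w in snapshots[t]:
            rows.append((u, v, layer, w))
    return rows


def build_layered_state(rows):
    # Imported lazily so that collapse_layers() works without graph-tool.
    from graph_tool import Graph
    from graph_tool.inference import LayeredBlockState

    g = Graph(directed=False)
    layer = g.new_edge_property("int")
    weight = g.new_edge_property("int")
    # The extra columns of each row fill the property maps, in order.
    g.add_edge_list(rows, eprops=[layer, weight])
    # layers=True selects the independent-layers variant discussed above;
    # recs/rec_types carry the edge covariates, as for the regular BlockState.
    return LayeredBlockState(g, ec=layer, layers=True,
                             recs=[weight],
                             rec_types=["discrete-geometric"])


# Two hypothetical timestamps; the edge (0, 1) appears in both.
snapshots = {10: [(0, 1, 3)], 20: [(0, 1, 5), (1, 2, 2)]}
rows = collapse_layers(snapshots)
```

The edge between nodes 0 and 1 ends up as two parallel edges in the collapsed graph, one per layer, each with its own weight.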