Weighted SBM | weight prediction

Adrien_Dulac · September 26, 2018, 12:43pm

Dear all,

I am a bit confused about how to use the weighted network models for a
weight prediction task.

Suppose we have a weighted network where the edge weights are integers.
We fit an SBM with a Poisson kernel as follows:

# The adjacency matrix has integer entries, and weights greater than
# zero are stored in data.ep.weights.
data = gt.load_graph(...)

state = gt.inference.minimize_blockmodel_dl(data, B_min=10, B_max=10,
    state_args={'recs': [data.ep.weights],
                'rec_types': ["discrete-poisson"]})

My question is: how can we obtain, from state, a point estimate of the
Poisson parameters in order to compute the distribution of the weights
between pairs of nodes?

Regards,
Adrien Dulac

tiago · September 26, 2018, 1:21pm

It's not this simple, since the model is microcanonical and contains
hyperpriors, etc. The easiest thing you can do is compute the conditional
posterior distribution of an edge and its weight. You get this by adding the
missing edge with the desired weight to the graph, and computing the
difference in the state.entropy(), which gives the (un-normalized) negative
log probability (remember you have to copy the state with
state.copy(g=g_new), after modifying the graph). By normalizing this over
all weight values, you have the conditional posterior distribution of the
weight.

(This could be done faster by using BlockState.get_edges_prob(), but that
does not support edge covariates yet.)
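
The procedure described above can be sketched roughly as follows. This is only an illustration, not a tested recipe: weight_posterior and make_entropy_fn are hypothetical helper names, the graph is assumed to store its weights in the internal property g.ep.weights as in the first post, and the graph is copied for every candidate weight so that the fitted state is never modified. The normalization is done in log space, and the entropy of the original state is not subtracted since it cancels when normalizing.

```python
import math


def weight_posterior(entropy_with_weight, candidate_weights):
    """Normalize exp(-S(w)) over the candidate weights, working in log space.

    entropy_with_weight(w) must return the entropy of the state after the
    edge with weight w has been added.
    """
    neg_s = [-entropy_with_weight(w) for w in candidate_weights]
    m = max(neg_s)  # shift by the maximum so the exponentials do not underflow
    log_z = m + math.log(sum(math.exp(x - m) for x in neg_s))
    return {w: math.exp(x - log_z) for w, x in zip(candidate_weights, neg_s)}


def make_entropy_fn(state, g, u, v, rec_type="discrete-poisson"):
    """Hypothetical helper: builds entropy_with_weight for the pair (u, v).

    Assumes `state` was fitted on `g` with recs=[g.ep.weights], as in the
    first post of this thread.
    """
    def entropy_with_weight(w):
        g_new = g.copy()               # leave the fitted graph untouched
        e = g_new.add_edge(u, v)       # add the missing edge ...
        g_new.ep.weights[e] = w        # ... with the desired weight
        new_state = state.copy(g=g_new, recs=[g_new.ep.weights],
                               rec_types=[rec_type])
        return new_state.entropy()
    return entropy_with_weight
```

With a fitted state, one would then call something like weight_posterior(make_entropy_fn(state, g, x, y), range(1, 101)), where the range of candidate weights is an arbitrary choice for illustration. Copying the graph and the state for every candidate is slow, so this is only practical for a few node pairs.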

Best,
Tiago

Adrien_Dulac · February 3, 2019, 2:28am

Dear Tiago,
thank you for this suggestion.

I tried this, but I am not sure about the results I got; maybe my
computation is wrong?
I proceeded as follows:

[...]
entropy = state.entropy()
e = g.add_edge(x, y)
g.ep.weights[e] = 42
new_state = state.copy(g=g, recs=[g.ep.weights],
                       rec_types=['discrete-poisson'])
new_entropy = new_state.entropy()

# Here is the kind of value that I obtain for the entropy in my working
# example, where the graph has N = 167, E = 5787, max weight = 1458,
# mean weight = 14 (this is the manufacturing email network from KONECT)

In [552]: entropy
Out[552]: 72938.4714059238
In [553]: new_entropy
Out[553]: 109646.67346672397

Thus, as far as I understand, to compute the conditional posterior
probability of the weight I set, we compute np.exp(entropy - new_entropy).
But as the difference is big, the exponential is always zero.

I tried with different nodes and weights but always obtained the same
kind of result.

I wonder whether there is an error in my approach to computing the
probability of a missing edge with a given covariate/weight?

Thanks,
adrien

tiago · February 28, 2019, 11:52pm

The probability is only _proportional_ to this number. As I said, the
posterior obtained in this way is unnormalized. Hence, it does not make
sense to do this for a single edge. You have to do it for more than one
edge, and normalize to obtain the _relative_ probability between them.
Alternatively, you can do it for several weight values for the same edge,
and then normalize between them.
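
To illustrate with the numbers posted above: a single entropy difference exponentiates to zero in double precision, whereas the ratio between two candidates depends only on the difference of their entropies and stays representable. In the small sketch below, new_entropy_other is an invented value for a second candidate (for instance another weight on the same edge); it is not taken from any real computation.

```python
import math

# Values reported earlier in this thread.
entropy = 72938.4714059238        # state without the extra edge
new_entropy = 109646.67346672397  # state with the edge of weight 42 added

# Un-normalized log probability of the single candidate.
log_p = entropy - new_entropy
p_single = math.exp(log_p)        # underflows to 0.0, which says nothing by itself

# Hypothetical entropy for a second candidate; invented for illustration only.
new_entropy_other = new_entropy + 2.5

# The unknown normalization constant cancels in the ratio:
# log[P(first) / P(other)] = S(other) - S(first)
log_ratio = new_entropy_other - new_entropy
ratio = math.exp(log_ratio)

print(log_p, p_single, ratio)
```

With more than two candidates, the same idea applies: subtract the smallest entropy from all of them before exponentiating, then divide by the sum.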

This is explained with an example in the howto:

https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#edge-prediction-as-binary-classification

To get normalized marginal distributions for single edges, it is necessary
to use the network reconstruction framework, but this still needs to be
updated for edge covariates.