Assortative structure in signed networks (PPBlockState + covariates)

Hi,

Could I please have some of the documentation elaborated on?

I have a signed, weighted graph (edges are floats ranging from -1 to 1) which I want to perform clustering on. Ideally, I want to search for exclusively assortative structures, so maximise the number/sum of positive edges within groups and maximise the number/sum of negative edges between groups.

I see I can search exclusively for assortative structures using PPBlockState.

I also see suggestions that I either (a) shift my weights to be positive, or (b) add abs(weight) and sign as two separate covariates with normal and binomial distributions.

My question is: does it make sense to combine these? It doesn’t seem that the covariates’ role in graph assortativity will necessarily be defined using this method.

Dear Alison,

Unfortunately, at the moment the strictly assortative model implemented in PPBlockState does not support edge covariates. For that, you need to use BlockState/NestedBlockState/etc, which will find arbitrary mixing patterns.

I plan to add this functionality in the future. If you want to keep track of this — and be notified — then please open a feature request at https://graph-tool.skewed.de/issues.

If you want to use BlockState/NestedBlockState in your case, then indeed you have several options.

For example, if your weights are in the open interval (-1,1) — i.e., the exact values -1 and +1 are not present — then you can:

  1. Convert your weights from the range (-1,1) to (-\infty, \infty) with a transformation of the type y = \operatorname{arctanh}(x), and then you can use the "real-normal" edge covariate type.
  2. Split each covariate into two values x \to (s,y), where s=(\operatorname{sign}(x)+1)/2 \in \{0, 1\} and y=-\log(1-|x|)\in [0,\infty). For s you use "discrete-binomial" with M=1, and for y "real-exponential".

These amount to different generative models, and the most appropriate one will depend on your actual data.
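As a rough illustration, the two transformations above could be sketched in NumPy as follows. The function names are made up for this example, and mapping a weight of exactly zero to s = 0 is an arbitrary assumption. The last function shows one way the transformed covariates might be passed to graph-tool via `state_args` (`recs` and `rec_types`); it needs graph-tool installed and is not exercised here.

```python
import numpy as np


def to_real_line(x):
    """Option 1: map weights in (-1, 1) to (-inf, inf) with arctanh."""
    x = np.asarray(x, dtype=float)
    return np.arctanh(x)


def split_sign_magnitude(x):
    """Option 2: split each weight into a binary sign s and a magnitude y."""
    x = np.asarray(x, dtype=float)
    # Assumption: zero weights get s = 0; only strictly positive weights get 1.
    s = (x > 0).astype(int)
    # y = -log(1 - |x|), computed with log1p for accuracy near |x| = 0.
    y = -np.log1p(-np.abs(x))
    return s, y


def fit_with_covariates(g, recs, rec_types):
    """Fit an SBM (arbitrary mixing patterns) with edge covariates.

    `recs` is a list of edge property maps and `rec_types` the matching
    list of covariate types, e.g. ["real-normal"] for option 1 or
    ["discrete-binomial", "real-exponential"] for option 2.
    """
    import graph_tool.all as gt  # imported lazily; requires graph-tool

    return gt.minimize_blockmodel_dl(
        g, state_args=dict(recs=recs, rec_types=rec_types)
    )
```

Since the two options are different generative models, one way to choose between them is to fit both and compare how well each describes the data.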

Best,
Tiago

Dear Tiago,

Thank you, that is good to know and very helpful.

My edge weights lie on the closed interval [-1, 1]. Could you advise how best to
transform those?

Kind Regards,
Alison

What does the distribution of covariates look like? Is it concentrated at the endpoints \{+1,-1\}, around 0, or is it very spread out?

It is probably more concentrated near zero but with a few approaching 0.9.
It’s a network of correlations, so theoretically it could include ±1.

Then it’s probably fine to use the transformations:

  1. y=\operatorname{arctanh}(x \times C)
  2. y=-\log(1-|x| \times C)

where C < 1 is one epsilon away from one, e.g. C=1 - 1/N, where N is the size of the vectors you used to compute the correlations.
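A small numerical sketch of this shrinkage, assuming a hypothetical N = 100 observations per correlation and a made-up weight vector that includes the endpoints ±1:

```python
import numpy as np


def shrink(x, n_samples):
    """Scale weights by C = 1 - 1/N so that |x| = 1 lands strictly inside (-1, 1)."""
    c = 1.0 - 1.0 / n_samples
    return np.asarray(x, dtype=float) * c


# Hypothetical correlation weights, including the endpoints -1 and +1.
x = np.array([-1.0, -0.3, 0.0, 0.9, 1.0])
N = 100  # hypothetical length of the vectors used to compute the correlations

xc = shrink(x, N)
y1 = np.arctanh(xc)            # transformation 1: arctanh(x * C)
y2 = -np.log1p(-np.abs(xc))    # transformation 2: -log(1 - |x| * C)
```

With this choice of C, both transformed covariates stay finite at |x| = 1, and for the second transformation the largest possible value is -log(1/N) = log N.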