Questions about PPBlockState

Hello,
I'm testing the new Planted Partition model in graph-tool on my data, and I'm indeed finding interesting results. I have some questions/observations, though.
- PPBlockState returns a relatively large number of partitions on large networks, which is fine and expected. With the NSBM, instead, I make use of the hierarchy, not only because I can "abstract" partitions up to a certain level, but also because the hierarchy has a meaning in my case. Is there (or will there be) a hierarchical formulation of PPBlockState?
- I tried multiple initialisations of PPBlockState on my graph and also tried increasing the number of iterations of the initial MCMC sweep, and I'd say I get very consistent results. Is this expected? I mean, is it known whether PPBlockState converges to a stable solution faster and more consistently?
- Does the time required to converge scale with the number of edges, as it does for the SBM?
- As far as I understand, if assortativity is the dominant pattern, the difference between PP and NSBM is negligible. I don't know how to quantify "negligible", as the differences in entropies are at least on the order of 1e2 in the cases I tested (which seems pretty large to me). I would be happy to switch to PP, also given the shorter runtime so far, but I'm a bit concerned about these differences.

Best

d

Hi Davide,

> - PPBlockState returns a relatively large number of partitions on large networks, which is fine and expected. With the NSBM, instead, I make use of the hierarchy, not only because I can "abstract" partitions up to a certain level, but also because the hierarchy has a meaning in my case. Is there (or will there be) a hierarchical formulation of PPBlockState?

A hierarchical prior for the PP model is certainly feasible, and it is
something that could come up in the future, but I can't promise when.

> - I tried multiple initialisations of PPBlockState on my graph and also tried increasing the number of iterations of the initial MCMC sweep, and I'd say I get very consistent results. Is this expected? I mean, is it known whether PPBlockState converges to a stable solution faster and more consistently?

This depends a lot on the underlying data. If the model is a good fit,
then this consistency is expected; otherwise it's not. It will not
necessarily behave like this for every dataset.

> - Does the time required to converge scale with the number of edges, as it does for the SBM?

Like in the SBM, the MCMC sweeps take time proportional to the number of
edges, but the multiplicative factor is smaller, since the model is simpler.

> - As far as I understand, if assortativity is the dominant pattern, the difference between PP and NSBM is negligible. I don't know how to quantify "negligible", as the differences in entropies are at least on the order of 1e2 in the cases I tested (which seems pretty large to me). I would be happy to switch to PP, also given the shorter runtime so far, but I'm a bit concerned about these differences.

I do not recommend simply switching to PP for every analysis. As was
described in the paper, the SBM is still a more powerful model that is
capable of better capturing the network structure in a wider variety of
cases.

To answer your question, you can test whether the two models give
similar answers by comparing their partitions. You can use the
partition_overlap() function for that.
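
To make concrete what such a comparison measures, here is a minimal sketch of a maximum-overlap score between two label vectors, in plain Python. It is not graph-tool's implementation of partition_overlap(); the function name partition_overlap_sketch is made up for illustration, and the brute-force matching is only practical for a handful of groups. The idea is that group labels are arbitrary, so the labels of one partition are matched one-to-one to the labels of the other in the way that maximizes the number of nodes that agree.

```python
from itertools import permutations


def partition_overlap_sketch(x, y):
    """Fraction of nodes that agree under the best one-to-one relabelling.

    Illustrative only: brute force over label matchings, so it is
    usable for a few groups, not for real community-detection output.
    """
    if len(x) != len(y):
        raise ValueError("both partitions must label the same nodes")
    if len(x) == 0:
        return 1.0

    # Contingency table: nodes with label r in x and label s in y.
    counts = {}
    for r, s in zip(x, y):
        counts[(r, s)] = counts.get((r, s), 0) + 1

    labels_x = list(set(x))
    labels_y = list(set(y))

    # Pad the shorter list so that surplus groups can stay unmatched.
    size = max(len(labels_x), len(labels_y))
    labels_x += [None] * (size - len(labels_x))
    labels_y += [None] * (size - len(labels_y))

    # Try every one-to-one matching and keep the best agreement count.
    best = 0
    for perm in permutations(labels_y):
        matched = sum(counts.get((r, s), 0) for r, s in zip(labels_x, perm))
        best = max(best, matched)
    return best / len(x)


# Same split with swapped label names: perfect overlap.
print(partition_overlap_sketch([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0]))  # 1.0
# One partition lumps everything into a single group: half the nodes agree.
print(partition_overlap_sketch([0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 0, 0]))  # 0.5
```

A value close to 1 would indicate that the PP and SBM fits divide the nodes in nearly the same way, whatever their description lengths are.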

Comparing the description lengths is useful for selecting the best-fitting
model, but not for telling whether the two models give similar answers.
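
As a rough illustration of why a description-length gap of order 1e2 is decisive for model selection, the sketch below converts a gap into posterior odds. It assumes the description lengths are in nats (as the entropy values reported by graph-tool are, to my knowledge) and that both models are equally likely a priori, so that the odds are exp(-difference). The numbers are hypothetical, not taken from any dataset in this thread, and the function name is made up.

```python
import math


def log10_posterior_odds(dl_a, dl_b):
    """log10 of the posterior odds of model A over model B.

    Assumes description lengths in nats and equal prior odds,
    so the odds are exp(-(dl_a - dl_b)).
    """
    return -(dl_a - dl_b) / math.log(10)


# Hypothetical description lengths, 100 nats apart.
dl_pp = 10100.0
dl_sbm = 10000.0
print(log10_posterior_odds(dl_pp, dl_sbm))  # about -43.4
```

Under these assumptions, a gap of 100 nats means the model with the shorter description length is favoured by more than 43 orders of magnitude, yet this says nothing about how much the two inferred partitions differ.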

Best,
Tiago

Hi Tiago,

> A hierarchical prior for the PP model is certainly feasible, and it is
> something that could come up in the future, but I can't promise when.

Would it make sense to build a graph from the partition (with edges weighted by the connectivity between groups) and then apply the NSBM to it? This would produce a mixed model where the deepest level is actually a PP and the hierarchy doesn't impose any constraint on assortativity.

> I do not recommend simply switching to PP for every analysis. As was
> described in the paper, the SBM is still a more powerful model that is
> capable of better capturing the network structure in a wider variety of
> cases.

You are so right. I have some datasets in which PP doesn't seem to perform as well as NSBM.

> To answer your question, you can test whether the two models give
> similar answers by comparing their partitions. You can use the
> partition_overlap() function for that.

Ok, thanks. Another function I didn't know.

Best,

d

I'm not sure this makes sense.

The nested SBM is a generative model where the upper layers are the
parameters of the layer below. The PP model does not take a full matrix
of connections between groups as parameters; it has only the number of
edges inside communities and the number of edges between them. So those
two models do not fit together.
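
The mismatch can be seen in a small sketch that computes both kinds of summary for the same toy graph and partition. This is plain Python with made-up function names and a hypothetical edge list, not graph-tool code: the first function returns the full table of edge counts between every pair of groups (the kind of quantity that the next level of a nested SBM describes), while the second returns the only two counts the PP model uses.

```python
from collections import Counter


def block_edge_counts(edges, b):
    """Edge counts between every pair of groups (undirected)."""
    counts = Counter()
    for u, v in edges:
        r, s = sorted((b[u], b[v]))
        counts[(r, s)] += 1
    return dict(counts)


def pp_edge_counts(edges, b):
    """The two PP summaries: edges inside groups, edges between groups."""
    e_in = sum(1 for u, v in edges if b[u] == b[v])
    return e_in, len(edges) - e_in


# Hypothetical toy graph: two triangles and a pair, loosely connected.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
         (2, 3), (6, 7), (5, 6)]
b = [0, 0, 0, 1, 1, 1, 2, 2]  # group label of each node

print(block_edge_counts(edges, b))
# {(0, 0): 3, (1, 1): 3, (0, 1): 1, (2, 2): 1, (1, 2): 1}
print(pp_edge_counts(edges, b))
# (7, 2)
```

The pair of totals (7, 2) cannot be expanded back into the per-pair table, which is one way to see why a PP bottom level does not supply what the upper levels of the nested SBM would need to model.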