Disambiguation of network partitions across multiple modes of the posterior distribution

Hi Prof. Peixoto,

Question 1)
Just wanted to confirm my understanding.
In your paper (DOI:10.1103/PhysRevX.11.021003) you propose a method to disambiguate communities and find a consensus across partitions.
You then propose a further method to reveal multiple consensuses in heterogeneous distributions.
Are the community labels still disambiguated across the partition modes?
I believe so, since Figure 8 suggests it, but I wanted to confirm.

Question 2)
The method for finding dissensus between partitions works for a single graph. But will it work across multiple graphs with the same set of nodes? My dataset consists of different graphs with the same set of nodes. I can find partition modes for a single graph. But can I concatenate partitions across graphs and find partition modes?

Many thanks,
Govinda

Yes, of course.

The method works on any set of labelled partitions, as long as they all have the same size.
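
As a minimal sketch of the size requirement stated above, the snippet below pools partition samples from several graphs and checks that every label vector has the same length before the pooled set is used for mode finding. The helper name `pool_partitions` and the toy label vectors are hypothetical and only serve as illustration; `gt.ModeClusterState` is mentioned in a comment but not called here.

```python
def pool_partitions(partitions_per_graph):
    """Concatenate partition samples from several graphs into one list.

    partitions_per_graph: one entry per graph, each a list of label vectors.
    Raises ValueError unless all label vectors have the same length.
    """
    pooled = [list(b) for bs in partitions_per_graph for b in bs]
    sizes = {len(b) for b in pooled}
    if len(sizes) != 1:
        raise ValueError(f"partitions have different sizes: {sorted(sizes)}")
    return pooled


# Hypothetical toy data: two graphs on the same four nodes.
graph_a = [[0, 0, 1, 1], [1, 1, 0, 0]]
graph_b = [[0, 1, 1, 1]]

pooled = pool_partitions([graph_a, graph_b])
print(len(pooled), len(pooled[0]))

# Assumption: the pooled list could then be passed on, e.g.
# pmode = gt.ModeClusterState(pooled)
```

The check only enforces equal length. Whether position i refers to the same unit in every graph is a modelling question that the code cannot answer.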

thank you!

So you mean the nodes don’t need to correspond to each other across graphs?

Extending this to the overlapping SBM: different graphs have the same nodes but different edges, so edges don’t correspond across graphs. The total number of edges is the same across graphs.
So I can find dissensus among OSBM partitions (defined on half-edges) across different graphs.
Am I right in saying that?

I am asking because of the following:
A partition is a vector of labels over nodes or half-edges. Aligning labels on half-edges that don’t refer to the same unit across graphs may be meaningless for group-level inference.
The method makes sense for individual-level inference (a single graph).
Or am I missing something?

If it makes sense to compare the different partitions that have the same size, then it makes sense to use the method. Otherwise it doesn’t.

Obviously not, since the different graphs will in general have a different number of half-edges.

My code for aligning some partitions seems problematic.
Some labels are not aligned appropriately.

This is the code:

import os
import pickle
import numpy as np
import pandas as pd
import graph_tool.all as gt

import matplotlib.pyplot as plt
import seaborn as sns
import colorcet as cc

# load data
gt.seed_rng(100)
np.random.seed(100)
with open(f'{os.environ["HOME"]}/Downloads/data_partition.pkl', 'rb') as f:
    [bs_data, g] = pickle.load(f)

# find modes
pmode = gt.ModeClusterState(bs_data)
pmode.relabel(maxiter=100)
gt.mcmc_equilibrate(pmode, wait=1, mcmc_args=dict(niter=1, beta=np.inf))

# collect mode partitions
def group_modes(pmode):
    mode_df = []
    M = len(pmode.bs)
    for idx_mode, mode in enumerate(pmode.get_modes()):
        omega = mode.get_M() / M
        sigma = mode.posterior_cdev()
        # avoid dividing by zero when sigma == 0
        ratio = omega / sigma if sigma > 0 else 0.0
        df = pd.DataFrame({
            'mode_id':[idx_mode],
            'mode':[mode],
            'omega':[omega],
            'sigma':[sigma],
            'ratio':[ratio],
            'b':[list(mode.get_max(g))],
        })
        mode_df.append(df)
    mode_df = pd.concat(mode_df).reset_index(drop=True)
    return mode_df

mode_df = group_modes(pmode)
display(mode_df)  # display() is available in Jupyter/IPython sessions

# show mode partitions
cmap = cc.cm.rainbow
bs = np.stack(mode_df['b'].to_list()).T
fig, axs = plt.subplots(1, 1, figsize=(4, 10))
ax = axs
sns.heatmap(bs, ax=ax, cmap=cmap)
ax.set(title='mode partitions', ylabel='roi', xlabel='mode')

# check alignment: relabel mode 3 (x) to maximize its overlap with mode 0 (y)
x = bs[:, 3]
y = bs[:, 0]
x_ = gt.align_partition_labels(x=x, y=y)

print(f'before: {x}, \n after: {x_}')

with open(f'{os.environ["HOME"]}/Downloads/results_partition.pkl', 'wb') as f:
    pickle.dump([mode_df], f)

with open(f'{os.environ["HOME"]}/Downloads/results_partition.pkl', 'rb') as f:
    [mode_df] = pickle.load(f)

I observe that the labels in mode 0 don’t align well with those in the other modes, while the labels in the other modes seem to align well with each other.
For example, label 4 in mode 0 overlaps mostly with label 1 in mode 3, and label 1 in mode 0 overlaps mostly with label 4 in mode 3.

Is there anything wrong with my code/understanding?
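
One way to cross-check an observation like this, independently of `gt.align_partition_labels`, is to tabulate how the labels of two partitions co-occur and then search for the relabeling with the largest overlap. The sketch below does this by brute force in plain Python. It is not graph-tool's implementation, the helper names `contingency` and `best_relabeling` are hypothetical, and the toy vectors are made up rather than taken from the attached data.

```python
from collections import Counter
from itertools import permutations


def contingency(x, y):
    """Count the nodes that carry label r in x and label s in y."""
    return Counter(zip(x, y))


def best_relabeling(x, y):
    """Exhaustively search for the relabeling of x with maximal overlap with y.

    Only practical for a handful of labels, since it tries every assignment.
    """
    counts = contingency(x, y)
    x_labels = sorted(set(x))
    y_labels = sorted(set(y))
    # Pad with fresh labels so each label of x gets a distinct target.
    n_extra = max(0, len(x_labels) - len(y_labels))
    targets = y_labels + [max(y_labels) + 1 + i for i in range(n_extra)]
    best_map, best_overlap = None, -1
    for perm in permutations(targets, len(x_labels)):
        overlap = sum(counts[(r, s)] for r, s in zip(x_labels, perm))
        if overlap > best_overlap:
            best_map, best_overlap = dict(zip(x_labels, perm)), overlap
    return best_map, best_overlap


# Hypothetical toy vectors in which labels 1 and 4 are swapped.
x = [1, 1, 1, 4, 4, 0]
y = [4, 4, 4, 1, 1, 0]

for (r, s), n in sorted(contingency(x, y).items()):
    print(f"x label {r} / y label {s}: {n} nodes")

mapping, overlap = best_relabeling(x, y)
x_aligned = [mapping[v] for v in x]
print(mapping, overlap, x_aligned == y)
```

With the real data, `x` and `y` would be two columns of `bs` converted with `.tolist()`. If the brute-force mapping disagrees with the labels shown in the heatmap, that points to which pair of modes is misaligned.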

Many thanks,
Govinda

data_partition.pkl (74.5 KB)

sorry, this is the data file accompanying the code

Hi Prof. Peixoto,
Sorry for pestering.
Please advise whenever you have time.