Question 1)
Just wanted to confirm my understanding.
In your paper (DOI:10.1103/PhysRevX.11.021003) you propose a method to disambiguate communities and find a consensus across partitions.
You then propose a further method to reveal multiple consensuses in heterogeneous distributions.
Are the communities still disambiguated across the partition modes?
I believe so, and Figure 8 suggests so, but I wanted to confirm.
Question 2)
The method for finding dissensus between partitions works for a single graph. But will it work across multiple graphs with the same set of nodes? My dataset consists of different graphs with the same set of nodes. I can find partition modes for a single graph. But can I concatenate partitions across graphs and find partition modes?
So you mean the nodes don’t need to correspond to each other across graphs?
Extending this to the overlapping SBM: different graphs have the same nodes but different edges, so edges don’t correspond across graphs. The total number of edges is the same across graphs.
So I can find dissensus among OSBM partitions (on half-edges) across different graphs.
Am I right in saying that?
I am asking because of the following:
A partition is a vector of labels over nodes/half-edges. Aligning labels on half-edges that don’t refer to the same unit across graphs may be meaningless for group-level inferences.
The method makes sense for individual level inference (a single graph).
Or am I missing something?
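To make the concern concrete, here is a minimal sketch of what I mean. It is not graph-tool's implementation: `overlap` and `align_labels` are hypothetical helpers, the alignment is a brute-force search that only works for a handful of labels, and the partitions are made-up toy values. It shows that relabeling recovers full agreement when index i denotes the same unit in both partitions, but not when the units are listed in a different order.

```python
from itertools import permutations

def overlap(x, y):
    """Number of positions at which two label vectors agree."""
    return sum(a == b for a, b in zip(x, y))

def align_labels(x, y):
    """Relabel x to maximize positional agreement with y (brute force).

    Hypothetical helper for illustration; only feasible for a few labels.
    """
    x_labels = sorted(set(x))
    y_labels = sorted(set(x) | set(y))
    best, best_score = list(x), overlap(x, y)
    for perm in permutations(y_labels, len(x_labels)):
        mapping = dict(zip(x_labels, perm))
        candidate = [mapping[a] for a in x]
        score = overlap(candidate, y)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Toy partitions (made-up values): same grouping, different label names.
y = [0, 0, 0, 1, 1, 2, 2, 2]
x = [2, 2, 2, 0, 0, 1, 1, 1]

aligned = align_labels(x, y)
same_units = overlap(aligned, y)

# Same labels again, but the units are listed in a different order, as when
# index i does not denote the same half-edge in both graphs.
x_shuffled = [x[i] for i in (3, 0, 5, 1, 6, 4, 2, 7)]
shuffled_units = overlap(align_labels(x_shuffled, y), y)

print(same_units, shuffled_units)
```

In the first case a relabeling exists that makes the two vectors identical; in the second, no relabeling can, because the agreement is counted position by position.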
My code for aligning some partitions seems problematic.
Some labels are not aligned appropriately.
This is the code:
import os
import pickle
import numpy as np
import pandas as pd
import graph_tool.all as gt
import matplotlib.pyplot as plt
import seaborn as sns
import colorcet as cc
# load data
gt.seed_rng(100)
np.random.seed(100)
with open(f'{os.environ["HOME"]}/Downloads/data_partition.pkl', 'rb') as f:
    [bs_data, g] = pickle.load(f)
# find modes
pmode = gt.ModeClusterState(bs_data)
for _ in range(1):
    pmode.relabel(maxiter=100)
    gt.mcmc_equilibrate(pmode, wait=1, mcmc_args=dict(niter=1, beta=np.inf))
# collect mode partitions
def group_modes(pmode):
    mode_df = []
    M = len(pmode.bs)
    for idx_mode, mode in enumerate(pmode.get_modes()):
        omega = mode.get_M() / M
        sigma = mode.posterior_cdev()
        try:
            ratio = omega / sigma
        except ZeroDivisionError:
            ratio = 0.0
        df = pd.DataFrame({
            'mode_id': [idx_mode],
            'mode': [mode],
            'omega': [omega],
            'sigma': [sigma],
            'ratio': [ratio],
            'b': [list(mode.get_max(g))],
        })
        mode_df.append(df)
    mode_df = pd.concat(mode_df).reset_index(drop=True)
    return mode_df
mode_df = group_modes(pmode)
display(mode_df)
# show mode partitions
cmap = cc.cm.rainbow
bs = np.stack(mode_df['b'].to_list()).T
fig, axs = plt.subplots(1, 1, figsize=(4, 10))
ax = axs
sns.heatmap(bs, ax=ax, cmap=cmap)
ax.set(title='mode partitions', ylabel='roi', xlabel='mode')
# check alignment
x = bs[:, 3]
y = bs[:, 0]
x_ = gt.align_partition_labels(x=x, y=y)
print(f'before: {x}, \n after: {x_}')
with open(f'{os.environ["HOME"]}/Downloads/results_partition.pkl', 'wb') as f:
    pickle.dump([mode_df], f)
with open(f'{os.environ["HOME"]}/Downloads/results_partition.pkl', 'rb') as f:
    [mode_df] = pickle.load(f)
I observe that the labels in mode 0 don’t align well with those in the other modes, while the labels in the other modes seem to align well with each other. Mode 0, label 4 matches mode 3, label 1 more closely, and mode 0, label 1 matches mode 3, label 4 more closely.
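One way to quantify such a mismatch is to count shared nodes per label pair. Below is a minimal sketch of that idea, independent of graph-tool: `contingency` and `best_match` are hypothetical helper names, and `mode0` and `mode3` are made-up stand-ins for `bs[:, 0]` and `bs[:, 3]`, not my actual data.

```python
from collections import Counter

def contingency(x, y):
    """Count how many units carry label a in x and label b in y."""
    return Counter(zip(x, y))

def best_match(x, y):
    """For each label in x, the label in y with which it shares the most units."""
    table = contingency(x, y)
    match = {}
    for (a, b), n in sorted(table.items()):
        if a not in match or n > table[(a, match[a])]:
            match[a] = b
    return match

# Made-up stand-ins for bs[:, 0] (mode 0) and bs[:, 3] (mode 3).
mode0 = [1, 1, 1, 4, 4, 4, 2, 2]
mode3 = [4, 4, 4, 1, 1, 2, 2, 2]

table = contingency(mode0, mode3)
match = best_match(mode0, mode3)
for a, b in sorted(match.items()):
    print(f"mode 0 label {a} -> mode 3 label {b} ({table[(a, b)]} shared units)")
```

If the labels were aligned, each label would be matched to itself; any pair `a -> b` with `a != b` marks a label that appears swapped between the two modes.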
Is there anything wrong with my code/understanding?