Hello,
I'm studying political coalitions over time (three time slices) and across discrete types of relationships (collaboration, ideological similarity, and discursive similarity). Since graph-tool offers (the only) principled approach that does not require arbitrary user-specified parameters (such as 'resolution' or 'coupling'), I would very much like to use it for my analysis.
I'm trying to fit a SBM to a graph with edges divided into distinct layers (three time slices X three types of relationships). Each edge has a discrete weight in the range of [0, 3], with weight referring to the 'strength' of the relationship. Not all nodes participate in all layers.
I'm interested in how/if the block membership varies across layers that I expect to be somewhat dependent on one another (hence I would like to avoid fitting a separate SBM to each layer). In my ideal scenario, I would be using the NestedBlockState with LayeredBlockState as base and allowing for overlaps across layers:
state=NestedBlockState(g, base_type=LayeredBlockState, state_args=dict(deg_corr=True, overlap=True, layers=True, ec=g.ep.layer, recs=[g.ep.weight], rec_types=["discrete-binomial"]))
dS, nmoves=0, 0
for i in range(100):
ret=state.multiflip_mcmc_sweep(niter=10)
dS+=ret[0]
nmoves+=ret[1]
print("Change in description length:", dS)
print("Number of accepted vertex moves:", nmoves)
mcmc_equilibrate(state, wait=1000, mcmc_args=dict(niter=10), verbose=True)
bs=[]
def collect_partitions(s):
global bs
bs.append(s.get_bs())
mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10), verbose=True, callback=collect_partitions)
pm=PartitionModeState(bs, nested=True, converge=True)
pv=pm.get_marginal(g)
bs=pm.get_max_nested()
state=state.copy(bs=bs)
Two questions:
Q1)
This is taking very long (days) on a high performance cluster and actually I haven't seen it finishing up for once. Is my model specification correct or simply too complex? If the latter, is there a better strategy to perform my analysis? All I can think of is to revert to a non-nested model and/or reduce the number of layers and/or edges:
state=LayeredBlockState(g, deg_corr=True, overlap=True, layers=True, ec=g.ep.layer, recs=[g.ep.weight], rec_types=["discrete-binomial"])
dS, nattempts, nmoves=state.multiflip_mcmc_sweep(niter=1000)
print("Change in description length:", dS)
print("Number of accepted vertex moves:", nmoves)
mcmc_equilibrate(state, wait=1000, nbreaks=2, mcmc_args=dict(niter=10), verbose=True)
bs=[]
def collect_partitions(s):
global bs
bs.append(s.b.a.copy())
mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10), verbose=True, callback=collect_partitions)
pm=PartitionModeState(bs, converge=True)
pv=pm.get_marginal(g)
Q2)
Say that the model runs and yields a neat NestedBlockState (or LayeredBlockState) object with X overlapping blocks, 9 layers, degree-corrected, with 2 edge covariates, for graph <Graph object, undirected, with 220 vertices and 11322 edges, 2 internal vertex properties, 4 internal edge properties>.
Now, in previous posts there are mentions about the number of nodes multiplying by the number of layers when 'overlap=True', but in my experiments with only two layers, this does not seem to be the case and I struggle to find a way to extract the layer-specific block and respective memberships of nodes.
I've tried iteration over vertices, but what I eventually need is a 2d_array [y=all nodes in g, x=all layers in g, values=most likely block membership] in order to map how/if the block membership of each node varies across those layers in which they participate (ideally somehow denoting non-participation in the layer with NaN or some unique value). Is there some way to get there?
l=state.get_levels()[0] #NestedBlockState
B=l.get_nonempty_B()
bv=l.get_overlap_blocks()[0]
z=np.zeros((g.num_vertices(), B)) #to be filled with most likely block membership with potential overlaps across ALL layers, but I need the layer-specific information
for i in range(g.num_vertices()):
z[int(i), :len(z[v])]=bv[i].a+1
for v in range(g.num_vertices()*max(g.ep.layer)+1): #to see if the number of vertices multiplied by layers, which it didn't
print(bv[v])
Many thanks,
Arttu