Hi all,

I'm using graph-tool a lot, and I usually perform multiple random initializations, choosing in the end the solution with the lowest entropy. Since each initialization is a separate, independent process, I was thinking of using joblib to parallelize it. However, I noticed something weird. Here we go:
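To frame what I mean, the pattern is just joblib's Parallel/delayed over independent runs, keeping the lowest-scoring result. A minimal graph-tool-free sketch (the scoring function here is a toy placeholder, not graph-tool's entropy):

```python
from joblib import delayed, Parallel

def minimize(seed):
    # toy stand-in for one random initialization: returns a
    # deterministic "entropy" score for the given seed
    return (seed * 7) % 5

seeds = range(3)
# run the independent initializations in parallel
scores = Parallel(n_jobs=3, prefer='threads')(delayed(minimize)(s) for s in seeds)
# keep the run with the lowest score
best = min(range(len(scores)), key=lambda i: scores[i])
```

The graph-tool version below follows exactly this shape, with fast_min in place of the toy function and state.entropy() as the score.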

from joblib import delayed, Parallel
import graph_tool.all as gt
import numpy as np

# choose a graph
g = gt.collection.data['football']

# set some variables, such as the number of inits
n_init = 3
fast_tol = 1e-3
beta = 1000
n_sweep = 10

# define a function for sweep, this is essentially what is found in official docs

def fast_min(state, beta, n_sweep, fast_tol):
    dS = 1
    while np.abs(dS) > fast_tol:
        dS, _, _ = state.multiflip_mcmc_sweep(beta=beta, niter=n_sweep)
    return state

# test with standard python list comprehension, this works

pstates = [gt.PPBlockState(g) for x in range(n_init)]
pstates = [fast_min(state, beta, n_sweep, fast_tol) for state in pstates]
selected = pstates[np.argmin([x.entropy() for x in pstates])]
print(gt.modularity(g, selected.get_blocks()))

0.5986403881107808

# test with 'threading' backend in joblib

pstates = [gt.PPBlockState(g) for x in range(n_init)]
pstates = Parallel(n_jobs=3, prefer='threads')(delayed(fast_min)(state, beta, n_sweep, fast_tol) for state in pstates)
selected = pstates[np.argmin([x.entropy() for x in pstates])]
print(gt.modularity(g, selected.get_blocks()))

0.5926606505592532

# test with default backend in joblib