Updating to v2.57 gives different results

Hi all,
I’ve recently updated graph-tool from 2.55 to 2.57, and I noticed different results on the same data. In general I obtain “fewer” blocks at every level of an NSBM hierarchy.
In more detail: I have a graph, I minimize 100 NestedBlockStates, and then I extract the consensus using PartitionModeState (get_max_nested).
I define the states like this

recs = []
rec_types = []
deg_corr = True
n_init = 100
states = [gt.NestedBlockState(g=g,
                              state_args=dict(deg_corr=deg_corr,
                                              recs=recs,
                                              rec_types=rec_types))
          for n in range(n_init)]

the minimization is performed like this

def fast_min(state, beta=np.inf, n_sweep=10, fast_tol=1e-6, max_iter=np.inf, seed=None):
    if seed is not None:  # "if seed:" would silently skip a seed of 0
        gt.seed_rng(seed)
    dS = 1e9
    n = 0
    # sweep until the change in description length falls below tolerance
    while (np.abs(dS) > fast_tol) and (n < max_iter):
        dS, _, _ = state.multiflip_mcmc_sweep(beta=beta, niter=n_sweep, c=0.5)
        n += 1
    return state

the PartitionModeState is calculated as

pmode = gt.PartitionModeState([x.get_bs() for x in states], converge=True, nested=True)

Now, if I do everything on v2.57 the number of blocks I have in the hierarchy is (only the first 5 levels)

print([len(set(pmode.get_max_nested()[x])) for x in range(5)])
[3, 40, 2, 4, 2]

If I do everything on v2.55 I have

print([len(set(pmode.get_max_nested()[x])) for x in range(5)])
[107, 24, 7, 4, 2]

If I perform minimization on 2.55 and pmode on 2.57 I have

print([len(set(pmode.get_max_nested()[x])) for x in range(5)])
[107, 25, 7, 4, 2]

whereas minimization on 2.57 and pmode on 2.55 gives

print([len(set(pmode.get_max_nested()[x])) for x in range(5)])
[3, 40, 3, 4, 2]

Besides the fact that I don’t understand how I can get fewer blocks at level 0 than at level 1, this suggests that the pmode step has no particular issue, while the minimization does.
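
To rule out a counting artifact, here is how I would count only the groups that are actually occupied at each level (just a sketch; it assumes, and I am not sure this is the case, that the label vector at level l can contain entries for unoccupied groups of level l-1, which would inflate a naive len(set(...))):

import numpy as np

nested = pmode.get_max_nested()

def occupied_counts(nested):
    # walk up the hierarchy, following only the labels that are
    # actually used at the level below
    counts = []
    used = np.unique(nested[0])
    counts.append(len(used))
    for b in nested[1:]:
        used = np.unique(np.asarray(b)[used])
        counts.append(len(used))
    return counts

print(occupied_counts(nested)[:5])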

Can you please provide a minimal working example (i.e. self-contained code that I can run) that shows the problem?

Otherwise it becomes very difficult to piece together what may be going on.

Sure!

I’ve uploaded two pickles here:

Both contain a list of 100 NestedBlockState instances for the same graph (the graph itself should be loadable from each state, in case you want to use it directly). They were obtained with the process described above, one with gt-2.55 and one with gt-2.57.

They should be loadable like this:

import pickle
import graph_tool.all as gt

states_255 = pickle.load(open("states_gt2.55.pkl", "rb"))
states_257 = pickle.load(open("states_gt2.57.pkl", "rb"))

I also uploaded two pickles with the PartitionModeState computed from each of the two lists, although the problem doesn’t seem to be linked to this step in particular.
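
These should load the same way (the file names below are placeholders, adjust them to the actual upload names):

pmode_255 = pickle.load(open("pmode_gt2.55.pkl", "rb"))
pmode_257 = pickle.load(open("pmode_gt2.57.pkl", "rb"))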

So, a minimal running example would be:

import pickle
import graph_tool.all as gt
import numpy as np

def fast_min(state, beta=np.inf, n_sweep=10, fast_tol=1e-6, max_iter=np.inf, seed=None):
    if seed is not None:  # "if seed:" would silently skip a seed of 0
        gt.seed_rng(seed)
    dS = 1e9
    n = 0
    # sweep until the change in description length falls below tolerance
    while (np.abs(dS) > fast_tol) and (n < max_iter):
        dS, _, _ = state.multiflip_mcmc_sweep(beta=beta, niter=n_sweep, c=0.5)
        n += 1
    return state


# load states, just to make comparisons and have the graph
states_255 = pickle.load(open("states_gt2.55.pkl", "rb"))
states_257 = pickle.load(open("states_gt2.57.pkl", "rb"))

g = states_255[0].g

# initialize states
recs = []
rec_types = []
deg_corr = True
n_init = 100
states = [gt.NestedBlockState(g=g,
                              state_args=dict(deg_corr=deg_corr,
                                              recs=recs,
                                              rec_types=rec_types))
          for n in range(n_init)]

# get some random seeds, it's a silly way to go, but for now...

seeds = np.random.choice(range(n_init**2), size=n_init, replace=False)
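# (a less ad hoc alternative would be something like
#     seeds = np.random.SeedSequence(42).generate_state(n_init).tolist()
# which is just a sketch using numpy's SeedSequence, not what I actually ran)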


# do the minimization. I use ``joblib.Parallel`` for this, but I verified that
# serial execution raises the same issue. For simplicity I am doing it serially here

for x in range(n_init):
    states[x] = fast_min(states[x], seed=seeds[x])

# get the pmode

pmode = gt.PartitionModeState([x.get_bs() for x in states], converge=True, nested=True)
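
For completeness, the cross-version comparison from my first post can be redone directly from the two uploaded lists (the same calls as above, applied to each list):

pmode_255 = gt.PartitionModeState([x.get_bs() for x in states_255], converge=True, nested=True)
pmode_257 = gt.PartitionModeState([x.get_bs() for x in states_257], converge=True, nested=True)

print([len(set(pmode_255.get_max_nested()[x])) for x in range(5)])
print([len(set(pmode_257.get_max_nested()[x])) for x in range(5)])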

Hi again, I’ve seen that version 2.58 has been released. Wanting to check whether it has the same issue as above, I installed it and performed some tests. First of all, I am sorry that the previous code could not reproduce the error: apparently the issue comes from the joblib library and is linked to the backend it uses. So, if you replace the above

for x in range(n_init):
    states[x] = fast_min(states[x], seed=seeds[x])

with

from joblib import Parallel, delayed, parallel_config

with parallel_config(backend='threading',
                     max_nbytes=None,
                     n_jobs=-1):
    states = Parallel()(
        delayed(fast_min)(states[x], seed=seeds[x]) for x in range(n_init)
    )

you should be able to reproduce my issue. I can confirm this happens with graph-tool=2.57 and with graph-tool=2.58. I noticed that using a different backend solves the issue:

with parallel_config(backend='loky',
                     max_nbytes=None,
                     n_jobs=-1):
    states = Parallel()(
        delayed(fast_min)(states[x], seed=seeds[x]) for x in range(n_init)
    )

I may be fine with using the loky backend (although it acts weird with our local Slurm configuration) as long as it gives me correct results, but it would be interesting to understand why this is happening. As I said, graph-tool=2.55 did not have the same issue.
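
In case it helps, this is the kind of check I would run to pin down the backend effect (just a sketch: fresh_states() is a hypothetical helper that rebuilds the list of initial states exactly as above, not a graph-tool function):

def fresh_states():
    # rebuild the same list of initial states as in the example above
    return [gt.NestedBlockState(g=g,
                                state_args=dict(deg_corr=deg_corr,
                                                recs=recs,
                                                rec_types=rec_types))
            for n in range(n_init)]

# serial reference run
serial = [fast_min(s, seed=int(sd)) for s, sd in zip(fresh_states(), seeds)]

# threaded run with the same seeds
with parallel_config(backend='threading', max_nbytes=None, n_jobs=-1):
    threaded = Parallel()(delayed(fast_min)(s, seed=int(sd))
                          for s, sd in zip(fresh_states(), seeds))

# compare the minimized description lengths; if the threading backend is
# the culprit, the two lists should differ
print(sorted(s.entropy() for s in serial))
print(sorted(s.entropy() for s in threaded))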