Parallelization

Hi everyone.
I just wanted to ask if it would be possible to parallelize this piece of code. I've tried to do it as described in the documentation, but it does not seem to be working.

gt.mcmc_equilibrate(state, wait=100, mcmc_args=dict(niter=10), verbose=True)  # initial equilibration of MCMC (that is always performed in the tutorial)
    print(f'{network_file[0]} module refinement one carried out')
    
    # collect nested partitions
    bs = [] # recalling the module partitions at each iteration
    h = [np.zeros(g.num_vertices() + 1) for s in state.get_levels()] # recall the distribution of the number of modules throughout the whole equilibration process
    def collect_data(s):
        bs.append(s.get_bs())
        for l, sl in enumerate(s.get_levels()):
            B = sl.get_nonempty_B()
            h[l][B] += 1
   
    gt.mcmc_equilibrate(state, force_niter=1000, mcmc_args=dict(niter=10),
                        callback=collect_data)
    print(f'{network_file[0]} module refinement two carried out')
    
    pmode = gt.PartitionModeState(bs, nested=True, converge=True)

Thank you for the response!

Lorenzo,

If you expect any kind of detailed feedback, you need to provide a minimal working example of what you are trying to do. The code fragment you sent isn’t even properly indented, and it is not complete.

In any case, since the code does not include any loops, it’s unclear what you are trying to parallelize.

The internal MCMC loop itself cannot be parallelized, since the algorithm is serial.
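Roughly, the equilibration amounts to something like the following (a simplified sketch, purely illustrative, not graph-tool’s actual internals): each sweep starts from the state produced by the previous one, so the sweeps form a single dependent chain.

import graph_tool.all as gt

# illustrative only: a toy graph and state, standing in for any network
g = gt.collection.data["football"]
state = gt.minimize_nested_blockmodel_dl(g)

for sweep in range(100):
    # each sweep mutates `state` in place and depends on the outcome of
    # the previous sweep, so no two sweeps can run at the same time
    state.multiflip_mcmc_sweep(niter=10)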

Dear Tiago,
Sorry for my silliness; I now understand your point, and thank you for the clarification.
I have one additional doubt, though: I have some gene co-expression networks of about 10-12k nodes and density of about 0.01-0.02.
This is the function I am using to retrieve the modular structure. I tried to follow the tutorial as closely as possible, but what I do not understand is what the initial mcmc_equilibrate call is for.

Thank you anyway for the response. I am quite new to network science, so I am sorry for the trouble.

def process_network_nested(args):
    # loading and checking the properties of the network
    g, network_file = args

    print(f'{network_file} loaded')
    g.list_properties()

    state = gt.minimize_nested_blockmodel_dl(g, state_args=dict(deg_corr=True))
    print(f'{network_file} module partition computed')

    gt.mcmc_equilibrate(state, wait=100, mcmc_args=dict(niter=10), verbose=True)  
    # initial equilibration of MCMC (that is always performed in the tutorial)
    print(f'{network_file} module refinement one carried out')

    # collect nested partitions
    bs = [] # list of module partitions collected at each iteration

    def collect_data(s):
        bs.append(s.get_bs())

    gt.mcmc_equilibrate(state, force_niter=1000, mcmc_args=dict(niter=10),
                        callback=collect_data)
    print(f'{network_file} module refinement two carried out')
     
    pmode = gt.PartitionModeState(bs, nested=True, converge=True)  
    
    bs = pmode.get_max_nested()   # Get consensus estimate
    state = state.copy(bs=bs)

    print(f'{network_file} module partition completed')
    return state

As I explained above, the code you sent cannot be parallelized, since the MCMC implementation is serial.
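What can be parallelized, if the goal is simply to process several such networks, is the coarser level: each network still runs its own serial chain, but independent networks can be handled in separate processes. A rough sketch, assuming hypothetical file names and that the process_network_nested function above is defined in the same module:

import multiprocessing as mp
import numpy as np
import graph_tool.all as gt

network_files = ["net_a.gt", "net_b.gt", "net_c.gt"]   # hypothetical paths

def run_one(network_file):
    # load inside the worker so the Graph object never has to be pickled;
    # only the file name and plain arrays are sent back to the parent
    g = gt.load_graph(network_file)
    state = process_network_nested((g, network_file))
    return network_file, [np.array(b) for b in state.get_bs()]

if __name__ == "__main__":
    with mp.Pool(processes=len(network_files)) as pool:
        results = dict(pool.map(run_one, network_files))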

For future reference: a minimal working example is a self-contained program that can be run and reproduces the desired behaviour, bug, etc. What you have sent is just a function, not a minimal working example. (But there’s no need to prepare one, since your question was already answered.)