Space-efficient solution for saving block states to reproduce later

diogro · March 27, 2025, 2:29pm

Hey everyone,

I’m looking for a space-efficient solution for saving the block state of a nested partition to use later, but I’m having trouble with the serialization. I tried pickling the block state, but in graph-tool 2.68 this leads to an error. Rebuilding the state directly from get_bs() works fine:

import pickle
from graph_tool.all import *
g = collection.data["celegansneural"]
state = minimize_nested_blockmodel_dl(g)
block_state = state.get_bs()
block_state # This looks fine, I can use it to reconstruct the NestedBlockState object

new_state = NestedBlockState(g, bs=block_state)
new_state.get_bs() # same as before

But if I save this block state to a pickle and load it back, I can’t use it to rebuild the state, because of its reference to the original graph.

with open('test_block.pkl', 'wb') as fh:
    pickle.dump(block_state, fh)

with open('test_block.pkl', 'rb') as fh:
    bs = pickle.load(fh)

# This leads to an error:
new_state_from_pickle = NestedBlockState(g, bs=bs)

I’m pretty sure this used to work when get_bs() returned a list of NumPy arrays (as the documentation states), but now it returns a list that starts with a VertexPropertyMap, followed by a series of PropertyArray objects for the remaining levels of the hierarchy. I tried converting these to NumPy arrays and passing them to NestedBlockState(), but the new state ends up with a different block state.

I suppose I can save the full NestedBlockState object, but I wanted to avoid that because some of the graphs I’m working with are big enough that this would generate large intermediary files.

Thanks!

tiago · March 30, 2025, 6:54am

Just do the following after the hierarchical partition has been unpickled:

bs[0] = g.own_property(bs[0])

(And please update from 2.68! That’s ancient.)
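
For anyone finding this later, here is a minimal sketch of the full round trip with the fix above applied. The helper names save_bs and load_bs are made up for illustration and are not part of graph-tool. The gzip compression is my own addition to keep the file small, not something suggested in the thread. The only graph-tool call the helpers rely on is g.own_property().

```python
import gzip
import pickle


def save_bs(bs, path):
    """Write the list returned by state.get_bs() to a gzip-compressed pickle.

    Hypothetical helper, not a graph-tool function.
    """
    with gzip.open(path, "wb") as fh:
        pickle.dump(bs, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_bs(path, g):
    """Read the list back and hand its first level over to the graph g.

    Hypothetical helper, not a graph-tool function.
    """
    with gzip.open(path, "rb") as fh:
        bs = list(pickle.load(fh))
    # The fix from the reply above: make g the owner of the level-0 map.
    bs[0] = g.own_property(bs[0])
    return bs


# Intended use with graph-tool (shown as comments, not executed here):
#   save_bs(state.get_bs(), "test_block.pkl.gz")
#   bs = load_bs("test_block.pkl.gz", g)
#   new_state = NestedBlockState(g, bs=bs)
```

Only the partition is written to disk here, not the NestedBlockState object. I have not measured how large the resulting file is for big graphs, so check the size on your own data.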