Good morning everyone,
I hope this message finds you well.
I’m currently using the OverlapBlockState feature in graph_tool to detect overlapping communities in networks. While the algorithm works well on some datasets, I’ve been encountering significant issues with others, even relatively small ones. Here’s an example code snippet to demonstrate the workflow:
```python
import graph_tool.all as gt
# Load the dataset
g = gt.collection.data["polbooks"]
# Apply overlapping SBM
state = gt.minimize_blockmodel_dl(g, state=gt.OverlapBlockState)
# Draw the results
state.draw(pos=g.vp["pos"], output="polbooks_overlap_blocks_mdl.svg")
```
This works perfectly on datasets such as:

- "polbooks": 105 nodes, 441 edges
- "karate": 34 nodes, 78 edges
- "football": 115 nodes, 615 edges
- "power": 4,941 nodes, 6,594 edges
However, I face crashes and kernel deaths when working with other datasets, such as:

- "celegansneural": 297 nodes, 2,359 edges
- "polblogs": 1,490 nodes, 19,090 edges
- the Jazz dataset
On my local machine, the Jupyter kernel dies or the process crashes. Thinking it might be a memory issue, I tried running the same code on a server with 300 GiB of RAM and 16 CPUs, but I still encounter segmentation faults or crashes.
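In case it helps with diagnosis, here is a minimal standalone script (a sketch; "celegansneural" is just one of the failing examples) that I can run outside Jupyter, with Python’s faulthandler enabled so that a segfault at least prints a Python-level traceback:

```python
import faulthandler
faulthandler.enable()  # dump a Python traceback on SIGSEGV/SIGABRT

import graph_tool.all as gt

# One of the datasets that crashes for me
g = gt.collection.data["celegansneural"]

# Same call as above, just outside of Jupyter
state = gt.minimize_blockmodel_dl(g, state=gt.OverlapBlockState)
print(state)
```

Running this as a plain script from a terminal would also rule out any Jupyter-specific problem.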
Given that the problem persists on the server, I’m unsure whether it’s a memory issue or something else, such as the structure of the datasets or a limitation of the algorithm for certain network characteristics.
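One difference I notice is that the failing "celegansneural" and "polblogs" graphs are directed (and, as far as I can tell, may contain parallel edges or self-loops), whereas the datasets that work are all undirected. I don’t know whether that is the actual cause, but here is a sketch of a workaround I intend to try, simplifying the graph before fitting:

```python
import graph_tool.all as gt

g = gt.collection.data["polblogs"]  # one of the failing datasets

# Simplify the graph first (assumption: the crash is related to
# directedness, parallel edges, or self-loops rather than raw size)
g.set_directed(False)        # reciprocal edges become parallel edges...
gt.remove_parallel_edges(g)  # ...which are then collapsed here
gt.remove_self_loops(g)
g = gt.extract_largest_component(g, prune=True)

state = gt.minimize_blockmodel_dl(g, state=gt.OverlapBlockState)
state.draw(output="polblogs_overlap_simplified.svg")
```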
Has anyone experienced something similar, or does anyone have insights into why this might be happening? Any suggestions or workarounds would be greatly appreciated.
Thank you in advance for your help!