Hi here,

I hope this email finds you well?

I am currently working on a Python script to detect overlapping communities using the graph-tool library. I copied the first lines of my script from the graph-tool documentation (“minimize_blockmodel_dl — graph-tool 2.68 documentation”).

My purpose is to obtain my overlapping communities in the form of a list of sets, where each community is represented as a set. However, I am encountering an issue where the detected communities do not match the plotted ones. Specifically, the plot shows two communities, but the variable `communities`

in my final output only shows one community, indicating no community.

I believe the problem lies in how I am extracting the communities. I would greatly appreciate your assistance on how to resolve this issue.

Thank you!

The code is below:

```
import numpy as np
import graph_tool.all as gt
import networkx as nx
from cdlib import NodeClustering, evaluation
# Set seeds for reproducibility
np.random.seed(42)
gt.seed_rng(42)
# Load the "football" graph
g = gt.collection.data["polbooks"]
print(g)
# Detect communities using the degree-corrected stochastic blockmodel
state = gt.minimize_blockmodel_dl(g, state=gt.OverlapBlockState)
# Print the state blocks
print(state.get_blocks())
# Draw the graph with the detected communities
state.draw()
# Generate layout for visualization
pos = gt.sfdp_layout(g)
# Ensure we use the correct vertex property map
vertex_blocks = state.get_blocks()
# Extract communities in the form of a list of sets
num_communities = vertex_blocks.a.max() + 1
communities = [set() for _ in range(num_communities)]
for v in g.vertices():
block_id = vertex_blocks[v]
communities[block_id].add(int(v))
# Filter out empty communities
non_empty_communities = [community for community in communities if community]
```

1 Like

Hello

Specifically, the **OverlapBlockState** returns a membership vector where each node can belong to multiple communities, unlike the non-overlapping version. This means you need to handle multiple memberships correctly when extracting the communities.

Below is modified version of your code that should properly extract overlapping communities:

```
import numpy as np
import graph_tool.all as gt
# Set seeds for reproducibility
np.random.seed(42)
gt.seed_rng(42)
# Load the "polbooks" graph
g = gt.collection.data["polbooks"]
# Detect communities using the degree-corrected stochastic blockmodel with overlap
state = gt.minimize_blockmodel_dl(g, state_args=dict(overlap=True))
# Draw the graph with the detected communities
state.draw()
# Extract communities in the form of a list of sets
vertex_blocks = state.get_blocks()
num_blocks = vertex_blocks.max() + 1
communities = [set() for _ in range(num_blocks)]
for v in g.vertices():
memberships = vertex_blocks[v]
for block_id in memberships:
communities[block_id].add(int(v))
# Filter out empty communities
non_empty_communities = [community for community in communities if community]
# Print the non-empty communities
print("Detected overlapping communities:")
for i, community in enumerate(non_empty_communities):
print(f"Community {i + 1}: {community}")
```

This version properly handles overlapping community detection and outputs the communities in the correct format.

Thank you

nolanmaris

Good Morning @nolanmaris,

Thank you very much for the solution provided.

However, it is still not working. When I run the code, I get the error “AttributeError: ‘VertexPropertyMap’ object has no attribute ‘max’”. Moreover, even when I set a random value for num_blocks, I always get some errors. Could you please run the code yourself to see what I mean?

Thank you.

Good evening @nolanmaris ,

Thank you once more for the provided code.

Using what you have given before, I tried to solve some issues to make the code run.

In the end, I get the following code that runs. However, to be honest, I am not yet totally convinced. On all the graphs to which I applied the algorithm, including the overlapping LFR benchmarks (LFR graph with overlapping ground truth), I obtain non-overlapping communities. Maybe Professor @tiago can provide some insight into this. I think that among all the graphs I have used to test my source code, there should be at least one that detects overlapping communities.

Because I am in the process of writing a paper urgently, I would like to have the results of the SBM in the overlapping case. I am concerned that I might not have handled the overlapping properly in my code, which means I might be comparing a non-overlapping method (that I think is overlapping) to other overlapping algorithms.

You can find the code below.

```
import numpy as np
import graph_tool.all as gt
# Set seeds for reproducibility
np.random.seed(42)
gt.seed_rng(42)
# Load the “lesmis” graph
g = gt.collection.data["polbooks"]
# Detect communities using the degree-corrected stochastic blockmodel with overlap
state = gt.minimize_blockmodel_dl(g, state=gt.OverlapBlockState)
# Draw the graph with the detected communities
state.draw()
# Extract overlapping communities
bv, bc_in, bc_out, bc_total = state.get_overlap_blocks()
# Initialize a dictionary to hold communities
communities = {}
# Initialize a set to hold overlapping nodes
overlapping_nodes = set()
# Assign vertices to their respective overlapping communities
for vertex in g.vertices():
memberships = bv[vertex] # This gives the block memberships of the vertex
if len(memberships) > 1:
overlapping_nodes.add(int(vertex))
for membership in memberships:
if membership not in communities:
communities[membership] = set()
communities[membership].add(int(vertex))
# Print the overlapping nodes
print("Overlapping nodes:", overlapping_nodes)
# Print the communities with overlaps
for community, members in communities.items():
print(f"Community {community}: {members}")
```

Sincerely,

Martin WAFFO KEMGNE