Hi,
I am using:
- graph-tool version 2.79
- The following code is run on two machines:
  - MacBook Air M1 with arm64 architecture
  - A Linux computing cluster with x86_64 architecture (model AMD EPYC 7742 64-Core Processor)
I am creating a synthetic adjacency matrix A with a given ground truth community structure, and I am looking for its community structure using the simplest model possible. I attach a minimal working example:
import numpy as np
import graph_tool.all as gt
# We fix the seed
seed = 37
np.random.seed(seed)
gt.seed_rng(seed)
# Adjacency matrix
n = 500
A = np.zeros((n, n))
c = np.random.choice([2., 1., 0.], p=[0.5, 0.2, 0.3], size=n)
for i in range(n):
    for j in range(i + 1, n):
        if c[i] == c[j]:
            A[i, j] = np.random.choice([1., 0.], p=[0.8, 0.2])
            A[j, i] = A[i, j]
        else:
            A[i, j] = np.random.choice([1., 0.], p=[0.2, 0.8])
            A[j, i] = A[i, j]
# Creation of a graph object in graph-tool
G = gt.Graph()
# We add the nodes
N = A.shape[0]
G.add_vertex(N)
for i in range(N):
    for j in range(N):
        if A[i, j] != 0:  # Only add an edge if there is a non-zero weight
            e = G.add_edge(i, j)  # Add directed edge from i to j
state = gt.minimize_blockmodel_dl(G)
print(state.entropy())
With this very simple example, I find that the two machines give different solutions. Specifically, the MacBook finds the ground truth solution, with an entropy of ~177128, but the computing cluster doesn't: it finds a solution with an entropy of ~177379.
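For reference, a minimal sketch of how "finds the ground truth solution" can be checked on each machine. The helper name `same_partition` is hypothetical (it is not part of graph-tool); it only tests whether two label vectors group the nodes identically, ignoring how the groups are named.

```python
def same_partition(labels_a, labels_b):
    """Return True if both label sequences group the nodes identically,
    regardless of the names given to the groups."""
    labels_a = list(labels_a)
    labels_b = list(labels_b)
    if len(labels_a) != len(labels_b):
        return False
    a_to_b = {}  # label in labels_a -> label in labels_b
    b_to_a = {}  # label in labels_b -> label in labels_a
    for a, b in zip(labels_a, labels_b):
        # setdefault stores the first pairing seen and returns it afterwards,
        # so any later conflicting pairing is detected in either direction.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True
```

In the example above this would be called as `same_partition(c, state.get_blocks().a)`, assuming the vertex order of `G` matches the order of `c`, which holds here because vertices are added in index order.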
I find that the entropy difference grows with the size of the network, and that it also occurs if the network is weighted, directed and signed, and if the model is nested, layered and degree-corrected. I also checked that a different Linux laptop with the same OS and the same architecture as the computing cluster finds the same solution as the MacBook, which suggests that it is not an OS/architecture problem. Finally, it happens at least with versions 2.45 and 2.77 of graph-tool, in addition to the 2.79 version I am using right now.
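To separate differences in the input from differences in the inference, one option is to compare a fingerprint of the edge list on each machine before calling `minimize_blockmodel_dl`. The sketch below is only an illustration: `edge_fingerprint` is a hypothetical helper, not a graph-tool function, and it assumes the edges can be iterated as (source, target) pairs of integers.

```python
import hashlib

def edge_fingerprint(edges):
    """SHA-256 digest of a directed edge list, independent of edge order."""
    # Sort so that the digest does not depend on the insertion order.
    canonical = sorted((int(s), int(t)) for s, t in edges)
    h = hashlib.sha256()
    for s, t in canonical:
        h.update(f"{s},{t};".encode("ascii"))
    return h.hexdigest()
```

With the graph from the example, `print(edge_fingerprint(G.iter_edges()))` can be run on both machines. Identical digests would indicate that both machines start from the same graph, so any remaining difference would come from the minimization itself.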
What may be happening?
Thank you in advance.
Miguel