Machine-dependent solutions

Hi,

I am using:

  1. graph-tool version 2.79
  2. The following code, run on two machines:
    2.1. A MacBook Air M1 (arm64 architecture)
    2.2. A Linux computing cluster (x86_64 architecture, AMD EPYC 7742 64-Core Processor)

I am creating a synthetic adjacency matrix A with a given ground truth community structure, and I am looking for its community structure using the simplest model possible. I attach a minimal working example:

import numpy as np
import graph_tool.all as gt

# We fix the seed
seed = 37

np.random.seed(seed)
gt.seed_rng(seed)

# Adjacency matrix 
n = 500
A = np.zeros((n,n))
c = np.random.choice([2.,1.,0.],p=[0.5,0.2,0.3],size=n)
for i in range(n):
    for j in range(i+1, n):
        if c[i]==c[j]:
            A[i,j] = np.random.choice([1.,0.],p=[0.8,0.2])
            A[j,i] = A[i,j]
        else:
            A[i,j] = np.random.choice([1.,0.],p=[0.2,0.8])
            A[j,i] = A[i,j]

# Creation of a graph object in graph-tool
G = gt.Graph()
# We add the nodes
N = A.shape[0]
G.add_vertex(N)

for i in range(N):
    for j in range(N):
        if A[i, j] != 0:  # Only add an edge if there is a non-zero weight
            e = G.add_edge(i, j)   # Add directed edge from i to j

state = gt.minimize_blockmodel_dl(G) 

print(state.entropy())
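
(As a side note, the same graph can also be built without the nested Python loops. The following is just a sketch, assuming the adjacency matrix A and the imports from above; add_edge_list() accepts an array of (source, target) vertex pairs:)

# Sketch of an alternative, vectorized construction of the same graph,
# assuming the adjacency matrix A defined above.
G_alt = gt.Graph()                                 # directed, as above
G_alt.add_vertex(A.shape[0])
# one edge per non-zero entry of A, in the same row-major order as the loops
G_alt.add_edge_list(np.transpose(np.nonzero(A)))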

With this very simple example, I find that the two machines give different solutions. Specifically, the MacBook finds the ground-truth solution, with an entropy of ~177128, but the computing cluster does not, finding instead a solution with entropy ~177379.

I find that the entropy difference grows with the size of the network, and it also happens if the network is weighted, directed, and signed, and if the model is nested, layered, or degree-corrected. I also checked that a different Linux laptop with the same OS and the same architecture as the computing cluster finds the same solution as the MacBook, showing that it is not an OS/architecture problem. Finally, it happens at least with versions 2.45 and 2.77 of graph-tool, in addition to the 2.79 version I am using right now.

What may be happening?

Thank you in advance.
Miguel

The "weight" property map that you define above is completely irrelevant to the minimize_blockmodel_dl() function, since it ignores it. Could you please remove this unnecessary feature from your minimal example?

Also, can you please give more details on the alleged differences that you find between the two machines? The function minimize_blockmodel_dl() is stochastic, so it will not always give the same answer. In addition, minor differences in the compiled code could result in different sensitivity to the seed, although this is unexpected for the exact same version of graph-tool.

How did you install graph-tool on macOS? Did you use conda or MacPorts?

The “weight” property map is indeed unnecessary for this minimal example; it remained there from the tests I have been doing with the layered version of the model. However, both machines should give the same result independently of this property. I have removed it, and I still find the same problem.

Regarding the differences between the two machines, the MacBook Air M1 finds the ground-truth partition. Specifically, this is the confusion matrix:

[[  0 139   0]
 [102   0   0]
 [  0   0 259]]

Here, the rows are the true labels and the columns are the labels assigned by the SBM. However, when I run the same code on the computing cluster, I obtain the following confusion matrix:

[[  0  75   0  64   0   0   0]
 [ 20   0   0   0   0  82   0]
 [  0   0   7   0 242   0  10]]

Again, the rows are the true labels and the columns are the SBM communities. Clearly, the model is not finding the ground-truth solution, which explains why the value of the minimum description length is larger in this second case, as I mentioned in my previous comment.
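
(For reference, the confusion matrices above are computed along the lines of the following sketch, which assumes the ground-truth label array c and the state object from the minimal example:)

import numpy as np

# Sketch: confusion matrix between the ground-truth labels c and the
# partition inferred by the SBM (c and state are taken from the example).
b = np.array(state.get_blocks().a)       # inferred block label of each vertex
true_labels = np.unique(c)               # ground-truth community labels
sbm_labels = np.unique(b)                # labels assigned by the SBM

# rows: true labels, columns: SBM labels
conf = np.zeros((len(true_labels), len(sbm_labels)), dtype=int)
for ti, t in enumerate(true_labels):
    for si, s in enumerate(sbm_labels):
        conf[ti, si] = np.sum((c == t) & (b == s))
print(conf)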

I am fully aware that the algorithm is stochastic but, as you can see in the code, the seed is fixed from the very beginning on both machines, and they still give different results.

In any case, I ran different analyses averaging over realizations, precisely to determine whether this result was independent of the seed, and the algorithm run on the computing cluster systematically finds a solution of poorer quality (larger minimum description length) that is further from the ground truth.

In both cases, Linux and macOS, the installation was done using conda.

Comparing single runs like this is not meaningful and does not give useful evidence for a bug.

If I run your code with different seeds I get the following description length values:

178795.5415366061
175733.94277783306
177405.39089366863
176144.14548998783
176194.2606940555
176638.37036697305
179603.77395326414
175194.29108529002
177423.15610003835
174530.8886319496
175212.01196509646

So, while I can reproduce the same description length of 177379 that you find with a seed of 37 (on a GNU/Linux machine), I can also get much lower values on different runs.
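
(For concreteness, something along the lines of the following sketch is enough to produce such a list: it simply wraps your example in a function and varies the seed; the particular seeds used are not important.)

import numpy as np
import graph_tool.all as gt

# Sketch: run the original example end-to-end for several seeds and
# record the description length found in each run.
def run_example(seed, n=500):
    np.random.seed(seed)
    gt.seed_rng(seed)
    c = np.random.choice([2., 1., 0.], p=[0.5, 0.2, 0.3], size=n)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            p = [0.8, 0.2] if c[i] == c[j] else [0.2, 0.8]
            A[i, j] = A[j, i] = np.random.choice([1., 0.], p=p)
    G = gt.Graph()
    G.add_vertex(n)
    # directed edge for every non-zero entry, as in the original loops
    G.add_edge_list(np.transpose(np.nonzero(A)))
    return gt.minimize_blockmodel_dl(G).entropy()

for seed in range(30, 41):               # arbitrary choice of 11 seeds
    print(run_example(seed))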

The lack of portability of RNG seeding across different CPU architectures/compilers is not unheard of, but this should already be fixed in the PCG library used by graph-tool.

However, this type of problem only affects the seeding, not the correctness of the algorithms.

So far, you have not given me any evidence that the code is giving statistically different answers on the two machines.

Do you have any evidence that the answers are systematically different between the two machines?

Hello again,

Thank you for the information about the seeding portability.

Regarding the persistence of the entropy difference between machines, I find that the difference is systematic, as can be seen, for instance, in the figure I attached:

fig1.pdf (16,0 KB)

After doing some analyses, I discovered that I was having version issues: in short, I had a problem during the installation, and the version I was running on the server was different from the one I thought I was running (I am sorry for that). So there is indeed a systematic difference in the average entropy found by the algorithm over realizations, but it comes from the difference in the graph-tool version, not from the machine. I explored this further, and I found that, for the specific example matrix I am analyzing now, the average entropy values for each graph-tool version are:

v2.59 ~ 24730
v2.63 ~ 24790
v2.68 ~ 24810
v2.69 ~ 24810
v2.74 ~ 24930
v2.79 ~ 24930

In all cases, the 95% confidence intervals for these averages are around ±10, so the differences between versions cannot be attributed to fluctuations in the average values. Instead, there is a systematic increase in the average entropy with the graph-tool version. Do you have any insight into why this is happening?
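
(The averages and intervals above are computed roughly as in the following sketch, where "entropies" is a hypothetical array holding the description lengths collected over independent realizations for a single graph-tool version:)

import numpy as np

def mean_and_ci95(entropies):
    # Sketch: mean description length and a normal-approximation 95% confidence
    # interval; "entropies" is a hypothetical placeholder for the values
    # collected over independent realizations with one graph-tool version.
    entropies = np.asarray(entropies, dtype=float)
    mean = entropies.mean()
    sem = entropies.std(ddof=1) / np.sqrt(len(entropies))  # standard error of the mean
    return mean, 1.96 * sem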

Thank you in advance.

The code has changed over the years. Although the general principle is the same, certain choices, details, and defaults have changed over time.

In general, there’s no expectation that the results from older graph-tool versions should be fully identical to the newest version.

When in doubt, you should always prefer the newest version.