Simulating data for network reconstruction

Hello, I have a question about generating simulated data using graphtool. In the documentation there’s a nice example of reconstruction from simulated dynamic, using the IsingGlauberState, recovered using the IsingGlauberBlockState.
In the same page there’s another type of problem that is reconstruction with empirical data. In that case, the example is given using external dataset. I’d like to know if it is possible, using graph-tool, to generate synthetic empirical data in a way similar to the first example. It seems to me that the NormalState/PseudoNormalBlockState may do the job, but I’m not sure.

Yes, you should use NormalState for that.

Excellent, thank you. Just to be sure I’m doing the right thing, this snippet doesn’t work as I expected:

import graph_tool.all as gt
import numpy as np

g = gt.collection.data["dolphins"]
nstate = gt.NormalState(g)

M = 1000
nX = []
for m in range(M):
    nstate.iterate_sync(niter=10)
    nX.append(nstate.get_state().a.copy())
nX = np.array(nX).T

state = gt.PseudoNormalBlockState(nX)
delta = np.inf
while delta > 1:
    ret = state.mcmc_sweep(beta=np.inf, niter=1)
    delta = abs(ret[0])

print(state.get_block_state().g)

returns this empty graph

<Graph object, undirected, with 62 vertices and 0 edges, at 0x36fcdc9e0>

It isn’t this simple, I’m afraid. Reconstruction is only possible for some parameter ranges of the original model. You created a multivariate Gaussian model with default parameters, i.e. a precision matrix (i.e. couplings) of zero! Of course this is equivalent to an empty graph. The couplings need to be specified, and a sufficiently large number of samples need be drawn.

This kind of procedure needs to be approached with some understanding of what is being done. It will not work by itself.

1 Like

Yep, thanks! I’m exactly trying to understand how this works.

Partially related to that: how did you reconstructed the nematode coexpression network in section V example c of the paper of yours (https://arxiv.org/pdf/2401.01404)? Did you use default parameters fo PseudoNormalBlockState or specified some constrain for theta and x? For example, default theta_range=[(-200, 200), (-inf, inf)] but mean for expression value is typically in range (0, 15), similarly stdev varies in a small range ((0, 5) maybe?).

That work used L_1 regularization. This is explained in the documentation here: Network reconstruction from dynamics and behavior — graph-tool 2.97 documentation

The documentation uses the Ising model, but with for a multivariate normal you would just use PseudoNormalBlockState. There should be no necessity of changing the default ranges in most cases.