Posterior edge probabilities for a network inferred using PseudoIsingBlockState

Dear professor,

I am exploring the section Network reconstruction from dynamics and behavior, trying to apply it to mRNA single-cell transcriptomics data. I managed to get the maximum a posteriori network, but I’d also like to get the probability associated with each possible weight of each edge from the sampling process.

I tried to use the get_edge_prob and get_edges_prob methods, but both return results that are incompatible with the network I observe; in particular, get_edge_prob returns nan for all elements of the adjacency matrix outside the diagonal.
I also tried the mcmc_equilibrate function, as described in the Uncertain network reconstruction section, although there it was applied to a MeasuredBlockState; in my case it didn’t work either, raising an error related to the arguments passed to mcmc_sweep (ValueError: not enough values to unpack (expected 3, got 2)).

The alternative I’m currently using is to store the network and the weights at each iteration of the mcmc_sweep loop used for the minimization, after the system seems to have reached convergence. Is this a correct approach?
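For concreteness, the kind of estimate I have in mind could be sketched like this (a toy sketch independent of graph-tool, with synthetic 0/1 adjacency samples; all names are illustrative):

```python
# Toy sketch: estimate marginal edge probabilities from a collection
# of adjacency matrices stored during MCMC after convergence.
import numpy as np

rng = np.random.default_rng(42)

# stand-in for 0/1 adjacency matrices stored at each retained sweep
samples = rng.integers(0, 2, size=(500, 5, 5))

# marginal posterior probability of each edge = fraction of samples
# in which that edge is present
p_edge = samples.mean(axis=0)
print(p_edge.shape)  # (5, 5)
```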

Is it even conceptually right to ask for single-edge probabilities in this kind of reconstruction task, or should we reason only about ensembles of networks?

Thanks for your help

Best

Francesco

Unfortunately, it’s not possible to provide a solution to your problem or an answer to your question with the information provided.

To enable us to understand the situation, you need to provide all the items below:

  1. Your exact graph-tool version.
  2. Your operating system.
  3. A minimal working example (MWE) that shows the problem. The MWE needs to be
    a. Self-contained. I.e. it cannot depend on any other context or resource. If it depends on files, they need to be provided.
    b. Minimal. I.e. it needs to include only what is necessary to reproduce the problem.

Item 3 above is very important! If you provide us only with a vague description, or only with the part of the code that you believe causes the problem, then it is not possible to understand the context that may have contributed to it.

You need to provide us with a complete example that runs and reproduces the problem you are observing.

To post code, please use triple back-ticks for proper formatting:

for x in range(100):
    print("hello world")

or upload your files as attachments.

Thanks for the quick reply, here is the information required.

  1. graph_tool 2.79
  2. Ubuntu
import graph_tool as gt
from graph_tool.all import *

import numpy as np
from tqdm import tqdm

# CREATING DATA
# this is a toy model for MWE purposes; actually I'm working with a real dataset
# the samples are not time-ordered, so I'm not working with a time series

# generate network
g = gt.collection.data["dolphins"]
E = g.num_edges()
N = g.num_vertices()
w = g.new_ep("double", vals=np.random.normal(N/(2*E), .05, E))  # True edge weights
istate = gt.dynamics.IsingGlauberState(g, w=w, beta=1)

# generate Ising data
M = 1000
X = []
for m in range(M):
    istate.iterate_sync()
    X.append(istate.get_state().a.copy())
X = np.array(X).T

# NETWORK INFERENCE

state = gt.inference.PseudoIsingBlockState(X, has_zero=False, self_loops=False) #Ising state with pseudo stationary likelihood distribution
#state.update_entropy_args(xl1=200) # L1 regularization with lambda=200

#vectors to store intermediate results
num_edges = []
deltas = []

niters = 10
beta = 100

with tqdm(total=niters, desc=f"Beta={beta}") as pbar:
    for i in range(niters):
        delta, *_ = state.mcmc_sweep(beta=beta, niter=10)
        deltas.append(delta)
        num_edges.append(state.get_graph().num_edges())
        pbar.set_postfix({"Delta": f"{delta:.3f}", "Num edges": state.get_graph().num_edges()})
        pbar.update(1)

# ANALYSIS OF THE RESULTS

u = state.get_graph()      # reconstructed network
w_r = state.get_x()        # reconstructed weights
t_r = state.get_theta()    # reconstructed thetas

w_unique = np.unique(np.array(w_r.a))

# TRIALS OF SAMPLING A NETWORK ENSEMBLE FROM THE POSTERIOR

# using state.get_edge_prob
# the documentation is terse, but as far as I understood it takes as input the node indices and the weight (one of the possible discrete values, given the type of inference algorithm, I suppose)
eprobs = np.zeros([len(w_unique),N,N])
for k in range(len(w_unique)):  # renamed from "w" to avoid shadowing the edge property map above
    for i in range(N):
        for j in range(N):
            eprobs[k, i, j] = state.get_edge_prob(i, j, w_unique[k])
# it returns all nans except for the diagonal elements

print(eprobs)

# using state.get_edges_prob
eprobs_1 = []
for i in range(N):
    eprobs_1.append(state.get_edges_prob([[i,j] for j in range(N)]))
eprobs_1 = np.array(eprobs_1)
print(eprobs_1)
# it is very weird: sometimes running the same piece of code twice returns different results
# in particular, the first run returns many nans; then, for some unknown reason, it returns finite values
# sometimes, on further runs, the values even change to other finite values without updating the state

# using mcmc_sweep
# sampling at equilibrium
niters = 10
beta = 10

sampl_adj = []
sampl_w = []
sampl_t = []


with tqdm(total=niters, desc="Sampling at equilibrium") as pbar:
    for i in range(niters):
        delta, *_ = state.mcmc_sweep(beta=beta, niter=1)
        sampl_adj.append(gt.spectral.adjacency(state.get_graph().copy()))
        sampl_w.append(state.get_x())
        sampl_t.append(state.get_theta())
        pbar.update(1)

sampl_adj = np.array([s.toarray() for s in sampl_adj])
print(sampl_adj)
# this of course works, but I'm not sure it is a proper way of sampling from the posterior


# using mcmc_equilibrate (as in Uncertain network reconstruction in the graph_tool documentation)
u = None              # marginal posterior edge probabilities
bs = []               # partitions
cs = []               # average local clustering coefficient

def collect_marginals(s):
    global u, bs, cs
    u = s.collect_marginal(u)
    bstate = s.get_block_state()
    bs.append(bstate.levels[0].b.a.copy())
    cs.append(gt.local_clustering(s.get_graph()).fa.mean())

#gt.inference.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10), callback=collect_marginals)
gt.inference.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10, beta=1), callback=collect_marginals)
# in both cases this raises ValueError: not enough values to unpack (expected 3, got 2)
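In the meantime, the per-edge weight distribution I'm after can at least be tabulated from the manually stored samples; a self-contained toy sketch, independent of graph-tool, with synthetic weighted adjacency samples (all names illustrative):

```python
# Toy sketch: empirical posterior distribution over discrete weights
# for a single edge, computed from stored weighted adjacency samples
# (0 encodes an absent edge).
import numpy as np

rng = np.random.default_rng(0)

weights = np.array([-0.5, 0.0, 0.5, 1.0])   # possible discrete weights
samples = rng.choice(weights, size=(200, 4, 4))

i, j = 0, 1                                  # edge of interest
vals, counts = np.unique(samples[:, i, j], return_counts=True)
probs = counts / samples.shape[0]            # empirical probability of each weight
print(dict(zip(vals, probs)))
```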