I was under the impression that passing a list returns the probability that *all* of the edges are missing jointly. Indeed, when I try it out I get back a scalar, not a NumPy array. What I want is the probability that each individual edge is missing.
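
To make the goal concrete, here is a minimal sketch of how per-edge samples could be reduced to one number per edge once they are collected. It assumes each stored value is a log-probability (which is what get_edges_prob returns) and that the samples are arranged as one row per candidate edge; the function name and the array layout are hypothetical, not part of graph-tool.

```python
import numpy as np

def average_log_probs(samples):
    """Return the log of the mean probability for each candidate edge.

    samples: array-like of shape (n_edges, n_samples), where each entry
    is a log-probability collected from one posterior sample (assumed layout).
    """
    samples = np.asarray(samples, dtype=float)
    n_samples = samples.shape[1]
    # log(mean(exp(x))) computed in log space to avoid underflow
    return np.logaddexp.reduce(samples, axis=1) - np.log(n_samples)

# Two candidate edges, two posterior samples each
example = np.log([[0.1, 0.3],
                  [0.5, 0.5]])
per_edge = average_log_probs(example)
```

Averaging in log space matters here because the individual values are typically very small, so exponentiating them directly can underflow.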

Also, regarding the heuristics I mentioned: I just saw that the paper "Evaluating Overfit and Underfit in Models of Network Community Structure" uses the score "s_ij = θ_i * θ_j * l_{g_i,g_j}".

If sampling is not computationally feasible, this kind of similarity score is what I had in mind as a fallback.

1) Is there a way built into graph-tool to compute this similarity function efficiently? (i.e., without Python slowing me down)

2) Is there a hierarchical analog, like just summing this similarity at each level?
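
For reference, this is roughly the computation I mean in (1), written as a vectorized NumPy sketch rather than a Python loop. It assumes θ is a per-node degree propensity, g is an integer group label per node, and l is a matrix of edge counts between groups; that reading of the symbols is my assumption, and pair_scores is a hypothetical helper, not a graph-tool function. Extracting these arrays from a fitted state is not shown.

```python
import numpy as np

def pair_scores(theta, groups, l, pairs):
    """Score s_ij = theta_i * theta_j * l[g_i, g_j] for each (i, j) pair.

    theta:  per-node propensities, shape (n,)        (assumed meaning)
    groups: per-node integer group labels, shape (n,)
    l:      group-by-group edge-count matrix, shape (B, B)
    pairs:  candidate non-edges, shape (m, 2)
    """
    theta = np.asarray(theta, dtype=float)
    groups = np.asarray(groups, dtype=int)
    l = np.asarray(l, dtype=float)
    pairs = np.asarray(pairs, dtype=int)
    i, j = pairs[:, 0], pairs[:, 1]
    # Fancy indexing looks up the block term for every pair at once
    return theta[i] * theta[j] * l[groups[i], groups[j]]

theta = [0.25, 0.75, 1.0]
groups = [0, 0, 1]
l = [[2, 1],
     [1, 0]]
non_edges = [(0, 1), (0, 2), (1, 2)]
scores = pair_scores(theta, groups, l, non_edges)
# Candidate pairs ordered from highest to lowest score
ranking = np.argsort(-scores)
```

This only scores the pairs that are passed in, so memory stays proportional to the number of candidates instead of n^2.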

Thanks as always

On Mon, Mar 23, 2020 at 10:13 AM Tiago de Paula Peixoto <tiago@skewed.de> wrote:

On 19.03.20 at 18:33, Deklan Webster wrote:
> I'm attempting to use get_edges_prob to find the most likely missing
> edges out of every possible non-edge. I know every possible edge is O(n^2).
>
> Currently I'm sampling like this:
>
> non_edges_probs = [[] for _ in range(len(non_edges))]
>
> def collect_edge_probs(s):
>     s = s.levels[0]
>
>     for i, non_edge in enumerate(non_edges):
>         p = s.get_edges_prob([non_edge], [],
>                              entropy_args=dict(partition_dl=False))
>         non_edges_probs[i].append(p)
>
> gt.mcmc_equilibrate(nested_state,
>                     force_niter=100,
>                     mcmc_args=dict(niter=10),
>                     callback=collect_edge_probs,
>                     verbose=True)
>
> Is there a way to speed this up at all? If not, is there a heuristic I
> can use to reduce the number of possibilities?

There is no way to avoid looking at all possibilities, but you could
pass the actual list at once, instead of iterating through it and
passing lists of size 1. The reason get_edges_prob() exists and accepts
lists is precisely to speed things up in this case.

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>
