There is no way to avoid looking at all possibilities, but you could
pass the actual list at once, instead of iterating through it and
passing lists of size 1. The reason get_edges_prob() exists and accepts
lists is precisely to speed things up in this case.
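For concreteness, a minimal sketch of that list form of the call, assuming a
state fitted with minimize_blockmodel_dl; the graph and the candidate vertex
pairs below are only placeholders:

    import graph_tool.all as gt

    g = gt.collection.data["football"]        # placeholder graph
    state = gt.minimize_blockmodel_dl(g)      # fit a degree-corrected SBM

    candidates = [(0, 1), (2, 3), (4, 5)]     # hypothetical candidate edges

    # Single call with the whole list of candidate edges.
    logp = state.get_edges_prob(candidates)
    print(logp)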
I was under the impression that passing a list corresponds to getting the
probability that *all* the edges are missing. Indeed, when I try it out I
get back a scalar, not a NumPy array. I want to collect the probability
that each individual edge is missing.
Also, with respect to the heuristics I mentioned, I just saw the paper
"Evaluating Overfit and Underfit in Models of Network Community Structure"
use s_ij = θ_i θ_j l_{g_i,g_j}.
If sampling is not computationally feasible, this is what I had in mind.
1) Is there a way built into graph-tool to compute this similarity function
efficiently? (i.e., without Python slowing me down)
2) Is there a hierarchical analog, like just summing this similarity at
each level?
> I was under the impression that passing a list corresponds to getting the
> probability that *all* the edges are missing. Indeed, when I try it out I
> get back a scalar, not a NumPy array. I want to collect the probability
> that each individual edge is missing.
Yes, this is true.
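So collecting one value per candidate edge still means one get_edges_prob()
call per edge. A rough sketch, again assuming a state fitted with
minimize_blockmodel_dl and placeholder candidate pairs:

    import numpy as np
    import graph_tool.all as gt

    g = gt.collection.data["football"]        # placeholder graph
    state = gt.minimize_blockmodel_dl(g)
    candidates = [(0, 1), (2, 3), (4, 5)]     # hypothetical candidate edges

    # One call per candidate edge: the conditional log-probability that
    # this single edge is missing, given the fitted partition.
    per_edge = np.array([state.get_edges_prob([e]) for e in candidates])

    # Passing the whole list instead returns one scalar: the joint
    # log-probability that *all* candidates are missing at once.
    joint = state.get_edges_prob(candidates)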
> Also, with respect to the heuristics I mentioned, I just saw the paper
> "Evaluating Overfit and Underfit in Models of Network Community
> Structure" use s_ij = θ_i θ_j l_{g_i,g_j}.
This is not substantially faster than what is actually computed in
graph-tool; it is just less accurate.
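For reference, here is roughly how that heuristic could be assembled from a
fitted degree-corrected BlockState, taking θ_i as a vertex's degree over the
total degree of its group and l_{g_i,g_j} from the block matrix. The example
graph and candidate pairs are placeholders, the diagonal convention of
get_matrix() should be checked against the version in use, and this
reproduces only the heuristic, not what get_edges_prob() computes:

    import numpy as np
    import graph_tool.all as gt

    g = gt.collection.data["football"]                  # placeholder graph
    state = gt.minimize_blockmodel_dl(g)                # degree-corrected SBM fit

    b = state.get_blocks().a                            # group label of each vertex
    ers = state.get_matrix().toarray()                  # edge counts between groups
    k = g.degree_property_map("total").a.astype(float)  # vertex degrees

    # theta_i = k_i / (total degree of vertex i's group), as in the DC-SBM
    kappa = np.bincount(b, weights=k, minlength=ers.shape[0])
    theta = k / kappa[b]

    def score(i, j):
        """Heuristic similarity s_ij = theta_i * theta_j * l_{g_i, g_j}."""
        return theta[i] * theta[j] * ers[b[i], b[j]]

    candidates = [(0, 1), (2, 3), (4, 5)]               # hypothetical non-edges
    s = np.array([score(i, j) for i, j in candidates])

Since everything is pulled out of the state once into NumPy arrays, scoring
the candidate pairs stays out of the Python-level graph API, which is
presumably the efficiency concern in question 1.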
> If sampling is not computationally feasible, this is what I had in mind.
> 1) Is there a way built into graph-tool to compute this similarity
> function efficiently? (i.e., without Python slowing me down)